AMTA 2010 Workshop -- Collaborative Translation: technology, crowdsourcing, and the translator perspective
Background
This is the official page for the AMTA 2010 workshop on Collaborative and Crowdsourced Translation, which was held in Denver, Colorado, on Sunday, October 31st, 2010.
Collaborative and social networking technologies such as Wikipedia, Facebook, and Amazon Mechanical Turk are having profound effects in many spheres of human activity. Translation is no exception, as evidenced by the recent emergence of collaborative technologies and paradigms such as:
- Translation teamware: systems that allow multidisciplinary teams of professionals (translators, terminologists, domain experts, revisers, managers) to collaborate on large translation projects, using an agile, grassroots process instead of the more assembly-line, top-down approach found in most translation workflow systems.
- Collaborative terminology resources: Wikipedia-like platforms for the creation and maintenance of large terminology resources by a crowd of translators, terminologists, domain experts, and even general members of the public.
- Translation Memory sharing: platforms for large scale pooling and sharing of multilingual parallel corpora between organizations and individuals.
- Online marketplaces for translators: eBay-like, disintermediated environments for connecting customers and translators directly, with minimal intervention by a middle man.
- Translation crowdsourcing: Mechanical Turk style systems for splitting translation projects into small chunks, and distributing them across large crowds of mostly amateur translators.
- Post-editing by the crowd: systems allowing a large crowd of mostly amateurs to correct the output of machine translation systems and to suggest better translations.
The aim of this one-day workshop was to bring together a multidisciplinary group of researchers and practitioners from both fields of technology and translation, in order to discuss and explore the impact, present and future, of this type of technology. In particular, we aimed at starting a constructive two-way dialogue between developers and potential users of these technologies. To that effect, the workshop used a participatory, attendee-driven format.
Themes
Below is a list of themes that were suggested in the Call for Participation. Participants were encouraged to think about and suggest additional themes at the event.
Impact on the translation profession
Some of these technologies (translation teamware, collaborative terminology resources, Translation Memory sharing, online marketplaces) present clear benefits and new opportunities for professional translators. But others (translation crowdsourcing, post-editing by the crowd) could present a threat to their livelihood. How can professional translators prepare for these developments? Will these technologies decrease demand for professionals, or will they increase the pie and be used only for content which currently is not being translated at all (for example, by allowing speakers of small languages like Haitian Creole, to translate content that is particularly relevant to them)? Will professional translators still play a remunerated role, even in cases where tasks are crowdsourced to amateurs (for example, by performing quality assurance or coaching the amateurs)? Which skills/computer resources/qualifications are needed for staying in the business of translation in this new context?
Impact on translation technology
How could these technologies be used to improve the performance of machine translation systems? Can millions of people, professional translators and amateurs, teach machines how to do a better job at translating? Can machines be used to facilitate collaboration between humans, for example by connecting customers with translators who seem particularly suited for a given translation project? In a crowdsourcing context, how should we adapt tools originally developed for professionals, so that they are better suited to the specific needs and limitations of amateurs?
Quality assurance and appropriateness of the technology
All of the above technologies lead to environments which are more grassroots, and less tightly controlled from the top, than is typically found in most professional contexts. This is true even of technologies that specifically target professionals. What effect does that have on quality? How can we characterize circumstances where such collaboration will increase quality, instead of decreasing it? How should these technologies be used in contexts with different quality requirements, ranging from “fit for gisting” to “fit for dissemination” quality? Can quality assurance itself be done collaboratively? How can tools be designed to make the crowd collectively smarter than its individuals (wisdom of crowds effect), instead of having it act as a mindless mob?
Co-development and mutual understanding between stakeholders
How do we foster constructive dialog between stakeholders, to ensure that these technologies reach a balance point that meets their respective needs? How can developers learn more about professional translators and their work, in order to build collaborative environments that leverage the unique skills of that constituency? How can professional translators and their customers learn more about the possibilities offered by these new technologies, so that they can use them to improve productivity while still ensuring fair compensation and quality? How can professional translators reach out to translation buyers to make them understand the benefits and limitations of such technologies (e.g., why would it not be a good idea to crowdsource translation of a patent)?
Workshop Committee
The workshop committee consisted of the following people:
- Alain Désilets (chair), Institute for Information Technology, National Research Council of Canada, alain.desilets at nrc-cnrc.gc.ca
- Naomi Baer, Director, Microloan Translation and Review at Kiva.org, naomi at kiva.org
- Renato Beninatto, CEO, Milengo, renato.beninatto at milengo.com
- Chris Callison-Burch, Center for Language and Speech Processing, Johns Hopkins University, ccb at cs.jhu.edu
- Kyo Kageura, Library and Information Science Department, University of Tokyo, kyo at p.u-tokyo.ac.jp
- Elina Lagoudaki, Humanities Department, Imperial College London, e.lagoudaki at imperial.ac.uk
- Dorothee Racette, ATA President-elect (2009-2011), dracette at hughes.net
- Philip Resnik, Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, resnik at umd.edu
- Willem Stoeller, Lingotek, Director accounts, wstoeller at lingotek.com
Participants
The following people participated in the event in Denver.
|Last Name||First Name||Title and Affiliation|
|Désilets||Alain||National Research Council of Canada (workshop chair)|
|Munro||Rob||Stanford University (keynote speaker)|
|Zetzsche||Jost||International Writers Group (keynote speaker)|
|Baer||Naomi||Director Microloan Translation and Review, Kiva.org|
|Callison-Burch||Chris||Johns Hopkins University|
|Chen||Jiangping||University of North Texas|
|Hardt||Daniel||Copenhagen Business School|
|Holland||Rod||The MITRE Corporation|
|Jurica||Vanessa||The MITRE Corporation|
|Kronrod||Yakov||University of Maryland|
|Racette||Dorothee||President Elect, ATA|
|Riedl||John||Translating Cultures LLC|
|Sennrich||Rico||University of Zurich|
|Seo||Jin Hyung||DooBee Inc|
|Tenney||Merle D.||Language Technology Consultant|
|van der Meer||Jaap||TAUS|
|Vogel||Stephan||Carnegie Mellon University|
|Kumaran||A||Multilingual Systems Research, Microsoft Research India|
|Stoeller||Willem||Director accounts, Lingotek|
Issues brainstorming and breakout sessions
In order to maximize discussion, the workshop used a facilitated, participatory format. We started with a brief self-introduction by each of the participants. This was followed by a 45-minute brainstorming exercise in which participants expressed issues or thoughts that were on their minds. During the first break, three volunteers collaboratively arranged these issues into clusters of related questions.
The result of this exercise was a "map" of the participants' concerns about collaborative/crowdsourced translation, which included the following eight clusters.
- Crowd motivation
- Monolingual contributors
- Professionalization in crowd-sourcing
- Business case
- Output quality
- Source quality
- Massive data sharing
- Platform requirements
Attendees then formed breakout groups to discuss each of those clusters for 60 minutes each. A summary of each breakout discussion is available by clicking on the corresponding link above.
Keynote talks and position papers
We also had two keynote talks:
- Rob Munro, Crowdsourced translation for emergency response in Haiti: the global collaboration of local knowledge
- Jost Zetzsche, Crowdsourcing and the professional translator
- Request to all attendees: please go to this page and write up what you remember from that discussion: http://ietherpad.com/WqeZkreDiS
- Once it stabilizes, Alain will copy it to the wiki.
- Relevance: the paper should be on a topic that is clearly related to the theme of the workshop.
- Usefulness: content of the paper should be informative and useful for at least one of the following constituencies: developers, translators or translation customers.
- Style: the paper should be written in a style that is appropriate for an academic publication or trade journal. Although not a strict requirement, we encourage authors to support their arguments with references and empirical evidence whenever possible. Papers which are deemed too commercial or sales-oriented will be rejected. Also, while we welcome essays and opinion papers, the workshop committee reserves the right to reject submissions whose tone is deemed inflammatory or disrespectful.
- Format and length: papers should have a maximum of 4 pages, and follow the formatting guidelines specified here: http://amta2010.amtaweb.org/cfp-mt.htm
- Kumaran et al., WikiBABEL: A System for Multilingual Wikipedia Content
- Kronrod et al., Improving Translation via Targeted Paraphrasing
In addition, the following last-minute paper was submitted, but was not subject to peer review:
AMTA 2010 Workshop -- Insights of the day. It was also felt by all participants that there was a need for this group, or similar multidisciplinary groups of translators and technologists, to meet again to discuss issues of common interest. If you are interested in participating in such an event, and/or would like to help plan it (even if it's just to make some suggestions), please put some information on this page: Planning a followup to the AMTA 2010 Workshop on Collaborative and Crowdsourced Translation.