Artificial intelligence is no stranger to translation

By Debbie Folaron

The recent bombardment of news on artificial intelligence (AI) in the media and throughout the professional translation industry might lead one to conclude that AI is a newcomer in the domain. In reality, the professional sector and the academic discipline have been dealing with various forms of AI in translation for quite some time. Since the mid-20th century, much of this work has focused on research in the field of natural language processing (NLP). As one of several branches of AI, NLP deploys machine learning (ML) techniques with the goal of enabling computers to recognize, communicate, and translate language. In machine translation (MT) specifically, methods that were “rule-based” (RBMT), “example-based” (EBMT), and “statistical” (SMT) yielded important but mixed results, requiring considerable post-editing (PE) for most languages.[1] For many years, MT research and expertise were concentrated within the spheres of computational linguistics and computer science, and limited within translation and terminology studies to specialized niches that had emerged and developed in tandem with them.

Additional layers of complexity…

Much of the public first became aware of neural machine translation (NMT) when it was launched by Google in 2016. NMT builds on earlier artificial neural network (ANN) research, initially inspired by psychologist Frank Rosenblatt’s Perceptron[2] in 1957 and later by computer scientist Marvin Minsky and mathematician Seymour Papert’s Perceptrons in 1969[3]. Today, NMT is the application of ANNs to translation on the basis of models trained on a parallel corpus (i.e., a corpus of texts and their translations), whereby layers of neural units build numerical representations of words, mapping their connections and relationships in terms of similarity and co-occurrence[4]. A rapid, exponential increase in computing capacity and a proliferation of data (from kilobytes to terabytes and petabytes) have facilitated this transition. So has the evolution in ML, which now uses ANN techniques, algorithms, and observational data more effectively through “deep learning” (DL). As stated clearly by the MT researcher Mikel Forcada, “the [deep] representations […] are not built in one shot but in stages from other shallower representations or layers [that] usually contain hundreds of neural units” (2017)[5]. The capabilities of DL to recognize patterns, and to map, represent, and process images and written and spoken language with higher accuracy and quality, have had an impact on MT techniques, with the result that NMT, or a hybrid SMT/NMT model, has been gradually integrated into translation technology systems and platforms for almost a decade.
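
To make the idea of numerical word representations more concrete, the following minimal Python sketch compares hand-picked toy vectors using cosine similarity, one common measure of how close two vectors are. The vectors, the vocabulary, and the function names are assumptions made purely for illustration; an actual NMT system learns its representations from data, in hundreds of dimensions, and is not reproduced here.

```python
import math

# Hypothetical 3-dimensional word vectors, chosen by hand for illustration.
# Real systems learn such vectors from large corpora.
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors=TOY_VECTORS):
    """Return the other word whose vector is closest to that of `word`."""
    target = vectors[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in vectors.items()
        if other != word
    }
    return max(scores, key=scores.get)


print(most_similar("cat"))  # prints "dog" for these toy vectors
```

In this toy setting, “cat” lands closer to “dog” than to “invoice” only because the vectors were chosen that way; in a trained model, such proximity would emerge from patterns of co-occurrence in the training data.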

A generational turn…

In November 2022, OpenAI, the American AI research organization founded in 2015, made headlines when it released its generative AI (GenAI) chatbot ChatGPT to the general public. “Innovative” and “disruptive” in commercial lingo, GenAI used for translation is both linked to the history of MT (e.g., Google Research’s “zero-shot” multilingual NMT[6]) and a major breakthrough, owing to better ways of contextualizing language use and analyzing training data. Although still in its early days, GenAI is a more advanced type of AI technology based on DL, with large language models (LLMs) trained to generate “synthetic” data, text, images, and video[7] in response to user “prompts”. The value of well-crafted prompts lies in their ability to provide more precise context for queries, thus allowing for more relevant and meaningful responses. Although not specifically created as an MT system, ChatGPT can be prompted to produce a translation, with results that are comparable in quality to those of dedicated MT systems for basic translation tasks in certain high-resource language pairs[8]. Officially, its interface currently supports 60 languages[9], but test results indicate that it can also respond in many more languages.
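
As an illustration of how a prompt can supply context, the sketch below assembles a translation request as plain text. The function name, its parameters, and the wording of the prompt are hypothetical, and no particular chatbot or API is assumed; the resulting string would simply be pasted into, or sent to, whichever GenAI tool is being used.

```python
def build_translation_prompt(text, source_lang, target_lang,
                             audience=None, glossary=None):
    """Assemble a translation prompt that states its context explicitly.

    All parameters are illustrative; `glossary` maps source terms to the
    preferred target terms.
    """
    lines = [f"Translate the following text from {source_lang} to {target_lang}."]
    if audience:
        lines.append(f"The intended readers are: {audience}.")
    if glossary:
        lines.append("Use these preferred term translations:")
        for source_term, target_term in glossary.items():
            lines.append(f"- {source_term} -> {target_term}")
    lines.append("Text:")
    lines.append(text)
    return "\n".join(lines)


prompt = build_translation_prompt(
    "Please submit the invoice by Friday.",
    "English",
    "French",
    audience="accounting staff in Quebec",
    glossary={"invoice": "facture"},
)
print(prompt)
```

The same text sent without the audience and glossary lines would still be translated, but the tool would have to guess at register and terminology, which is where less relevant output tends to come from.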

In the steps of science: public- and commercial-oriented research

Alongside traditional scientific MT research, and in particular since Google’s release of Google Translate to the public in 2006[10], there has been a vibrant vein of MT and translation technology research for public and commercial use, covering an expanding number of languages and equipped to handle multilingual contexts. In the professional translation sector, larger companies and agencies with greater computing capacity, funds, and internal management processes have implemented these evolving technologies to automate and manage parts of the translation production cycle[11]. The localization of software, websites, mobile devices, and multimodal media is a prime early example, where computer-assisted translation (CAT) and localization systems could process the complex files used to compile software or to design, create, and maintain websites and a wide variety of online platforms. Once anchored mainly by translation memories (TMs), an offshoot of early MT research, the processes associated with producing translations for and within “new media” formats (e.g., translating segments in consultation with TM databases or using specialized software to isolate tags generated from markup languages like HTML and XML) have since integrated functionalities for MT. There is now support for GenAI, with AI “agents” and assistants set in place to guide users in the translation process.[12] At the same time, automatically generated translation is not confined to expert professionals in the domain. Popular MT apps are already in use worldwide by the general public in hundreds of language pairs. More and more, AI agents serve as the interface between humans and technology, appearing in the many software apps online and on computing devices and smartphones. They assist not only in writing, recording, organizing information, handling finances, and myriad other common tasks, but also in translating. Google Lens, for example, lets users focus their smartphone camera on a menu or a road sign and have the text translated in return. In many ways, it is clear that the relation of humans to MT output has changed.
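
The two TM-related operations mentioned above, isolating markup tags and consulting a TM database, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two-entry memory, the function names, and the 0.75 threshold are hypothetical, and the similarity score comes from Python’s standard `difflib` module rather than from the matching algorithm of any actual CAT tool.

```python
import re
from difflib import SequenceMatcher

# Matches simple markup tags such as <b> or </b>.
TAG_PATTERN = re.compile(r"<[^>]+>")

# Hypothetical translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Your file has been saved.": "Votre fichier a été enregistré.",
}


def strip_tags(segment):
    """Remove markup tags so that only translatable text is compared."""
    return TAG_PATTERN.sub("", segment)


def best_tm_match(segment, memory=TRANSLATION_MEMORY, threshold=0.75):
    """Return (translation, score) for the closest TM entry.

    The translation is None when no entry reaches the threshold.
    """
    plain = strip_tags(segment)
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, plain, source).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_score >= threshold:
        return memory[best_source], best_score
    return None, best_score


translation, score = best_tm_match("Click the <b>Save</b> button.")
print(translation, score)
```

A production system would also reinsert the tags into the retrieved translation and would typically fall back on MT when no sufficiently close match is found; both steps are left out here for brevity.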

Forging a collaborative way forward…

New relationships can be exhilarating or intimidating, and so it is between humans and AI. A considerable amount of work has already been done in the academic and professional literature, with some researchers, like computer scientist Yoshua Bengio[13], urging caution on how we proceed. Deeper integration of AI, including GenAI-MT, in the current technologies and platforms we use has impacts that far exceed linguistic concerns. They are social, political, economic, and ideological too. As the mechanisms for quality control continue to improve translation results at faster rates, and as the monitoring of translation bias, which stems from the biases (gender, racial, ethnic, cultural) embedded in the datasets on which GenAI models are trained, is fine-tuned, it is clear that AI technologies in translation will increasingly play a major role in many facets of human life. Translation studies research has already laid a foundation that addresses relevant issues and challenges. These include, among others: (1) dissemination of MT literacy information for both translators and non-translators[14]; (2) responsible AI use and questions of trust[15]; (3) initiatives to label translation output inspired by consumer labeling[16]; (4) ongoing studies to compare results between specialized MT and GenAI translation[17]; (5) issues of privacy and MT[18]; (6) disinformation and misinformation[19]; and (7) centering the “human” in translation[20]. It is only by understanding how AI works, and its implications for and repercussions on human communication, that we can safeguard the intrinsic value of the many complex neural and linguistic processes involved in translating and interpreting that make us human.

Debbie Folaron is Associate Professor of Translation Studies at Concordia University in Montreal. Her research focuses on general and minority translation practices in the digital age.

Cited References: 

Bowker, L. (Sept 9, 2024). “What is Machine Translation Literacy?” The MT Literacy Project. https://sites.google.com/view/machinetranslationliteracy/

Brockmann, D. (May 2, 2024. Updated June 12, 2024). “Accelerating innovation: New generative translation capabilities in Trados Studio 2024” entry in Trados blog. https://www.trados.com/blog/new-generative-translation-capabilities-in-trados-studio/

Cannavina, V. (Jul 8, 2022). “Empowering localization project managers with linguistic AI” entry in RWS blog. https://www.rws.com/blog/empowering-localization-project-managers-with-linguistic-ai/

CBC. The Current with Matt Galloway (Sept 5, 2024). “Does Yoshua Bengio regret helping to create AI?”. https://www.cbc.ca/listen/live-radio/1-63-the-current/clip/16092371-does-yoshua-bengio-regret-helping-create-ai

Forcada, M. (2017). “Making sense of neural machine translation” in Translation Spaces 6:2, 291–309 (https://benjamins.com/catalog/ts). (preprint available at https://www.dlsi.ua.es/gent/mlf/docum/forcada17j2.pdf). 

Harby, A. (Sept 9, 2024). “Can I Use ChatGPT for Translation?” in Slator. https://slator.com/resources/can-i-use-chatgpt-for-translation/

Kenny, D. (2022). Machine translation for everyone: Empowering users in the age of artificial intelligence. (Translation and Multilingual Natural Language Processing 18). Berlin: Language Science Press. DOI: 10.5281/zenodo.6653406. https://www.multitrainmt.eu/index.php/en/neural-mt-training/book-machine-translation-for-everyone

Melby, A. and G. Lester (Sept 9, 2024). “Consumer Protection Labels for Translations (v7f). An overview for language service companies and translation publishers based on ASTM F2575 and ISO 11669” published on Tranquality website https://www.tranquality.info/wp-content/uploads/2024/05/ConsumerProtectionLabelsforTranslations-524.pdf

Minsky, M. and S. A. Papert (1969, updated in 2017). Perceptrons: An Introduction to Computational Geometry. MIT Press Direct.

Nunes Vieira, L., C. O’Sullivan, X. Zhang, and M. O’Hagan (2023). “Privacy and everyday users of machine translation” in Translation Spaces 12:1, 21-44 (https://www.jbe-platform.com/content/journals/10.1075/ts.22012.nun). 

O’Brien, S. (2023). “Human-Centered augmented translation: against antagonistic dualisms” in Perspectives, 391-406. https://www.tandfonline.com/doi/full/10.1080/0907676X.2023.2247423?src=

O’Brien, S. and M. Ehrensberger-Dow (2020). “MT Literacy—A cognitive view” in Translation, Cognition, Behavior. 145-164. John Benjamins Publishing. https://www.jbe-platform.com/content/journals/10.1075/tcb.00038.obr

OpenAI (Sept 9, 2024). “DALL-E”. https://openai.com/index/dall-e-2/

—. (Sept 9, 2024). “How to change your language setting in ChatGPT”. https://help.openai.com/en/articles/8357869-how-to-change-your-language-setting-in-chatgpt#h_513834920e

—. (Sept 9, 2024). “Sora”. https://openai.com/index/sora/

Quelle, D., C. Cheng, A. Bovet, and S.A. Hale (2023). “Lost in Translation — Multilingual Misinformation and its Evolution”. arXiv:2310.18089 [cs.CL]. https://arxiv.org/abs/2310.18089

Rothwell, A., J. Moorkens, M. Fernández-Parra, J. Drugan, and F. Austermuehl (2023). Translation Tools and Technologies. Routledge. https://www.taylorfrancis.com/books/mono/10.4324/9781003160793/translation-tools-technologies-joanna-drugan-joss-moorkens-mar%C3%ADa-fern%C3%A1ndez-parra-andrew-rothwell-frank-austermuehl

Schuster, M. (Nov 22, 2016). “Zero-Shot Translation with Google’s Multilingual Neural Machine Translation System” in Google Research blog. https://research.google/blog/zero-shot-translation-with-googles-multilingual-neural-machine-translation-system/

Sinitsyna, D. (Feb 9, 2024). “Generative AI for Translation in 2024” entry in Intento blog. https://inten.to/blog/generative-ai-for-translation-in-2024/

“The Montréal Declaration for a Responsible Development of Artificial Intelligence”. https://montrealdeclaration-responsibleai.com/the-declaration/.

W3Schools (2024). “Perceptrons”. https://www.w3schools.com/ai/ai_perceptrons.asp

Walker, C. (2022). Translation Project Management. Routledge. https://www.routledge.com/Translation-Project-Management/Walker/p/book/9780367677732.

Wikipedia (Sept 9, 2024). “Google Translate”. https://en.wikipedia.org/wiki/Google_Translate.


1 RBMT has three basic types: direct, transfer, and interlingua, and makes use of dictionaries and grammatical information input into the system to generate a translation. EBMT and SMT are data-driven, with the former relying on bilingual parallel corpora and the latter on statistical probability algorithms applied to both a language model trained on a monolingual target text (TT) corpus and a translation model trained on a large parallel corpus. See Rothwell et al. (2023, 98-101).

2 See W3Schools “Perceptrons”: https://www.w3schools.com/ai/ai_perceptrons.asp

3 https://direct.mit.edu/books/monograph/3132/PerceptronsAn-Introduction-to-Computational

4 One method by which this is done is through the encoder-decoder model. See Rothwell et al. (2023, 101-106) and Forcada (2017) https://www.dlsi.ua.es/gent/mlf/docum/forcada17j2.pdf.

5 Forcada (2017): https://www.dlsi.ua.es/gent/mlf/docum/forcada17j2.pdf

6 Schuster, M. (November 22, 2016). https://research.google/blog/zero-shot-translation-with-googles-multilingual-neural-machine-translation-system/

7 For example, OpenAI’s DALL-E for images (https://openai.com/index/dall-e-2/) and Sora for video (https://openai.com/index/sora/).

8 Harby, A., writing for Slator (2024). https://slator.com/resources/can-i-use-chatgpt-for-translation/ 

9 OpenAI (2024). https://help.openai.com/en/articles/8357869-how-to-change-your-language-setting-in-chatgpt#h_513834920e

10 See “Google Translate” in Wikipedia: https://en.wikipedia.org/wiki/Google_Translate

11 See also Callum Walker’s book Translation Project Management, published in December 2022.

12 Trados, for instance, very recently released its “generative translation engines” (GTEs) and a “Trados CoPilot-AI Assistant” in its 2024 version of Trados Studio. (Brockmann, D., writing for the May 2nd Trados blog, updated on June 12th: https://www.trados.com/blog/new-generative-translation-capabilities-in-trados-studio/.) See also Valeria Cannavina’s July 8, 2022 entry on the RWS blog on how to prepare project managers for a “future with linguistic AI”: https://www.rws.com/blog/empowering-localization-project-managers-with-linguistic-ai/

13 Bengio helped draft the “Montréal Declaration for a Responsible Development of Artificial Intelligence” (https://montrealdeclaration-responsibleai.com/the-declaration/) and has been actively voicing his concerns (https://www.cbc.ca/listen/live-radio/1-63-the-current/clip/16092371-does-yoshua-bengio-regret-helping-create-ai).

14 See O’Brien and Ehrensberger-Dow (2020) and Bowker (2024) in Cited References.

15 See Kenny (2022) in Cited References.

16 See Melby and Lester in Cited References.

17 See Sinitsyna in Cited References.

18 See Nunes Vieira et al. in Cited References.

19 See Quelle et al. in Cited References.

20 See O’Brien in Cited References.

