Nowadays, machine translation (MT) is at everyone’s fingertips, as services like DeepL or Google Translate make the process seem quick and effortless. However, when it comes to emotions, artificial neural networks cannot guarantee the accuracy of their translations, as nuances in context and feelings go well beyond understanding the meaning of individual words.
Institut für maschinelle Sprachverarbeitung (IMS), University of Stuttgart, Germany
Emotions are a central aspect of successful communication. This makes evaluating the capacity of machines to distinguish emotional nuances critical in determining the accuracy of translations produced by artificial intelligence. This question is arguably relevant for the professional and casual user alike, especially if the latter has a limited grasp of the target language and might even struggle with recognizing changes in emotional content when they happen.
Current machine translation systems are based on machine learning frameworks known as artificial neural networks, hence the name neural machine translation (NMT). These systems are built from large collections of texts and their respective translations, from which they learn to extrapolate useful correspondences. Unlike the correspondences used by older systems, these are not based on rules but take word meaning into account. To “grasp” meaning, the system creates a language-independent “semantic space” (called “interlingual representation”) to bridge the gap between the two languages.
This strategy has led to a huge improvement in usefulness: NMT models cope much better than previous approaches with the vast number of language pairs and text types that one might wish to translate. However, unsurprisingly, meaning is very hard to describe comprehensively, and the way in which MT systems capture it is rather basic. In fact, when such systems are trained, success is generally measured simply in terms of word-by-word matches with the human reference translation.
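To see why a score based purely on word matches is blind to emotional connotation, consider the following minimal sketch. It assumes a deliberately simplified measure (plain unigram precision), which is not claimed to be the metric used by any particular MT system; the function name `unigram_precision` and the example sentences are hypothetical.

```python
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate words that also occur in the reference.

    Counts are clipped so a repeated word cannot match more often
    than it appears in the reference.
    """
    cand_words = candidate.lower().split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    matched = 0
    for word, count in Counter(cand_words).items():
        matched += min(count, ref_counts[word])
    return matched / len(cand_words)


# Hypothetical reference and two candidate translations.
reference = "we enjoyed the feast"
mild = "we enjoyed the meal"    # weaker emotional intensity
harsh = "we enjoyed the slop"   # different emotion altogether

score_mild = unigram_precision(mild, reference)
score_harsh = unigram_precision(harsh, reference)
print(score_mild, score_harsh)
```

Both candidates differ from the reference by exactly one word, so a measure of this kind assigns them the same score, even though one merely weakens the emotion and the other replaces it.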
The process described above shows that systems have no explicit concept of affective aspects such as emotions. Emotions are translated faithfully only if the system manages to choose words with the correct emotional connotation. However, only a minority of the words in a language are emotional, and computational models tend to struggle with infrequent phenomena. It would be reasonable to think, therefore, that MT systems might produce translations that either reduce emotional intensity (feast → meal) or even change emotions (feast → slop). To find out if this is true, the authors of this article decided to conduct a study.[1]
The study adopted the standard psychological viewpoint of treating the basic emotions as a small set of discrete categories, which—according to American psychologist Paul Ekman—are anger, disgust, fear, guilt, joy, sadness, and shame. The team worked with an existing collection of about 7,000 emotional short texts that were self-reports of events associated with specific dominant emotions. The texts were then translated using a state-of-the-art, freely available machine translation system. Once the system-produced translations were ready, the researchers measured the emotional intensity with an emotion classifier, a machine-learning-based module that analyzes the intensity of the different Ekman emotions in a given text, making it possible to analyze the changes in emotional content brought about by the NMT system.
The outcome of the study validates its hypothesis: the original emotions remain mostly preserved, but their intensities decrease to some extent, and other emotions start to appear. Put differently, the emotional content of the translations tends to become somewhat blurred. Of course, this effect manifests itself in each sentence. For example, the sentence used to indicate guilt, “Feeling guilt after greed, buying chocolate and pigging out to the point of feeling sick,” was translated into “Feelings of greed, buying chocolate and exploitation to the point of nausea.” The translation omits the explicit emotion marker “guilt” and replaces the emotional “pigging out” with the more neutral, and rather ill-fitting, “exploitation.” This leaves the emotional cue “nausea,” arguably shifting the dominant connotation from guilt to disgust.
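The kind of comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the study's actual code: it presumes that some emotion classifier has already assigned an intensity between 0 and 1 to each emotion, the helper names `intensity_shift` and `dominant` are hypothetical, and the numbers are invented for illustration only and are not results from the study.

```python
EMOTIONS = ("anger", "disgust", "fear", "guilt", "joy", "sadness", "shame")


def intensity_shift(source: dict, translation: dict) -> dict:
    """Per-emotion change in intensity (translation minus source)."""
    return {e: translation.get(e, 0.0) - source.get(e, 0.0) for e in EMOTIONS}


def dominant(intensities: dict) -> str:
    """Emotion with the highest intensity (missing emotions count as 0)."""
    return max(EMOTIONS, key=lambda e: intensities.get(e, 0.0))


# Invented classifier scores, loosely mirroring the chocolate example:
# guilt fades in the translation while disgust grows.
source_scores = {"guilt": 0.8, "disgust": 0.3, "shame": 0.2}
translated_scores = {"guilt": 0.4, "disgust": 0.5, "shame": 0.2}

shift = intensity_shift(source_scores, translated_scores)
for emotion in EMOTIONS:
    print(f"{emotion:8s} {shift[emotion]:+.2f}")

print("dominant before:", dominant(source_scores))
print("dominant after: ", dominant(translated_scores))
```

A negative shift for the original dominant emotion combined with positive shifts elsewhere is what the article calls blurring; a change in the dominant label corresponds to the stronger case in which the emotion itself changes.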
A general solution to this problem calls for fundamental changes to the machine translation mechanism, such as taking emotion preservation into account. However, the study also found a straightforward heuristic. Under the hood, NMT systems create numerous translation candidates, but usually only the one deemed to be the best is returned to the user. With a system that gives the user more general access, for example, to the best 20 translation candidates, an emotion classifier can be used to select the candidate whose emotion intensity corresponds best to the input text. This simple procedure reverses the blurring effect of the translation on emotional content. However, depending on the sentence, this may involve a trade-off between emotion preservation and other desirable properties of the text, such as syntactic well-formedness—after all, there is typically a reason why the candidate with the best emotion match was not top-ranked by the translation system.
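The candidate-selection heuristic can be sketched in a few lines. Everything here is an assumption made for illustration: the keyword lexicon `TOY_LEXICON` and `toy_classify` are a hypothetical stand-in for a trained emotion classifier, the distance measure (sum of absolute differences) is one plausible choice rather than the one used in the study, and the candidate sentences are made up rather than real system output.

```python
from typing import Callable, Dict, List

EmotionScores = Dict[str, float]

# Hypothetical keyword lexicon standing in for a trained emotion classifier.
TOY_LEXICON: Dict[str, EmotionScores] = {
    "guilt": {"guilt": 1.0},
    "guilty": {"guilt": 1.0},
    "nausea": {"disgust": 1.0},
    "sick": {"disgust": 0.5},
}


def toy_classify(text: str) -> EmotionScores:
    """Return the strongest lexicon weight found for each emotion."""
    scores: EmotionScores = {}
    for word in text.lower().replace(",", " ").replace(".", " ").split():
        for emotion, weight in TOY_LEXICON.get(word, {}).items():
            scores[emotion] = max(scores.get(emotion, 0.0), weight)
    return scores


def emotion_distance(a: EmotionScores, b: EmotionScores) -> float:
    """Sum of absolute per-emotion differences (an assumed distance)."""
    return sum(abs(a.get(e, 0.0) - b.get(e, 0.0)) for e in set(a) | set(b))


def select_by_emotion(
    source_scores: EmotionScores,
    candidates: List[str],
    classify: Callable[[str], EmotionScores],
) -> str:
    """Pick the candidate whose emotion scores are closest to the source.

    Candidates are expected in the translation system's own ranking order;
    min() keeps the earliest one on ties, so that ranking breaks ties.
    """
    return min(
        candidates,
        key=lambda c: emotion_distance(source_scores, classify(c)),
    )


source = ("Feeling guilt after greed, buying chocolate and pigging out "
          "to the point of feeling sick.")
candidates = [  # made-up n-best list, best-ranked first
    "Feelings of greed, buying chocolate and exploitation to the point of nausea.",
    "Feeling guilty after greed, buying chocolate and overeating to the point of feeling sick.",
]

best = select_by_emotion(toy_classify(source), candidates, toy_classify)
print(best)
```

In a real setting the source and the candidates are in different languages, so the sketch additionally assumes that classifier scores for both languages are comparable. Because ties fall back to the translation system's ranking, the top-ranked candidate is only overridden when a lower-ranked one is a strictly better emotional match.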
The study concluded that current NMT systems do a reasonable job of translating emotional content but are still far from perfect. Casual users need to be aware that translations are approximate and should double-check them as far as their proficiency in the target language permits. Professional users should carefully review the choice of words and look out for subtle emotional shifts, as some pairs of emotions (including guilt and shame, as well as disgust and anger) are quite easy to confuse, not just for computers, but also for human beings.
Sebastian Padó is Chair of Theoretical Computational Linguistics at IMS, University of Stuttgart, Germany. His research focuses mostly on modelling and analyzing how meaning is conveyed in language, often in a multilingual setting. Such models find application in translation, semantic processing, digital humanities, and social sciences.
Enrica Troiano is a PhD student at IMS, University of Stuttgart, Germany. Her research focuses on the interplay between the automatic treatment and theoretical models of emotion phenomena and stylistic aspects of language.
Roman Klinger is Research Group Leader at IMS, University of Stuttgart, Germany. His goal is to enable computers to understand language with regard to both factual and non-factual information, with a focus on emotion and sentiment analysis and on modelling psychological concepts as they are realized in written language.