Predicting Human Translation Difficulty Using Automatic Word Alignment

Zheng Wei Lim, Trevor Cohn, Charles Kemp, Ekaterina Vylomova

Findings: Multilingualism and Cross-Lingual NLP Findings Paper

Session 4: Multilingualism and Cross-Lingual NLP (Virtual Poster)
Conference Room: Pier 7&8
Conference Time: July 11, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 11, Session 4 (15:00-16:30 UTC)
Spotlight Session: Spotlight - Metropolitan West (Spotlight)
Conference Room: Metropolitan West
Conference Time: July 10, 19:00-21:00 (EDT) (America/Toronto)
Global Time: July 10, Spotlight Session (23:00-01:00 UTC)
Keywords: multilingualism
Languages: malay, chinese
TLDR: Translation difficulty arises when translators are required to resolve translation ambiguity from multiple possible translations. Translation difficulty can be measured by recording the diversity of responses provided by human translators and the time taken to provide these responses, but these beha...
You can open the #paper-P2629 channel in a separate window.
Abstract: Translation difficulty arises when translators are required to resolve translation ambiguity from multiple possible translations. Translation difficulty can be measured by recording the diversity of responses provided by human translators and the time taken to provide these responses, but these behavioral measures are costly and do not scale. In this work, we use word alignments computed over large scale bilingual corpora to develop predictors of lexical translation difficulty. We evaluate our approach using behavioural data from translations provided both in and out of context, and report results that improve on a previous embedding-based approach (Thompson et al., 2020). Our work can therefore contribute to a deeper understanding of cross-lingual differences and of causes of translation difficulty.