Our Analysts' Insights

08 Feb

Zero-Shot Translation Is Both More and Less Important Than You Think

Recent advances in neural machine translation (NMT) represent a significant step forward in machine translation capabilities. Although most media coverage has significantly oversold the technology, one of Google’s announcements may actually be the most important one in the long run: the first successful deployment of zero-shot translation (ZST).

Just what is zero-shot translation? It is the capability of a translation system to translate between arbitrary languages, including language pairs for which it has not been trained. 

To understand why this is important, consider how traditional statistical machine translation (SMT) systems work. They build up bilingual phrase tables that correlate text in two languages. Because they connect individual tongues on a one-to-one basis, these systems require training data and a separate customized instance of the MT system for each pair. They cannot translate between two languages for which no engine exists unless they can use a shared third language – called the pivot. 

For example, if a system needs to translate from Finnish into Greek, but has no Finnish↔Greek training data, it might use a Finnish↔English engine to translate into English – the pivot – and then use a separate English↔Greek one to arrive at the target. Although this approach can produce readable output, the results are often unreliable and inferior because errors in the first language pair tend to compound in the second one. 
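The pivot approach can be sketched in a few lines of Python. This is a toy illustration only: the lookup tables, the function names (`make_engine`, `pivot`, `chain_accuracy`), and the assumption that each leg's errors are independent are all hypothetical and are not taken from any real MT system.

```python
# Toy word-level "engines" (hypothetical lookup tables, not real MT systems).
# The Finnish word "kuusi" can mean "six" or "spruce"; this table always picks "six".
FI_EN = {"kissa": "cat", "koira": "dog", "kuusi": "six"}
EN_EL = {"cat": "γάτα", "dog": "σκύλος", "six": "έξι"}


def make_engine(table):
    """Wrap a lookup table as a translate(word) function."""
    def translate(word):
        # Unknown words are passed through unchanged.
        return table.get(word, word)
    return translate


def pivot(first_leg, second_leg):
    """Chain two engines: source -> pivot language -> target."""
    return lambda word: second_leg(first_leg(word))


def chain_accuracy(*leg_accuracies):
    """Rough estimate of end-to-end accuracy, assuming independent errors per leg."""
    result = 1.0
    for accuracy in leg_accuracies:
        result *= accuracy
    return result


fi_to_el = pivot(make_engine(FI_EN), make_engine(EN_EL))

print(fi_to_el("kissa"))  # γάτα
# If the Finnish text meant "spruce", the first leg has already lost that sense,
# and the second leg faithfully translates the wrong English word.
print(fi_to_el("kuusi"))  # έξι
print(round(chain_accuracy(0.9, 0.9), 2))  # 0.81
```

The second engine never sees the Finnish source, so it cannot repair a wrong choice made in the first leg. Under the independence assumption, two legs that are each 90% accurate yield roughly 81% end to end.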



By contrast, Google’s neural system feeds all training data into one engine, which allows it to build connections across multiple languages rather than individually between them. If no data exists for a particular pair, the software can infer translations from what it has learned from other pairs that do contain relevant data. To return to the previous example, if no useful data exists for the Finnish→Greek case, it would draw on correlations observed between other languages to produce output. The output is unlikely to be as good as in cases where in-pair data exists, but it is better than nothing and it helps fill gaps in the more than 3,000 language pairs that Google’s 113 languages can produce.
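One published way to train a single engine on many pairs, described in Google's multilingual NMT research rather than in this article, is to prefix each source sentence with an artificial token that names the target language. The sketch below shows only that data-preparation step and then lists the directions that would have to be handled zero-shot. The corpus rows and the function names (`tag_example`, `zero_shot_directions`) are hypothetical, and no model is trained here.

```python
from itertools import permutations

# Hypothetical parallel data: (source_lang, target_lang, source_text, target_text).
CORPUS = [
    ("fi", "en", "Hyvää huomenta", "Good morning"),
    ("en", "fi", "Good morning", "Hyvää huomenta"),
    ("en", "el", "Good morning", "Καλημέρα"),
    ("el", "en", "Καλημέρα", "Good morning"),
]


def tag_example(src_lang, tgt_lang, src_text, tgt_text):
    """Prefix the source with a target-language token such as '<2el>'."""
    return f"<2{tgt_lang}> {src_text}", tgt_text


def zero_shot_directions(corpus):
    """List directions among the corpus languages that have no direct training data."""
    languages = sorted({lang for src, tgt, _, _ in corpus for lang in (src, tgt)})
    trained = {(src, tgt) for src, tgt, _, _ in corpus}
    return [pair for pair in permutations(languages, 2) if pair not in trained]


# All pairs go into one shared training set for one engine.
training_set = [tag_example(*row) for row in CORPUS]

print(training_set[0])               # ('<2en> Hyvää huomenta', 'Good morning')
print(zero_shot_directions(CORPUS))  # [('el', 'fi'), ('fi', 'el')]
```

At inference time, a Finnish sentence prefixed with `<2el>` would request Greek output even though no Finnish↔Greek rows appear in the training set.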

Two crucial aspects stand out: 

  1. Google’s system does not use a pivot. It does not translate from one language to another and then feed its output back in to reach the third. Contrary to Google’s press announcements and tech bloggers, the system did not invent its own language (much less one called “Interlingua,” a term that has a specific meaning in MT research) that serves as a pivot. Instead, it uses all available data to move directly from one language to another.
     
  2. It can leverage data from multiple language pairs. Unlike the pivot scenario where only one language pair at a time participates in the translation, Google’s system can potentially use as many language pairs as contain relevant data at the same time. The result is better than trying to bridge a gap using a single intermediary.

Why is ZST important? In a field often driven by hype and hyperbole, this may be the rare case where a development is more important than media coverage makes it out to be. The biggest benefit comes for under-resourced language pairs such as Finnish↔Greek that remained stubbornly out of reach for SMT systems.

These benefits are likely to be especially important in the European Union, which faces ongoing difficulty in providing access to its institutions through all of its 24 official and working languages. Covering all of them would require 276 bidirectional engines (one for each of the 24 × 23 / 2 language pairs) with SMT technology, but the European Commission does not have sufficient training data for most of these pairs. It had planned to create training data for some pairs and use pivot translation for others. Zero-shot systems should produce better results at a much lower cost and help the EU address its language-blocking problems.
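The number of engines grows quadratically with the number of languages, which a short calculation makes concrete. This is a minimal sketch: the function name `engine_counts` is made up for illustration, and it assumes one engine per pair with no pivoting.

```python
from math import comb


def engine_counts(n_languages):
    """Return (bidirectional, one_directional) engine counts for full pairwise coverage."""
    bidirectional = comb(n_languages, 2)              # unordered pairs: n * (n - 1) / 2
    one_directional = n_languages * (n_languages - 1)  # ordered pairs: n * (n - 1)
    return bidirectional, one_directional


print(engine_counts(24))  # (276, 552)
```

For the EU's 24 languages this gives 276 bidirectional engines, or 552 if each direction needs its own engine.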

Google’s development significantly raises the bar for machine translation. But Google needs to be careful not to characterize the system’s accomplishments in ways that may play well with the tech press but that ultimately oversell its “cool factor” while underselling what is truly disruptive.

About the Author

Arle Lommel

Senior Analyst

Focuses on language technology, artificial intelligence, translation quality, and overall economic factors impacting globalization
