
08 Feb

Zero-Shot Translation Is Both More and Less Important than You Think

Recent advances in neural machine translation (NMT) represent a significant step forward in machine translation capabilities. Although most media coverage has oversold the technology, one of Google’s announcements may actually be the most important in the long run: the first successful deployment of zero-shot translation (ZST).

Just what is zero-shot translation? It is the capability of a translation system to translate between arbitrary languages, including language pairs for which it has not been trained. 

To understand why this is important, consider how traditional statistical machine translation (SMT) systems work. They build up bilingual phrase tables that correlate text in two languages. Because they connect individual languages on a one-to-one basis, these systems require training data and a separate customized instance of the MT system for each language pair. They cannot translate between two languages for which no engine exists unless they can use a shared third language – called the pivot.

For example, if a system needs to translate from Finnish into Greek, but has no Finnish↔Greek training data, it might use a Finnish↔English engine to translate into English – the pivot – and then use a separate English↔Greek one to arrive at the target. Although this approach can produce readable output, the results are often unreliable and inferior because errors in the first language pair tend to compound in the second one. 
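The two-hop process described above can be sketched in a few lines of Python. The `translate` function here is a hypothetical stand-in for a per-pair SMT engine, not a real API; it simply tags the text so the chaining is visible:

```python
def translate(text: str, src: str, tgt: str) -> str:
    """Hypothetical stand-in for a single per-pair SMT engine."""
    return f"[{src}->{tgt}]({text})"

def pivot_translate(text: str, src: str, tgt: str, pivot: str = "en") -> str:
    """Translate src -> tgt via a pivot language when no direct engine exists.

    Any errors introduced by the first engine are fed, uncorrected,
    into the second one -- which is why pivot output tends to degrade.
    """
    intermediate = translate(text, src, pivot)   # e.g. Finnish -> English
    return translate(intermediate, pivot, tgt)   # e.g. English -> Greek

print(pivot_translate("hyvää huomenta", "fi", "el"))
# Two chained engines: [en->el]([fi->en](hyvää huomenta))
```

Note that the second engine never sees the Finnish original, only the English intermediate – the structural reason errors compound.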

Figure: Pivot translation vs. zero-shot translation


By contrast, Google’s neural system feeds all training data into one engine, which allows it to build connections across multiple languages rather than only between individual pairs. If no data exists for a particular pair, the software can then use inferential logic to deduce correct translations from other pairs that do contain relevant data. To return to the previous example, if no useful data exists for the Finnish→Greek case, it would observe correlations between other languages to produce output. The output is not likely to be as good as for cases where in-pair data exists, but it is better than nothing, and it helps fill in gaps in the more than 3,000 language pairs that Google’s 113 languages can produce.
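One way to picture how a single engine serves every direction is the input convention Google described for its multilingual system: every training example is prefixed with an artificial token naming the desired target language, so one shared model learns all pairs at once. The sketch below is illustrative; the token format and function names are assumptions, not Google's actual implementation:

```python
def make_training_example(source: str, target: str, tgt_lang: str) -> tuple:
    """Prepend a target-language token so one shared model learns all pairs.

    Token format is an illustrative assumption (e.g. '<2en>' = 'to English').
    """
    return (f"<2{tgt_lang}> {source}", target)

# Training covers e.g. Finnish<->English and English<->Greek...
print(make_training_example("hyvää huomenta", "good morning", "en"))

# ...yet at inference the same model can be asked for Finnish -> Greek
# directly, with no pivot step, even if that pair never appeared in training:
zero_shot_input = "<2el> hyvää huomenta"
```

Because every direction flows through the same shared representation, asking for an unseen pair is just a new combination of inputs the model already understands, not a missing engine.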

Two crucial aspects stand out: 

  1. Google’s system does not use a pivot. It does not translate from one language to another and then feed its output back in to reach the third. Contrary to claims in Google’s press announcements and from tech bloggers, the system did not invent its own language (much less one called “Interlingua,” a term that has a specific meaning in MT research) to serve as a pivot. Instead, it uses all available data to move directly from one language to another.
     
  2. It can leverage data from multiple language pairs. Unlike the pivot scenario where only one language pair at a time participates in the translation, Google’s system can potentially use as many language pairs as contain relevant data at the same time. The result is better than trying to bridge a gap using a single intermediary.

Why is ZST important? In a field often driven by hype and hyperbole, this may be the rare case where a development is more important than media coverage makes it out to be. The biggest benefit comes for under-resourced language pairs such as Finnish↔Greek that have remained stubbornly out of reach for SMT systems.

These benefits are likely to be especially important in the European Union, which faces ongoing difficulty in providing access to its institutions through all of its 24 official and working languages. Covering all of them would require 276 bidirectional engines with SMT technology, but the European Commission does not have sufficient training data for most of these pairs. It had planned to create training data for some pairs and use pivot translation for others. Zero-shot systems should produce better results at a much lower cost and help the EU address its language-access problems.
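The combinatorics behind the EU figure are straightforward: with n languages, every unordered pair needs its own bidirectional SMT engine, which is n(n−1)/2 engines:

```python
def bidirectional_engines(n_languages: int) -> int:
    """Number of dedicated bidirectional engines to cover every language pair."""
    return n_languages * (n_languages - 1) // 2

print(bidirectional_engines(24))  # 276 engines for the EU's 24 languages
```

A single zero-shot system replaces all of them with one engine, which is where the cost argument comes from.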

Google’s development significantly raises the bar for machine translation. But Google needs to be careful not to characterize the system’s accomplishments in ways that may play well with the tech press but that ultimately oversell its “cool factor” while underselling what is truly disruptive.

About the Author

Arle Lommel

Senior Analyst

Focuses on language technology, artificial intelligence, translation quality, and overall economic factors impacting globalization.
