The Next Revolution in Conference Interpreting
The glory days of globetrotting for simultaneous interpreters are slowly fading. You no longer necessarily need to fly in a linguist specializing in motorcycle engines or clinical psychiatry for a conference. Over the last few years, remote simultaneous interpreting (RSI) has enabled interpreters, speakers, and attendees to participate in events while off-site. Technology vendors in this area include names such as Interactio, Interprefy, Kudo, Olyusei, and ZipDX. The list of RSI vendors we track has quintupled in the last three years, a sign of developers’ high expectations for mass adoption of this technology.
2019 is bringing yet another disruption to the conference interpreting market through real-time on-demand language access powered by automated speech technologies and machine translation, which together produce machine-based interpretation. wordly, a US-based startup, recently launched a product bound to make some people who derive revenue from interpreting services rather nervous. We talked to CEO & Founder Lakshman Rathnam and COO Kirk Hendrickson to learn more and test-drive the system.
Machine interpreting, sometimes also referred to as “spoken translation,” involves converting speech to text with a speech recognition system, passing that text through machine translation (MT), and rendering the translated text as audio through text-to-speech (TTS) synthesis. wordly’s solution enables real-time automated translation – one-to-one or one-to-many – between any of 13 supported languages. The mobile or web app lets you view the transcript in the source language, read the translation in one of the available languages, and hear the source or translation spoken through voice synthesis.
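The cascade described above can be sketched in a few lines of code. This is a minimal illustration only: the `recognize`, `translate`, and `synthesize` components below are hypothetical stubs standing in for real ASR, MT, and TTS engines, and the class and field names are our own, not wordly’s API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadePipeline:
    """Cascaded machine interpreting: speech recognition -> MT -> speech synthesis."""
    recognize: Callable[[bytes], str]       # ASR: audio frames -> source-language text
    translate: Callable[[str, str], str]    # MT: (source text, target language) -> translated text
    synthesize: Callable[[str], bytes]      # TTS: translated text -> audio

    def interpret(self, audio: bytes, target_lang: str) -> dict:
        source_text = self.recognize(audio)
        translated = self.translate(source_text, target_lang)
        return {
            "transcript": source_text,              # readable in the source language
            "translation": translated,              # readable in the target language
            "audio": self.synthesize(translated),   # audible via voice synthesis
        }


# Stub components for illustration; a real system would plug in ASR/MT/TTS services.
demo = CascadePipeline(
    recognize=lambda audio: "welcome to the conference",
    translate=lambda text, lang: {"fr": "bienvenue à la conférence"}.get(lang, text),
    synthesize=lambda text: text.encode("utf-8"),  # placeholder for synthesized audio
)

result = demo.interpret(b"<audio frames>", "fr")
```

Because each stage feeds the next, an error in recognition propagates through translation and synthesis, which is why quality varies across components, a point we return to below.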
Our usual reaction when we hear about such integrations is to assume that language quality issues will block any real-life application by conference organizers. Why? Because the multiple technologies that underpin machine interpreting do not deliver consistent quality across all components. However, the proof is in the output. We put the product to the test and the results were pleasantly surprising. Of course, the system made mistakes – and some were certainly comical. Yet the software adequately conveyed the meaning of the spoken English (the source) into French and Russian.
When is adequate good enough in the world of interpreting? Consider some examples:
- An Italian attendee at an English-only conference struggles to follow the details of a complex presentation due to insufficient fluency in English. Imagine that she can now read a real-time Italian transcript of the session on her mobile. Information loss is reduced, and she reaps much greater benefit from the conference. Even if the transcript is not perfect, it is far better than no interpretation at all, the unfortunate reality of too many conferences.
- An event organizer plans for simultaneous interpreters in Spanish and French, the languages of most foreign participants. However, attendees from Japan, Poland, and Turkey are left with no language support. In this case, imagine a Japanese attendee being able to listen to the spoken translation on his phone. Attending the conference becomes possible, even if some details are lost. And the organizer does not break the bank supporting a handful of attendees from countries where no human interpreter was planned.
- A market research analyst relies on a local language specialist to interview consumers, but has to wait several days for a translated transcript of sessions in languages he doesn’t know. Imagine the analyst being able to follow the dialogue in real time and ask follow-up questions on the spot. The interview is bound to yield richer details relevant to the research project.
You can easily envision other similar cases. The real value here is not in replacing qualified expert interpreters, but in providing language access in the many cases where no interpretation is offered at all. According to Lakshman, “the biggest competitor for the offering is not interpreters but conference organizers not doing anything at all and pretending it’s not a problem.”
wordly may be the first to offer machine interpreting designed for meetings large and small, but we expect a lot more activity on this front in the next few years. Machine transcription is already commonplace with providers such as Rev.com, Temi, or Trint. It is even used in computer-assisted interpreting systems such as the one from InterpretBank. Real-time transcription opens the door to a broad range of applications that augment the work of interpreters or the understanding of conference attendees.
The speech recognition component will only get better with increased investment in supporting conversational interfaces for mobile, IoT, and other mainstream devices. Amazon, Google, iFlytek, Microsoft, Nuance, Philips, and a host of AI-based start-ups will whack away at the challenges of speech such as non-verbal communication, irony, or sidebar descriptions of what’s on the slides, in English and every other language.
About the Author