Human Vs. AI Interpreting – a Real-Life Comparison
For the last 10 years, I have written hundreds of pages of research on interpreting in its various forms. I have personally tested countless interpreting management systems (IMSes) and interpreting delivery platforms (IDPs), including over-the-phone (OPI), video remote (VRI), and remote simultaneous interpreting (RSI), as well as a broad range of automated interpreting systems. Yet when running demos of these products, I usually knew the language used to show how the technology works. Even in the few instances when I didn't, the session was so short that I never had a chance to feel any frustration at language limitations. I had never needed to rely on interpreting the way a true user does, in a situation where they don't know the language at all.
Recently, I had the unique opportunity to attend a conference where I relied entirely on interpreting to navigate presentations, discussions, and interactions. The sessions combined simultaneous interpreting for keynotes with automated interpreting for breakout discussions – offering a firsthand look at two of the most prominent modalities shaping the interpreting landscape today. Despite some quick language-learning attempts before the trip, I arrived as a true outsider to the language, experiencing the technology just as a typical end-user might.
Perceptions of Human Interpreting
The first part of the conference relied on human conference interpreters, which, in my mind, was the best-in-class service for such events. My experience, however, dampened that impression:
- I couldn’t hear the interpretation well. Attendees were provided with a standard earpiece receiver. Once I figured out how to put the earpiece in and presenters started to talk, I realized that the volume of the source language coming in from the loudspeakers completely overpowered the audio from my earpiece. Cranking up the volume on the earpiece would just hurt my ears, so I stopped that. I missed a tremendous amount of the interpretation because what I really needed was an earplug in my other ear to block out the source language.
- The quality of the interpretation was subpar. The organization relied on volunteers who clearly became overwhelmed by how technical the conference content was, especially with the breakneck pace of the speakers. The interpreters did a good job with content I already knew well. But any time a presenter covered something new, they faltered, and I ended up missing exactly the points that would have been new to me.
- Interpreter preparation didn’t seem to make much difference. I sent my slides to the interpreters ahead of time and even checked in with them before the conference started to ask whether they needed any clarifications. They reported being ready. But when I later asked conference attendees who had listened to the interpretation of my session, it seemed to have been hard to follow. The feedback on my content in the target language was average, yet those who listened to me present in English had rave reviews.
- I don’t trust my notes. As an analyst, I find the numbers people cite the most interesting part of a presentation. Interestingly, the interpreters spoke numbers with less confidence than the rest of the text, lowering their voices and hesitating. This left me unsure how accurate those numbers were, and I ultimately decided not to retain any of them from a research standpoint; I simply couldn’t be sure the numbers I heard were error-free.
- Earpieces are not practical when you present. Once it was my turn to present, I had a microphone in one hand and a clicker in the other, so juggling the interpretation receiver as well was not realistic. I left it on the lectern while presenting, which would not have been ideal had handling questions been part of the exercise.
- Interpreters just handle the session. There is more to a conference than presentations. I found myself very limited in my ability to network with attendees, especially as few of them spoke English. I wish I had had a personal interpreter for the breaks, or for talking with vendors on the tradeshow floor, whose marketing collateral was entirely in a language I don’t speak or read and whose booths rarely had English-speaking staff. This communication challenge reduces the benefit of attending a conference in another language.
Perceptions of Automated Interpreting
The rest of the conference relied on automated interpreting, a subject I am well versed in after nearly a year working on the topic (“What Language Access Teams Must Know about Automated Speech-to-Speech Interpreting”).
- Conference organizers didn’t tell us to bring earbuds. And alas, earbuds were not on my packing list. There was no option to buy earbuds on site. This meant that I was never able to listen to any of the interpretation and instead I relied fully on the real-time translated transcript.
- Having a charged device requires planning. My cell phone is nearing the end of its life and cannot handle hours of interpretation on one charge. So I had to juggle battery packs, which luckily had made my packing list, in order not to miss content. It’s not a huge deal, but it is an inconvenience as an attendee.
- You can’t use your phone for other things at the same time. If you’re busy reading subtitles, you can’t go check when your next appointment is or read comments in the event group chat. Multitasking is not an option.
- Results were all over the place. Some sessions produced very coherent output, and I could follow the presentation well. When one presenter spoke in a language that wasn’t his native one, the AI botched his speech. Other presenters didn’t quite finish their sentences, which resulted in incomprehensible translations. The most entertaining moment came when I giggled as a presenter appeared to start talking about “porn”; when I saw that no one else was laughing, I realized it was a translation mistake. The presenter later confirmed she had been talking about “pork” and had no clue what could have led the AI to get it wrong.
- The AI platform had an unexpected perk. At times, I would leave conference sessions for a meeting with a client. I discovered that if I didn’t close the session, the transcript kept building up in the background. Once I was done with my meeting, I was able to scroll through the translated transcript to catch up on missed session content, which proved to be a bonus for me.
- Taking notes was easier. Because I could scroll through the transcript, I was able to re-read sections of interest to take notes. However, the quality of the source speech made a big difference in whether the transcript was usable for notes: sentence segments were often hard to piece together. The app also seemed to struggle quite a bit with numbers, leaving me no better off than I had been with the human interpreters.
- Slides remained an issue. While some apps enable you to snap a picture and see an automated slide translation, the app used at this event didn’t offer that option. Presenters tended not to repeat what was on their slides, only adding extra information, which left a big gap for me and reinforced the need to train presenters too.
- The app was designed only for session content. Just as with the human interpreters, the benefit did not extend past the speeches. The conference app did not include a conversational component to help attendees talk to others during breaks.
The Bottom Line
Experiencing interpreting the way it is meant to be used was enlightening. I uncovered new hurdles to account for from a research standpoint. At the end of the day, neither human nor automated interpreting left me satisfied. I feel I missed the most interesting aspects of the conference, and my notes are not very actionable because I cannot fully trust them.
Now, of course, different human interpreters or a different AI app could have changed the outcome. But as an attendee, you must make do with the tools provided; you have no control over the volume of the source language, the caliber of the chosen professionals, or even the pace of the presenters being interpreted.
If you asked me to choose the better experience at this event with its dual communication modalities, I would have to pick automated subtitling: the inability to properly hear the human interpreting was a deal breaker. Other foreign attendees commented that they wished the AI app had also been made available for the keynotes so they could at least have followed what was said.
What I cannot rule on is what I would have preferred had the sound level been better managed and interpreters more experienced and qualified for the job. The jury remains out on that.
Photo credit: Adil Chelebiyev
About the Author
Director of LSP Service
Focuses on LSP business management, strategic planning, sales and marketing strategy and execution, project and vendor management, quality process development, and interpreting technologies