Enhancing EuroVOX with real-time translation and smarter AI support

Benjamin Poor (EBU)

The original vision for EuroVOX was to create a single point through which EBU members could access external transcription, translation and voice synthesis technologies for use in their own workflows. This was achieved: the service is in regular use by nearly 40 members, is a key part of the European Perspective project, and is integrated into EBU services – for example, translating Media Intelligence Service reports and in the News Exchange. It is also used to make EBU gatherings multilingual, with translated captions in each member language and synthetically spoken captions accessible through a headset – as demonstrated at the recent Production Technology Seminar.

So, what’s next?

We discovered quite quickly that easy access to different external vendors is only the first step. The real value lies in the workflows that run on top of this. Our first workflow – the EuroVOX Studio – was intended as a demonstrator, but it is now the most visible part of the project and in production use across the membership. It’s the focus of most development work and will be the central hub of new features in the coming period.

File-based plus real-time

The ability to transcribe and translate in real time is valuable both for unlocking breaking news in different languages and for more fluid workflows that need to receive and publish live content in different languages.

Adding real-time transcription to the Studio, with simultaneous translation to a huge range of languages, is a priority. Having the ability to follow a news conference translated into your own language is useful for a journalist. Being able to clip out the media, transcribed and translated, helps get the story published more quickly.

Making the experience as straightforward as the current one for file-based content is the key. Making it possible to do this from anywhere – streams, meetings, laptop or phone – means it can be used by anybody.

Many voices and languages

One challenge we must face is that much content is not in a single language. For file-based content this presents fewer technical challenges beyond ensuring that the user interface remains intuitive and easy to use. For real-time content, it’s more difficult. The aim is to get to the point where EuroVOX can be deployed in an EBU meeting where participants need neither share a common language nor know in advance who will be speaking which language.

How can we remove the requirement for our EBU member colleagues to be proficient in English when joining our meetings?

While AI is already in use inside EuroVOX to provide the transcriptions, translations and synthetic voicing, we’ve been going further. Last year, we added an experimental feature that assesses translations and proposes corrections. The idea is to help a user focus on possible issues without having to scan the entire translation text.

Expanding on this with a more complete range of AI-based ‘assistants’ will help streamline the process of making media multilingual. Having the ability to ‘prompt’ EuroVOX or link it to other AI-based workflows could allow someone to perform several steps at once (e.g., “translate this into French and German and give me a subtitled copy of each”), or even to process the content in a more editorial way (e.g., “Give me all the parts of the content where the speaker is talking about foreign policy, translated into German”).

We want to make EuroVOX a part of the daily experience of all EBU members, whether for their own workflows or through their touchpoints with the EBU. By breaking language barriers, we can reach both each other and our audiences more effectively.

This article first appeared in the March 2026 issue of tech-i magazine.

 
