Ben Poor

Translation and transcription tools have become commonplace at broadcaster facilities, as journalists and programme-makers make use of and produce content in multiple languages and for multiple purposes. In a globalized world, the ability to shift seamlessly from one language to another helps to ensure the flow of news, culture and entertainment.

Typically, EBU Members use language tools from different vendors to fulfil different use cases. Some tools are stronger in a given language, others for a particular type of content, and the quality and performance of these machine-learning-based tools vary over time as models are updated and refined.

Integration costs and lock-in

One of the challenges of using multiple tools from multiple vendors is that each one involves integration into your existing production workflows. This leads to added costs over and above the usage fees for the tool itself, and a tendency towards vendor lock-in.

Fortunately, the EBU has developed a solution for this problem.

EuroVOX is an open toolkit that allows a single integration with multiple language tool vendors. It consists of a core layer that serves as a single open API for machine-learning-based language tools, along with additional tools that provide a tailored interface for specific production tasks.

For EBU Members, the EuroVOX core layer is already freely available for deployment. It is currently being deployed by IRT and the Eurovision News Exchange on a trial basis, with more members interested in either using an EBU-hosted service or running it on their own infrastructure.

Both code and Docker images are available to deploy the EuroVOX core layer in the cloud or entirely on-premises. Adapters for several vendors are already available, meaning you need only add your credentials to start using them. Alternatively, writing adapters for additional vendors or your own technology is straightforward.
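To illustrate the adapter idea, here is a minimal Python sketch of how a single core layer might route requests to interchangeable vendor adapters. It is not the actual EuroVOX API: the names TranscriptionAdapter, EchoVendorAdapter and CoreLayer, the method signatures and the placeholder credential are all assumptions made for illustration, and the stand-in vendor returns a canned string rather than calling a real service.

```python
from abc import ABC, abstractmethod


class TranscriptionAdapter(ABC):
    """Hypothetical adapter interface; not the actual EuroVOX API."""

    @abstractmethod
    def transcribe(self, audio: bytes, language: str) -> str:
        """Return a transcript of the audio in the given language."""


class EchoVendorAdapter(TranscriptionAdapter):
    """Stand-in for a real vendor: returns a canned string, calls no service."""

    def __init__(self, api_key: str):
        # A real adapter would use the credential to authenticate with the vendor.
        self.api_key = api_key

    def transcribe(self, audio: bytes, language: str) -> str:
        return f"[{language}] {len(audio)} bytes transcribed by echo vendor"


class CoreLayer:
    """Single entry point that routes each request to a registered adapter."""

    def __init__(self):
        self._adapters = {}

    def register(self, name: str, adapter: TranscriptionAdapter) -> None:
        self._adapters[name] = adapter

    def transcribe(self, vendor: str, audio: bytes, language: str) -> str:
        if vendor not in self._adapters:
            raise KeyError(f"no adapter registered for vendor {vendor!r}")
        return self._adapters[vendor].transcribe(audio, language)


# The production workflow integrates once, with the core layer only.
core = CoreLayer()
core.register("echo", EchoVendorAdapter(api_key="placeholder-credential"))
result = core.transcribe("echo", b"\x00" * 16, "fr")
print(result)
```

In a design like this, supporting a new vendor means writing one more adapter class and registering it; the calling code in the production workflow does not change.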

The advantage of implementing EuroVOX on your own infrastructure, even if you use a limited set of vendors, is that it gives you the freedom to change vendors easily at a later stage. For example, if your organization decides to start producing content in a new language that requires a new vendor, you can change or add providers at the push of a button. A one-time integration of EuroVOX avoids the need for additional integrations of other tools later on.

EuroVOX to AudioVOX

Other tools available in the project include AudioVOX (see screenshot above), which uses the core layer to provide a web-based tool for easily transcribing and translating audio and video content. This makes it possible to ingest video content, transcribe it using a choice of vendors, add multiple transcriptions using different vendors and re-render the video in a choice of outputs. These include a re-spoken audio track, burned-in subtitles in the original language, or embedded subtitles in multiple translated languages – a versatile tool for making content truly multi-lingual.

AudioVOX also allows an editor to rapidly correct the transcription, the translations, or even the timing of sentences. This ease of use increases the amount of content that can be transcribed and translated. Additionally, the tool is designed to be integrated into existing production workflows.

EuroVOX users can also access regularly scheduled benchmarking to compare the performance of different vendors and automatically propose the right vendor for the right task.
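As a rough illustration of how a benchmark could propose a vendor, the Python sketch below scores each vendor's transcript against a human reference and picks the lowest score. The article does not say which metric EuroVOX benchmarking uses; word error rate is assumed here because it is a common measure for transcription, and the vendor names and sentences are invented for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def propose_vendor(reference: str, outputs: dict) -> tuple:
    """Return the vendor with the lowest word error rate, plus all scores."""
    scores = {vendor: word_error_rate(reference, text)
              for vendor, text in outputs.items()}
    return min(scores, key=scores.get), scores


# Invented sample data: one reference sentence and two vendor transcripts.
reference = "the quick brown fox jumps"
outputs = {
    "vendor_a": "the quick brown fox jumps",
    "vendor_b": "the quick brown box jump",
}
best, scores = propose_vendor(reference, outputs)
print(best, scores)
```

A real benchmark would average over many clips per language and content type, which is what would let it recommend different vendors for different tasks.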

The roadmap for the project includes adding real-time transcription and speaker diarization, as well as methods to enhance tools like AudioVOX with collaborative editing – allowing teams of producers to work on the best transcription and translation for content.

To learn more about EuroVOX, to try it out for yourself, or to join the collaboration visit:


This article was first published in issue 45 of tech-i magazine.
