Unlocking the potential of AI for data journalism

To mark the opening of nominations for the EBU Technology & Innovation Awards 2021, we're looking back at the projects honoured in 2020. France Télévisions' Data & AI team was a joint winner of the T&I Award 2020. This article first appeared in Issue 46 of tech-i magazine.

Matthieu Parmentier (France Télévisions)

At the start of 2020, France Télévisions decided to create a new department. Called DaIA – Data et Intelligence Artificielle – it consists of a team of about fifteen specialists: data scientists, data engineers, data analysts. It functions as a centre of excellence at the service of the company’s project teams, collaborating with all of France tv’s research units and subsidiaries.

The rise of AI tools has created powerful new possibilities for the classification and analysis of content. At a time when the Big Tech companies are developing ever more sophisticated AI-based offerings, it is critical for public service media organizations to equip themselves with their own resources. Such resources must be adapted to the needs of producing media content on a day-to-day basis. This was the driver behind the creation of the new department.

Political debates

The municipal elections that took place shortly after the creation of the DaIA team provided an opportunity for a first research project to explore how journalists could be empowered with new information and analysis.

As political campaigning began across 36,000 French municipalities, almost 200 debates were hosted on the regional channels of the France 3 network. The candidates took these opportunities to raise a very wide range of topics. The debates were thus ideal source material for testing a range of tools and techniques that would enable comparative analysis across thematic, geographic, political, and demographic axes. To be of maximum use to journalists and editors, the analysis needed to take place quickly.

Face recognition was combined with text recognition – applied to the ‘lower thirds’ or captions – to identify the speakers. Speech-to-text technology was used to conduct automatic transcription of what they said. Natural language processing was used to analyse the vocabulary, both to detect noteworthy terms and for categorization of topics.
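To make the idea concrete, here is a minimal sketch of how two independent guesses about who is on screen (one from face recognition, one from the caption text) might be merged, and how a transcribed sentence might be tagged with topics. It is an illustration only, not the DaIA team's implementation: the names Guess, identify_speaker, categorize and TOPIC_KEYWORDS, the 0.6 confidence threshold and the keyword lists are all assumptions, and a simple keyword lookup stands in for real natural language processing.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Guess:
    """One recognizer's opinion about who is speaking (hypothetical structure)."""
    name: str
    score: float  # confidence between 0 and 1


def identify_speaker(face: Optional[Guess],
                     caption: Optional[Guess],
                     min_score: float = 0.6) -> Optional[str]:
    """Merge the face-recognition guess with the lower-third caption guess."""
    # Discard missing guesses and guesses below the (assumed) threshold.
    guesses = [g for g in (face, caption) if g is not None and g.score >= min_score]
    if not guesses:
        return None
    if len(guesses) == 2 and guesses[0].name != guesses[1].name:
        # The two sources disagree: keep the more confident one.
        return max(guesses, key=lambda g: g.score).name
    return guesses[0].name


# Illustrative keyword lists; a production system would use a trained model.
TOPIC_KEYWORDS = {
    "transport": {"bus", "tram", "cycling", "parking"},
    "housing": {"housing", "rent", "renovation"},
}


def categorize(sentence: str) -> List[str]:
    """Return the topics whose keywords appear in a transcribed sentence."""
    words = set(re.findall(r"\w+", sentence.lower()))
    return sorted(topic for topic, keywords in TOPIC_KEYWORDS.items()
                  if words & keywords)
```

For example, identify_speaker(Guess("A", 0.7), Guess("B", 0.9)) settles on "B" because the caption guess is more confident, and categorize("More tram lines and cheaper rent") returns both "housing" and "transport".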

Data visualization

By recognizing and classifying topics associated with their respective speakers, these tools allowed the production of meaningful statistics for each debate. All of this information was made available to data journalists. Combining data from several debates enabled new stories to be told, making use of data visualizations such as charts, timelines or maps.
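As a rough illustration of the kind of statistics this makes possible, the sketch below totals speaking time per speaker and topic and computes each topic's share of the overall time across several debates. The Segment structure, the function names and the sample figures are hypothetical; they are not taken from the France Télévisions system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Segment:
    """A stretch of speech attributed to one speaker and one topic (assumed format)."""
    debate: str
    speaker: str
    topic: str
    seconds: float


def speaking_time(segments: Iterable[Segment]) -> Dict[Tuple[str, str], float]:
    """Total seconds spoken for each (speaker, topic) pair."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for seg in segments:
        totals[(seg.speaker, seg.topic)] += seg.seconds
    return dict(totals)


def topic_share(segments: List[Segment]) -> Dict[str, float]:
    """Fraction of all speaking time devoted to each topic."""
    total = sum(seg.seconds for seg in segments)
    if total == 0:
        return {}
    per_topic: Dict[str, float] = defaultdict(float)
    for seg in segments:
        per_topic[seg.topic] += seg.seconds
    return {topic: secs / total for topic, secs in per_topic.items()}


# Made-up sample data covering two debates.
sample = [
    Segment("Lyon", "A", "transport", 30.0),
    Segment("Lyon", "B", "housing", 90.0),
    Segment("Lille", "A", "transport", 60.0),
    Segment("Lille", "C", "housing", 60.0),
]
```

Calling topic_share(sample) yields a dictionary that a charting tool could plot directly, while speaking_time(sample) gives the per-speaker breakdown behind it.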

This system also benefited from earlier research by the Information Department, the Innovation Department and the MediaLab on indexing content and making it rapidly available to editorial staff.

The analysis tools are now in production and available for other applications. Beyond political debates, the same tools help with describing content and extracting insights to enrich metadata. They can serve concrete use cases such as indexing, recommendation or marketing. They were used, for example, to analyse the 16,000 sentences written by France tv employees asked to describe their work experience during the COVID-19 lockdown.

01 Apr 2021
