BBC Radio news scripts 1937–1995: Using an automated tagger to enable journeys across time and space

Presenter(s): Jake Berger, Andy Armstrong (BBC)

The BBC has centralized and OCR’d 165,000 radio news scripts dating from 1937 to 1995, providing a unique window onto the past in great depth and detail. A small team from the Archive Content & Partnerships department used BBC Starfruit to automatically identify entities within the scripts, such as people, places and things. BBC Starfruit is a system built in-house and trained on past manual tagging choices made by BBC journalists. These tags will enable the public to find and navigate subjects of interest, and to geo-locate the millions of stories within the news scripts, allowing exploration across time and space. This presentation covers the project, the data and the methodology used, and demonstrates the user interface ahead of its release to the public later in 2022.