Autotagging of video based on automatic transcripts

Presenter(s): Are Tverberg (TV 2 Norway), Christoffer Krona (iMatrics)

TV 2 Norway's newsroom has moved rapidly towards more live and breaking news content for both linear and VOD platforms. However, the low quality of metadata for live content in the archive is a major challenge for a commercial public broadcaster like TV 2. To address this, the team started an autotagging pilot based on automatic speech recognition (ASR) data, inspired by the excellent live and dynamic article recommendation prototype built by the EBU Eurovox and EBU PEACH teams. In cooperation with iMatrics from Sweden, TV 2 will soon conclude a pilot on extracting topics and named entities from ASR-transcribed text from news programs. This presentation explains the project as well as the challenges that arise from recognition errors, particularly for less widely spoken languages such as Norwegian, which are at a disadvantage when it comes to commercially available algorithms.