Teuken-7B – LLM for the European broadcast industry

Presenter(s): Christoph Schmidt (Fraunhofer IAIS)

This presentation introduced Teuken-7B, a large language model developed specifically for the European broadcast industry. Focusing on the German-led OpenGPT-X project, Christoph Schmidt highlighted its emphasis on multilingual capability: the model covers the 24 official languages of the European Union, with the aim of reducing the linguistic and cultural biases prevalent in US-centric models.

The model is open source, released under the Apache 2.0 license, and is intended to promote technological sovereignty and data privacy. Fraunhofer and industry stakeholders developed it with a science-driven approach, and it can be freely downloaded, fine-tuned, and used commercially.

Although the model was trained on less data than its counterparts, rigorous data filtering improves the quality of that data, allowing the model to rival major models with a smaller footprint. Applications include search engine optimization, subtitle generation, semantic search, and content personalization.