AI-supported metadata generation for multilingual audio content