Today’s Automatic Speech Recognition (ASR) systems typically employ a large – but fixed – vocabulary for the recognition and transcription of audio data. Although such a vocabulary typically contains hundreds of thousands of words, there will always be words that are not part of it. Domain-specific expressions, proper names, jargon or newly emerging terms – such as COVID during the spring of 2020 – are examples of such “missing” words.
Words that are not part of the vocabulary cannot be recognized by an ASR system. So what can we do when these words become crucial to our task, e.g. when we want to detect mentions of COVID in a series of newscasts?
The SAIL LABS Language Model Toolkit (LMT) provides a framework for the extension and adaptation of ASR-models to particular domains. It allows end-users and partners to flexibly include new terminology and thus to extend and adapt vocabularies to particular needs and tasks. If the task is to monitor mentions of contagious diseases, we may want to add words like COVID to the vocabulary. If we aim to detect mentions of CEOs of companies, we may want to add a name such as Elon Musk. In either case, the LMT allows users to tailor an ASR model to a particular domain and thus to increase the accuracy and value of transcribed text.
The LMT can be operated interactively or embedded in a programming environment. Furthermore, it is possible to link the LMT with further components to create a framework that gathers data from relevant sources, selects important new vocabulary and builds ASR models from this data.
All of these steps can be carried out automatically and together form the core of CAVA – the Continuous Automatic Vocabulary Adaptation – SAIL’s environment for keeping up with the news as it evolves and changes over time.
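The vocabulary-selection step described above can be illustrated with a minimal sketch. It is not the LMT or CAVA implementation, whose interfaces are not shown here; the function names `tokenize` and `select_new_vocabulary`, the frequency threshold and the sample data are all assumptions made for illustration. The idea is simply to count words in freshly gathered text that the current vocabulary does not contain and to propose the most frequent ones as candidates.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words (letters, with inner - or ')."""
    return re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())


def select_new_vocabulary(documents, vocabulary, min_count=2, max_words=10):
    """Return (word, count) pairs for frequent out-of-vocabulary words.

    Hypothetical helper: a real system would likely also filter typos,
    normalize spellings and generate pronunciations for each new word.
    """
    known = {word.lower() for word in vocabulary}
    counts = Counter(
        token
        for doc in documents
        for token in tokenize(doc)
        if token not in known
    )
    # most_common() sorts by descending count, so filtering afterwards is safe.
    return [(w, c) for w, c in counts.most_common(max_words) if c >= min_count]


# Toy data (assumed): a tiny "fixed" vocabulary and a few gathered headlines.
vocabulary = {"the", "virus", "spreads", "cases", "of", "are",
              "rising", "new", "reported", "in", "city"}
documents = [
    "COVID cases are rising in the city",
    "New cases of COVID reported",
    "The virus spreads",
]

candidates = select_new_vocabulary(documents, vocabulary)
print(candidates)
```

With this toy data, the only word that is missing from the vocabulary and occurs at least twice is "covid", so it is the single candidate proposed for inclusion.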
In this webinar, we will elaborate on the above topics and demonstrate the use of the LMT for more accurate transcription. We will outline the use of CAVA and talk about some of the mechanisms behind these technologies.
The webinar takes place on the 21st of October at 15:00 (GMT+2). You can save your spot by clicking on the button below.
Head of Research
Gerhard Backfried is one of the founders and currently holds the position of Head of Research at SAIL LABS Technology GmbH. Prior to joining SAIL LABS, Gerhard worked in the fields of expert systems for the finance sector and personal dictation systems (IBM’s ViaVoice). His technical expertise includes acoustic and language modelling as well as speech recognition algorithms. More recently he has been focussing on the combination of traditional and social media, particularly in the context of multilingual and multimedia disaster-communication. He holds a master’s degree in computer science (M.Sc.) from the Technical University of Vienna with a specialty in Artificial Intelligence and Linguistics and a Ph.D. degree in Computer Science from the University of Vienna. He holds a number of patents, has authored several papers and book chapters, regularly participates in conference program committees and has been contributing to national and international research projects, such as KIRAS/QuOIMA, FP7/M-ECO, FP7/SIIP, H2020/ELG or H2020/MIRROR.
Erinc Dikici is a Researcher in the core speech recognition team of SAIL LABS Technology GmbH. His interests include acoustic and language modelling for speech recognition. He obtained his Ph.D. in the Department of Electrical and Electronics Engineering at Boğaziçi University, Turkey in 2016. His dissertation was on the subject of supervised, semi-supervised and unsupervised methods for discriminative language modeling for Turkish. Erinc has been participating regularly in national and international research projects and is currently working on Neural Network-based ASR. Within the FP7-project SIIP, Erinc plays an active role in the development of ASR, keyword-spotting and the integration of technologies into the common project platform.
Upcoming Webinars (free-of-charge)
- 16.09. CoffAI talks – Mark Pfeiffer, Gerhard Backfried & Katja Prinz
- 07.10. Language Technology Industry – Christoph Prinz & Philippe Wacker
- 21.10. Language Model Toolkit: How to catch up when your vocabulary is running away – Gerhard Backfried & Erinc Dikici
- 28.10. US Elections – Data Statistics: Risks & Hinge Factors – Dominika Betakova, Mark Pfeiffer & Karl-Heinz Land