The SAIL LABS Media Mining Indexer and the CAVA Framework

Submitted to Interspeech 2019, September 15-19, 2019, Graz, Austria

In today’s attention-driven news economy, rapid changes of topics and events go hand in hand with rapid changes of vocabulary and of language use. ASR systems that transcribe content from this fluid media landscape need to be kept up to date continuously and dynamically. Static models, potentially created a long time ago, become outdated within a short period of time. For optimal transcription performance, the frequent changes in vocabulary and wording need to be reflected in the models employed. In this demonstration paper, we present the audio processing capabilities of the SAIL LABS Media Mining Indexer and the CAVA Framework, which allows semi-automatic, periodic updates of the ASR vocabulary and language model from relevant new data.

This article was presented at the Interspeech conference in 2019.