#38: MLDublin meets Voysis
Voysis, can you hear us? We were sponsored by Voysis and took the opportunity to focus on speech, with talks from Peter Cahill, Voysis; Piyush Arora, ADAPT DCU; and John Kane, Cogito.
Wake words are what enable voice applications to become truly 'hands free', where users can interact with a system without needing to press a single button or tap a screen. "Hey Siri" and "OK Google" are well-known examples of wake words. This talk will present a deep dive into approaches to wake word detection and present results obtained on a variety of neural network architectures and test environments.
We have witnessed tremendous growth of speech and language based interactions with machines in the last decade, with the advent of better tools and applications such as Siri, Cortana, Google Now and Alexa. Most of these tools support monolingual interactions. How to provide information access and support interactions across different languages, however, remains a complex challenge. In this talk I will describe our recent work on the task of Open Cross Lingual Information Retrieval (OpenCLIR). The goal of this task is to develop effective methods to locate text and speech documents in low-resource languages, using English queries. This task depends on a complex combination of different aspects of Information Retrieval (IR), Machine Translation (MT), Speech Processing (SP) and Natural Language Processing (NLP). I will describe our technology pipeline to address the OpenCLIR task and present the alternatives that we explored for each of the underlying components (MT, SP and IR), along with their performance. Furthermore, I will pinpoint the challenges and open issues related to handling Cross Lingual Audio and Text Retrieval with low-resource languages.
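To make the shape of such a pipeline concrete, here is a minimal sketch of one common CLIR strategy: translate the English query into the document language, then rank documents by term overlap. This is not the speakers' system. The lexicon, the document collection, and the function names `translate_query`, `score` and `retrieve` are all hypothetical, and Swahili is used purely for illustration. In a real pipeline the dictionary lookup would be replaced by an MT component, and speech documents would first pass through speech recognition to produce transcripts.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a machine translation component.
LEXICON = {"water": ["maji"], "school": ["shule"], "food": ["chakula"]}

# Hypothetical documents; speech documents are assumed to be transcripts already.
DOCS = {
    "doc1": "shule mpya shule",
    "doc2": "maji safi",
    "doc3": "chakula na maji",
}


def translate_query(query, lexicon):
    """Map each English query word to target-language terms; unknown words are dropped."""
    terms = []
    for word in query.lower().split():
        terms.extend(lexicon.get(word, []))
    return terms


def score(doc_text, query_terms):
    """Score a document by how often the translated query terms occur in it."""
    counts = Counter(doc_text.split())
    return sum(counts[term] for term in query_terms)


def retrieve(query, docs, lexicon):
    """Return ids of documents with a non-zero score, best match first."""
    terms = translate_query(query, lexicon)
    scored = [(score(text, terms), doc_id) for doc_id, text in docs.items()]
    matches = [(s, doc_id) for s, doc_id in scored if s > 0]
    # Sort by descending score, then by id so ties are deterministic.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in matches]


if __name__ == "__main__":
    print(retrieve("water food", DOCS, LEXICON))
```

Raw term counts are used only to keep the sketch short; a practical system would more likely use a weighted ranking function and handle words missing from the translation resource.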
Machine learning can unintentionally encode and amplify negative bias and stereotypes present in humans, be they conscious or unconscious. This has led to high-profile cases where machine learning systems have been found to exhibit bias with respect to gender, race, and ethnicity, among other demographic categories. Negative bias can enter these algorithms through the representation of different population categories in the model training data, through manual human labeling of these data, and through the modeling types and optimization approaches used. In this talk I will discuss the problem of negative bias in machine learning generally, and specifically the case of gender bias in the applied area of emotion recognition from speech. I will demonstrate that lower recall for emotional activation in female speech samples can be attenuated by applying an adversarial de-biasing training technique.
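The recall gap mentioned above can be measured by computing recall separately for each demographic group. The following is a minimal sketch of that measurement only; it does not implement adversarial de-biasing. The function name `recall_by_group` is hypothetical, and the labels and predictions are invented toy values, not results from the talk.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups, positive=1):
    """Compute recall of the positive class separately for each group."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth != positive:
            continue  # recall only considers truly positive samples
        if pred == positive:
            true_pos[group] += 1
        else:
            false_neg[group] += 1
    seen = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in seen}


# Invented toy data: 1 = high emotional activation, 0 = low.
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
groups = ["male"] * 5 + ["female"] * 5

recalls = recall_by_group(y_true, y_pred, groups)
gap = recalls["male"] - recalls["female"]

if __name__ == "__main__":
    print(recalls, gap)
```

A de-biasing method would aim to shrink `gap` while keeping overall recall acceptable; tracking both numbers guards against closing the gap simply by degrading the better-served group.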