Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

2020 · Open Access · DOI: https://doi.org/10.21437/interspeech.2020-3238

False triggers in voice assistants are unintended invocations of the assistant, which not only degrade the user experience but may also compromise privacy. False trigger mitigation (FTM) is the process of detecting false trigger events and responding appropriately to the user. In this paper, we propose a novel solution to the FTM problem by introducing a parallel ASR decoding process with a special language model trained from "out-of-domain" data sources. Such a language model is complementary to the existing language model optimized for the assistant task. A bidirectional lattice RNN (Bi-LRNN) classifier trained on the lattices generated by the complementary language model shows a $38.34\%$ relative reduction in the false trigger (FT) rate at a fixed rate of $0.4\%$ false suppression (FS) of correct invocations, compared to the current Bi-LRNN model. In addition, we propose training a parallel Bi-LRNN model on the decoding lattices from both language models, and we examine various ways of implementing it. The resulting model reduces the false trigger rate by a further $10.8\%$.
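
The abstract reports its results as a relative reduction in FT rate at a fixed FS operating point. The following is a minimal sketch of how such a figure can be computed from classifier scores. It is not the authors' evaluation code: the function names, the score convention (higher means "more likely a true invocation"), and the toy scores are all assumptions made for illustration.

```python
import math


def ft_rate_at_fixed_fs(true_scores, false_scores, fs_target=0.004):
    """Return (threshold, fs_rate, ft_rate) at a fixed false-suppression budget.

    Assumed convention (hypothetical): an utterance is suppressed when its
    score is strictly below the threshold. FS counts suppressed true
    invocations; FT counts false triggers that are NOT suppressed.
    """
    if not true_scores or not false_scores:
        raise ValueError("both score lists must be non-empty")

    ranked = sorted(true_scores)
    # Largest number of true invocations we are allowed to suppress.
    k = min(math.floor(fs_target * len(ranked)), len(ranked) - 1)
    # Scores strictly below the k-th smallest true score are suppressed,
    # so at most k true invocations fall under the threshold.
    threshold = ranked[k]

    fs_rate = sum(s < threshold for s in true_scores) / len(true_scores)
    ft_rate = sum(s >= threshold for s in false_scores) / len(false_scores)
    return threshold, fs_rate, ft_rate


def relative_reduction(baseline_rate, new_rate):
    """Relative reduction of new_rate with respect to baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# Toy scores (invented for this sketch, not data from the paper).
toy_true = [0.9, 0.8, 0.7, 0.2]
toy_false = [0.1, 0.3, 0.6, 0.75]
threshold, fs_rate, ft_rate = ft_rate_at_fixed_fs(toy_true, toy_false, fs_target=0.25)
print(threshold, fs_rate, ft_rate)
```

To compare two models, both would be evaluated with the same `fs_target`, and `relative_reduction` would then be applied to the two resulting FT rates.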

Concepts

Computer science, Decoding methods, Language model, Classifier (UML), Process (computing), False positive rate, Artificial intelligence, Natural language processing, Speech recognition, Algorithm, Programming language

Metadata

Type: preprint
Language: en
Landing Page: https://doi.org/10.21437/interspeech.2020-3238
OA Status: green
References: 12
Related Works: 20
OpenAlex ID: https://openalex.org/W3070633328
