Explanipedia

Streaming on-device detection of device directed speech from voice and touch-based invocation

Ognjen Rudovic, Akanksha Bindal, Vineet Garg, Pramod Simha, Pranay Dighe, et al. · 2021

When interacting with smart devices such as mobile phones or wearables, the user typically invokes a virtual assistant (VA) by saying a keyword or by pressing a button on the device. However, in many cases, the VA can accidentally be invok…

Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation

Vineet Garg, Wonil Chang, Siddharth Sigtia, Saurabh Adya, Pramod Simha, et al. · 2021

We present a unified and hardware efficient architecture for two stage voice trigger detection (VTD) and false trigger mitigation (FTM) tasks. Two stage VTD systems of voice assistants can get falsely activated to audio segments acoustical…

Hybrid Transformer/CTC Networks for Hardware Efficient Voice Triggering

Saurabh Adya, Vineet Garg, Siddharth Sigtia, Pramod Simha, Chandra Dhir · 2020

We consider the design of two-pass voice trigger detection systems. We focus on the networks in the second pass that are used to re-score candidate segments obtained from the first pass. Our baseline is an acoustic model (AM), with BiLSTM l…

Pramod Simha