Mittul Singh
Effect of Speech Modification on Wav2Vec2 Models for Children Speech Recognition Open
Publisher Copyright: © 2024 IEEE.
Automatic Rating of Spontaneous Speech for Low-Resource Languages Open
Funding Information: This work is part of the Digitala project, which is funded by the Academy of Finland (grant numbers 322619, 322625, 322965). The computational resources were provided by Aalto ScienceIT. Publisher Copyright: © 2023 IEEE.
Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children Open
Computer-assisted Language Learning (CALL) is a rapidly developing area accelerated by advancements in the field of AI. A well-designed and reliable CALL system allows students to practice language skills, like pronunciation, any time outs…
End-to-end Ensemble-based Feature Selection for Paralinguistics Tasks Open
The events of recent years have highlighted the importance of telemedicine solutions which could potentially allow remote treatment and diagnosis. Relatedly, Computational Paralinguistics, a unique subfield of Speech Processing, aims to ex…
FinChat: Corpus and Evaluation Setup for Finnish Chat Conversations on Everyday Topics Open
Creating open-domain chatbots requires large amounts of conversational data and related benchmark tasks to evaluate them. Standardized evaluation tasks are crucial for creating automatic evaluation metrics for model development; otherwise,…
Data Augmentation Using Prosody and False Starts to Recognize Non-Native Children’s Speech Open
This paper describes AaltoASR's speech recognition system for the INTERSPEECH 2020 shared task on Automatic Speech Recognition (ASR) for non-native children's speech. The task is to recognize non-native speech from children of various age …
Aalto's End-to-End DNN systems for the INTERSPEECH 2020 Computational Paralinguistics Challenge Open
End-to-end neural network models (E2E) have shown significant performance benefits on different INTERSPEECH ComParE tasks. Prior work has applied either a single instance of an E2E model for a task or the same E2E architecture for differen…
Effects of Language Relatedness for Cross-lingual Transfer Learning in Character-Based Language Models Open
Character-based Neural Network Language Models (NNLM) have the advantage of smaller vocabulary and thus faster training times in comparison to NNLMs based on multi-character units. However, in low-resource scenarios, both the character and…
Service registration chatbot: collecting and comparing dialogues from AMT workers and service’s users Open
Crowdsourcing is the go-to solution for data collection and annotation in the context of NLP tasks. Nevertheless, crowdsourced data is noisy by nature; the source is often unknown and additional validation work is performed to guarantee th…
Subword RNNLM Approximations for Out-Of-Vocabulary Keyword Search Open
In spoken Keyword Search, the query may contain out-of-vocabulary (OOV) words not observed when training the speech recognition system. Using subword language models (LMs) in the first-pass recognition makes it possible to recognize the OO…
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling Open
In this paper, we address the problem of effectively self-training neural networks in a low-resource setting. Self-training is frequently used to automatically increase the amount of training data. However, in a low-resource scenario, it i…
Building Blocks of Assistant Based Speech Recognition for Air Traffic Management Applications Open
In air traffic control rooms around the world, paper flight strips are being replaced by different digital solutions. This enables other systems to access the instructed air traffic controller (ATCo) commands and use them for other purposes.…
Semi-supervised Adaptation of Assistant Based Speech Recognition Models for different Approach Areas Open
Air Navigation Service Providers (ANSPs) replace paper flight strips with different digital solutions. The instructed commands from air traffic controllers (ATCOs) are then available in computer-readable form. However, those systems re…
Iterative Learning of Speech Recognition Models for Air Traffic Control Open
Automatic Speech Recognition (ASR) has recently proved to be a useful tool to reduce the workload of air traffic controllers leading to significant gains in operational efficiency. Air Traffic Control (ATC) systems in operation rooms aroun…
Handling long-term dependencies and rare words in low-resource language modelling Open
For low resource NLP tasks like Keyword Search and domain adaptation with small amounts of in-domain data, having well-trained language models is essential. Two major challenges faced while building these language models for such tasks are…
Sequential Recurrent Neural Networks for Language Modeling Open
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some c…
Long-Short Range Context Neural Networks for Language Modeling Open
The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora. This task typically involves the learning of short range dependencies, which generally model the s…