Natarajan Balaji Shankar
CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR
Automatic Speech Recognition (ASR) systems struggle with child speech due to its distinct acoustic and linguistic variability and limited availability of child speech datasets, leading to high transcription error rates. While ASR error cor…
Addressing Bias in Spoken Language Systems Used in the Development and Implementation of Automated Child Language‐Based Assessment
This article addresses bias in Spoken Language Systems (SLS) that involve both Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) and reports experiments to improve the performance of SLS for automated language and li…
Selective Attention Merging for low resource tasks: A case study of Child ASR
While Speech Foundation Models (SFMs) excel in various speech tasks, their performance for low-resource tasks such as child Automatic Speech Recognition (ASR) is hampered by limited pretraining data. To address this, we explore different m…
The JIBO Kids Corpus: A speech dataset of child-robot interactions in a classroom environment
This paper describes an original dataset of children's speech, collected through the use of JIBO, a social robot. The dataset encompasses recordings from 110 children, aged 4–7 years old, who participated in a letter and digit identificati…
Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models
Speech foundation models (SFMs) have achieved state-of-the-art results for various speech tasks in supervised (e.g. Whisper) or self-supervised systems (e.g. WavLM). However, the performance of SFMs for child ASR has not been systematicall…
SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR
Recently, speech foundation models have gained popularity due to their superiority in finetuning downstream ASR tasks. However, models finetuned on certain domains, such as LibriSpeech (adult read speech), behave poorly on other domains (c…
UniEnc-CASSNAT: An Encoder-Only Non-Autoregressive ASR for Speech SSL Models
Non-autoregressive automatic speech recognition (NASR) models have gained attention due to their parallelism and fast inference. The encoder-based NASR, e.g. connectionist temporal classification (CTC), can be initialized from the speec…
CoProver: A Recommender System for Proof Construction
Interactive Theorem Provers (ITPs) are an indispensable tool in the arsenal of formal method experts as a platform for construction and (formal) verification of proofs. The complexity of the proofs in conjunction with the level of expertis…