Louis Goldstein
Towards disentangling the contributions of articulation and acoustics in multimodal phoneme recognition
Although many previous studies have carried out multimodal learning with real-time MRI data that captures the audio-visual kinematics of the vocal tract during speech, these studies have been limited by their reliance on multi-speaker corp…
Articulatory Feature Prediction from Surface EMG during Speech Production
We present a model for predicting articulatory features from surface electromyography (EMG) signals during speech production. The proposed model integrates convolutional layers and a Transformer block, followed by separate predictors for a…
The stability of articulatory and acoustic oscillatory signals derived from speech
Articulatory underpinnings of periodicities in the speech signal are unclear beyond a general alternation of vocal tract opening and closing. This study evaluates a modulatory articulatory signal that captures instantaneous change in vocal…
Workshop 13 May 2024: SPEECH PRODUCTION MODELS AND EMPIRICAL EVIDENCE FROM TYPICAL AND PATHOLOGICAL SPEECH
This unpublished document can be cited as: Fougeron C., Goldstein L., Guenther F., Lœvenbruck H., Mefferd A., Mücke D., Niziolek C., Parrel B., Perrier P., Ziegler W., Laganaro M. (unpublished manuscript) Transcription of the workshop Spee…
Direct articulatory observation reveals phoneme recognition performance characteristics of a self-supervised speech model
Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme r…
Vertical larynx actions and intergestural timing stability in Hausa ejectives and implosives
The current project undertakes a kinematic examination of vertical larynx actions and intergestural timing stability within multi-gesture complex segments such as ejectives and implosives that may possess specific temporal goals critical t…
Deep Speech Synthesis from MRI-Based Articulatory Representations
In this paper, we study articulatory synthesis, a speech synthesis method using human vocal tract information that offers a way to develop efficient, generalizable and interpretable synthesizers. While recent advances have enabled intellig…
Speaker-Independent Acoustic-to-Articulatory Speech Inversion
To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space. The articulatory space is a promising invers…
Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization
Articulatory representation learning is fundamental to modeling the neural speech production system. Our previous work established a deep paradigm for decomposing articulatory kinematics data into gestures, which explicitly m…
Deep Speech Synthesis from Articulatory Representations
In the articulatory synthesis task, speech is synthesized from input features containing information about the physical behavior of the human vocal tract. This task provides a promising direction for speech synthesis research, as the artic…
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition
Most research on data-driven speech representation learning has focused on raw audio in an end-to-end manner, paying little attention to its internal phonological or gestural structure. This work, investigating the speech represe…
Variation in compensatory strategies as a function of target constriction degree in post-glossectomy speech
Individuals who have undergone treatment for oral cancer oftentimes exhibit compensatory behavior in consonant production. This pilot study investigates whether compensatory mechanisms utilized in the production of speech sounds with a giv…
Complexity of vocal tract shaping in glossectomy patients and typical speakers: A principal component analysis
The glossectomy procedure, involving surgical resection of cancerous lingual tissue, has long been observed to affect speech production. This study aims to quantitatively index and compare complexity of vocal tract shaping due to lingual m…
Who converges? Variation reveals individual speaker adaptability
Little is known about the cognitive capacities underlying real-time accommodation in spoken language and how they may allow conversing speakers to adapt their speech production behaviors. This study first presents a simple attunement model…
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is howev…
Variability in individual constriction contributions to third formant values in American English /ɹ/
Although substantial variability is observed in the articulatory implementation of the constriction gestures involved in /ɹ/ production, studies of articulatory-acoustic relations in /ɹ/ have largely ignored the potential for subtle variat…
How an aglossic speaker produces an alveolar-like percept without a functional tongue tip
It has been previously observed [McMicken, Salles, Berg, Vento-Wilson, Rogers, Toutios, and Narayanan. (2017). J. Commun. Disorders, Deaf Stud. Hear. Aids 5(2), 1–6] using real-time magnetic resonance imaging that a speaker with severe con…
Derivation of Fitts' law from the Task Dynamics model of speech production
Fitts' law is a linear equation relating movement time to an index of movement difficulty. The recent finding that Fitts' law applies to voluntary movement of the vocal tract raises the question of whether the theory of speech production i…
The Role of Temporal Modulation in Sensorimotor Interaction
How do we align the distinct neural patterns associated with the articulation and the acoustics of the same utterance in order to guide behaviors that demand sensorimotor interaction, such as vocal learning and the use of feedback during s…
Noggin Nodding: Head Movement Correlates With Increased Effort in Accelerating Speech Production Tasks
Movements of the head and speech articulators have been observed in tandem during an alternating word pair production task driven by an accelerating rate metronome. Word pairs contrasted either onset or coda dissimilarity with same word co…
I Scream for Ice Cream: Resolving Lexical Ambiguity with Sub-phonemic Information
This study uses a response mouse-tracking paradigm to examine the role of sub-phonemic information in online lexical ambiguity resolution of continuous speech. We examine listeners’ sensitivity to the sub-phonemic information that is speci…
Task-dependence of articulator synergies
In speech production, the motor system organizes articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study te…
Simultaneous electromagnetic articulography and electroglottography data acquisition of natural speech
This paper reports on the concurrent use of electroglottography (EGG) and electromagnetic articulography (EMA) in the acquisition of EMA trajectory data for running speech. Static and dynamic intersensor distances, standard deviations, and…
Quantitative analysis of multimodal speech data
This study presents techniques for quantitatively analyzing coordination and kinematics in multimodal speech using video, audio and electromagnetic articulography (EMA) data. Multimodal speech research has flourished due to recent improvem…