Louis Goldstein
Towards disentangling the contributions of articulation and acoustics in multimodal phoneme recognition
Although many previous studies have carried out multimodal learning with real-time MRI data that captures the audio-visual kinematics of the vocal tract during speech, these studies have been limited by their reliance on multi-speaker corp…
Articulatory Feature Prediction from Surface EMG during Speech Production
We present a model for predicting articulatory features from surface electromyography (EMG) signals during speech production. The proposed model integrates convolutional layers and a Transformer block, followed by separate predictors for a…
The stability of articulatory and acoustic oscillatory signals derived from speech
Articulatory underpinnings of periodicities in the speech signal are unclear beyond a general alternation of vocal tract opening and closing. This study evaluates a modulatory articulatory signal that captures instantaneous change in vocal…
Workshop 13 May 2024: SPEECH PRODUCTION MODELS AND EMPIRICAL EVIDENCE FROM TYPICAL AND PATHOLOGICAL SPEECH
This unpublished document can be cited as: Fougeron C., Goldstein L., Guenther F., Lœvenbruck H., Mefferd A., Mücke D., Niziolek C., Parrel B., Perrier P., Ziegler W., Laganaro M. (unpublished manuscript) Transcription of the workshop Spee…
Direct articulatory observation reveals phoneme recognition performance characteristics of a self-supervised speech model
Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme r…
Vertical larynx actions and intergestural timing stability in Hausa ejectives and implosives
The current project undertakes a kinematic examination of vertical larynx actions and intergestural timing stability within multi-gesture complex segments such as ejectives and implosives that may possess specific temporal goals critical t…
Deep Speech Synthesis from MRI-Based Articulatory Representations
In this paper, we study articulatory synthesis, a speech synthesis method using human vocal tract information that offers a way to develop efficient, generalizable and interpretable synthesizers. While recent advances have enabled intellig…
Speaker-Independent Acoustic-to-Articulatory Speech Inversion
To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space. The articulatory space is a promising invers…
Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization
Articulatory representation learning is fundamental to modeling the neural speech production system. Our previous work established a deep paradigm for decomposing articulatory kinematics data into gestures, which explicitly m…
Deep Speech Synthesis from Articulatory Representations
In the articulatory synthesis task, speech is synthesized from input features containing information about the physical behavior of the human vocal tract. This task provides a promising direction for speech synthesis research, as the artic…
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition
Most research on data-driven speech representation learning has focused on raw audio in an end-to-end manner, paying little attention to its internal phonological or gestural structure. This work, investigating the speech represe…
Variation in compensatory strategies as a function of target constriction degree in post-glossectomy speech
Individuals who have undergone treatment for oral cancer oftentimes exhibit compensatory behavior in consonant production. This pilot study investigates whether compensatory mechanisms utilized in the production of speech sounds with a giv…
Complexity of vocal tract shaping in glossectomy patients and typical speakers: A principal component analysis
The glossectomy procedure, involving surgical resection of cancerous lingual tissue, has long been observed to affect speech production. This study aims to quantitatively index and compare complexity of vocal tract shaping due to lingual m…
Who converges? Variation reveals individual speaker adaptability
Little is known about the cognitive capacities underlying real-time accommodation in spoken language and how they may allow conversing speakers to adapt their speech production behaviors. This study first presents a simple attunement model…
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is howev…
Variability in individual constriction contributions to third formant values in American English /ɹ/
Although substantial variability is observed in the articulatory implementation of the constriction gestures involved in /ɹ/ production, studies of articulatory-acoustic relations in /ɹ/ have largely ignored the potential for subtle variat…
How an aglossic speaker produces an alveolar-like percept without a functional tongue tip
It has been previously observed [McMicken, Salles, Berg, Vento-Wilson, Rogers, Toutios, and Narayanan. (2017). J. Commun. Disorders, Deaf Stud. Hear. Aids 5(2), 1–6] using real-time magnetic resonance imaging that a speaker with severe con…
Derivation of Fitts' law from the Task Dynamics model of speech production
Fitts' law is a linear equation relating movement time to an index of movement difficulty. The recent finding that Fitts' law applies to voluntary movement of the vocal tract raises the question of whether the theory of speech production i…
The Role of Temporal Modulation in Sensorimotor Interaction
How do we align the distinct neural patterns associated with the articulation and the acoustics of the same utterance in order to guide behaviors that demand sensorimotor interaction, such as vocal learning and the use of feedback during s…
Noggin Nodding: Head Movement Correlates With Increased Effort in Accelerating Speech Production Tasks
Movements of the head and speech articulators have been observed in tandem during an alternating word pair production task driven by an accelerating rate metronome. Word pairs contrasted either onset or coda dissimilarity with same word co…
I Scream for Ice Cream: Resolving Lexical Ambiguity with Sub-phonemic Information
This study uses a response mouse-tracking paradigm to examine the role of sub-phonemic information in online lexical ambiguity resolution of continuous speech. We examine listeners’ sensitivity to the sub-phonemic information that is speci…
Task-dependence of articulator synergies
In speech production, the motor system organizes articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study te…
Simultaneous electromagnetic articulography and electroglottography data acquisition of natural speech
This paper reports on the concurrent use of electroglottography (EGG) and electromagnetic articulography (EMA) in the acquisition of EMA trajectory data for running speech. Static and dynamic intersensor distances, standard deviations, and…
Quantitative analysis of multimodal speech data
This study presents techniques for quantitatively analyzing coordination and kinematics in multimodal speech using video, audio and electromagnetic articulography (EMA) data. Multimodal speech research has flourished due to recent improvem…