William M. Hartmann
Cross-Lingual Conversational Speech Summarization with Large Language Models
Cross-lingual conversational speech summarization is an important problem, but suffers from a dearth of resources. While transcriptions exist for a number of languages, translated conversational speech is rare and datasets containing summa…
Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech
This paper introduces a set of English translations for a 123-hour subset of the CallHome Mandarin Chinese data and the HKUST Mandarin Telephone Speech data for the task of speech translation. Paired source-language speech and target-langu…
Using i-vectors for subject-independent cross-session EEG transfer learning
Cognitive load classification is the task of automatically determining an individual's utilization of working memory resources during performance of a task based on physiologic measures such as electroencephalography (EEG). In this paper, …
The masking-level difference in low-noise noise
In experiment 1, NoSo and NoSπ thresholds for a 500-Hz pure tone were obtained in a low-fluctuation masking noise and a high-fluctuation masking noise for six normal-hearing listeners. The noise bandwidth was 10 Hz. In agreement with prev…
Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition
Modeling code-switched speech is an important problem in automatic speech recognition (ASR). Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech. These monolingual data may be more closely …
Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Sequence-to-sequence (seq2seq) models are competitive with hybrid models for automatic speech recognition (ASR) tasks when large amounts of training data are available. However, data sparsity and domain adaptation are more problematic for …
Learning from Noisy Labels with Noise Modeling Network
Multi-label image classification has generated significant interest in recent years and the performance of such systems often suffers from the not so infrequent occurrence of incorrect or missing labels in the training data. In this paper,…
Cross-lingual Information Retrieval with BERT
Multiple neural language models have been developed recently, e.g., BERT and XLNet, and achieved impressive results in various NLP tasks including sentence classification, question answering and document ranking. In this paper, we explore …
Towards a New Understanding of the Training of Neural Networks with Mislabeled Training Data
We investigate the problem of machine learning with mislabeled training data. We try to make the effects of mislabeled training better understood through analysis of the basic model and equations that characterize the problem. This include…
Neural-Network Lexical Translation for Cross-lingual IR from Text and Speech
We propose a neural network model to estimate word translation probabilities for Cross-Lingual Information Retrieval (CLIR). The model estimates better probabilities for word translations than automatic word alignments alone, and generaliz…
Noise edge pitch and models of pitch perception
Monaural noise edge pitch (NEP) is evoked by a broadband noise with a sharp falling edge in the power spectrum. The pitch is heard near the spectral edge frequency but shifted slightly into the frequency region of the noise. Thus, the pitc…
On the localization of high-frequency, sinusoidally amplitude-modulated tones in free field
Previous headphone experiments have shown that listeners can lateralize high-frequency sine-wave amplitude-modulated (SAM) tones based on interaural time differences in the envelope. However, when SAM tones are presented to listeners in fr…
Transaural experiments and a revised duplex theory for the localization of low-frequency tones
The roles of interaural time difference (ITD) and interaural level difference (ILD) were studied in free-field source localization experiments for sine tones of low frequency (250–750 Hz). Experiments combined real-source trials with virtu…
Using multidimensional scaling techniques to quantify binaural squelch
Binaural squelch is a perceptual phenomenon whereby the subjective strength of reverberant sound is attenuated under binaural listening conditions relative to monaural or diotic listening conditions. Although the effect is well known, only…