Explanipedia

I-DCCRN-VAE: An Improved Deep Representation Learning Framework for Complex VAE-based Single-channel Speech Enhancement

Jiatong Li, Simon Doclo · 2025

Recently, a complex variational autoencoder (VAE)-based single-channel speech enhancement system based on the DCCRN architecture has been proposed. In this system, a noise suppression VAE (NSVAE) learns to extract clean speech representati…

Multi-Source Position and Direction-of-Arrival Estimation Based on Euclidean Distance Matrices

Simon Doclo · 2025

A popular method to estimate the positions or directions-of-arrival (DOAs) of multiple sound sources using an array of microphones is based on steered-response power (SRP) beamforming. For a three-dimensional scenario, SRP-based methods ne…

Subjective quality evaluation of personalized own voice reconstruction systems

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo, Jan Rennies · 2025

Own voice pickup technology for hearable devices facilitates communication in noisy environments. Own voice reconstruction (OVR) systems enhance the quality and intelligibility of the recorded noisy own voice signals. Since disturbances af…

Evaluation of Speaker-Conditioned Target Speaker Extraction Algorithms for Hearing-Impaired Listeners

Ragini Sinha, A. Scherer, Simon Doclo, Christian Rollwage, Jan Rennies · 2025

Speaker-conditioned target speaker extraction algorithms aim at extracting the target speaker from a mixture of multiple speakers by using additional information about the target speaker. Previous studies have evaluated the performance of …

Binaural Localization Model for Speech in Noise

Vikas Tokala, Eric Grinstein, R. L. Brooks, Mike Brookes, Simon Doclo, et al. · 2025

Binaural acoustic source localization is important to human listeners for spatial awareness, communication and safety. In this paper, an end-to-end binaural localization model for speech in noise is presented. A lightweight convolutional r…

Soft-Constrained Spatially Selective Active Noise Control for Open-fitting Hearables

Tong Xiao, Reinhild Roden, Matthias Blau, Simon Doclo · 2025

Recent advances in spatially selective active noise control (SSANC) using multiple microphones have enabled hearables to suppress undesired noise while preserving desired speech from a specific direction. Aiming to achieve minimal speech d…

Incremental Averaging Method to Improve Graph-Based Time-Difference-of-Arrival Estimation

Klaus Brümann, Kouei Yamaoka, Nobutaka Ono, Simon Doclo · 2025

Estimating the position of a speech source based on time-differences-of-arrival (TDOAs) is often adversely affected by background noise and reverberation. A popular method to estimate the TDOA between a microphone pair involves maximizing …

Improved Topology-Independent Distributed Adaptive Node-Specific Signal Estimation for Wireless Acoustic Sensor Networks

Paul Didier, Toon van Waterschoot, Simon Doclo, Jörg Bitzer, Marc Moonen · 2025

This paper addresses the challenge of topology-independent (TI) distributed adaptive node-specific signal estimation (DANSE) in wireless acoustic sensor networks (WASNs) where sensor nodes exchange only fused versions of their local signal…

Fast-Converging Distributed Signal Estimation in Topology-Unconstrained Wireless Acoustic Sensor Networks

Paul Didier, Toon van Waterschoot, Simon Doclo, Jörg Bitzer, Marc Moonen · 2025

This paper focuses on distributed signal estimation in topology-unconstrained wireless acoustic sensor networks (WASNs) where sensor nodes only transmit fused versions of their local sensor signals. For this task, the topology-independent …

Spatially Selective Active Noise Control for Open-fitting Hearables with Acausal Optimization

Tong Xiao, Simon Doclo · 2025

Recent advances in active noise control have enabled the development of hearables with spatial selectivity, which actively suppress undesired noise while preserving desired sound from specific directions. In this work, we propose an improv…

Improving multi-talker binaural DOA estimation by combining periodicity and spatial features in convolutional neural networks

Reza Varzandeh, Simon Doclo, Volker Hohmann · 2025

Computer science

Deep neural network-based direction of arrival (DOA) estimation systems often rely on spatial features as input to learn a mapping for estimating the DOA of multiple talkers. Aiming to improve the accuracy of multi-talker DOA estimation fo…

A Steered Response Power Method for Sound Source Localization With Generic Acoustic Models

Kaspar Müller, Markus Buck, Simon Doclo, Jan Østergaard, Tobias Barrington Wolff · 2025

The steered response power (SRP) method is one of the most popular approaches for acoustic source localization with microphone arrays. It is often based on simplifying acoustic assumptions, such as an omnidirectional sound source in the fa…

Variants of LSTM cells for single-channel speaker-conditioned target speaker extraction

Ragini Sinha, Christian Rollwage, Simon Doclo · 2024

Computer science Chemistry

Speaker-conditioned target speaker extraction aims at estimating the target speaker from a mixture of speakers utilizing auxiliary information about the target speaker. In this paper, we consider a single-channel target speaker extraction …

Reference Microphone Selection for the Weighted Prediction Error Algorithm using the Normalized L-p Norm

Anselm Lohmann, Toon van Waterschoot, Jörg Bitzer, Simon Doclo · 2024

Computer science Mathematics Political science

Reverberation may severely degrade the quality of speech signals recorded using microphones in a room. For compact microphone arrays, the choice of the reference microphone for multi-microphone dereverberation typically does not have a lar…

Low-Complexity Own Voice Reconstruction for Hearables with an In-Ear Microphone

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo · 2024

Computer science Physics Medicine

Hearable devices, equipped with one or more microphones, are commonly used for speech communication. Here, we consider the scenario where a hearable is used to capture the user's own voice in a noisy environment. In this scenario, own voic…

Steered Response Power-Based Direction-of-Arrival Estimation Exploiting an Auxiliary Microphone

Klaus Brümann, Simon Doclo · 2024

Computer science Engineering Physics

Accurately estimating the direction-of-arrival (DOA) of a speech source using a compact microphone array (CMA) is often complicated by background noise and reverberation. A commonly used DOA estimation method is the steered response power …

Modeling of Speech-dependent Own Voice Transfer Characteristics for Hearables with In-ear Microphones: Audio Examples

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo · 2024

Computer science Psychology Physics

This upload contains audio examples for the preprint "Modeling of Speech-dependent Own Voice Transfer Characteristics for Hearables with In-ear Microphones". The audio files correspond to subplots of the spectrogram shown in the default pr…

Deep low-latency joint speech transmission and enhancement over a gaussian channel

Mohammad Hadi Bokaei, Jesper Jensen, Simon Doclo, Jan Østergaard · 2024

Computer science Physics Engineering

Ensuring intelligible speech communication for hearing assistive devices in low-latency scenarios presents significant challenges in terms of speech enhancement, coding and transmission. In this paper, we propose novel solutions for low-la…

Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks

Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, et al. · 2024

Computer science Engineering

Studies have shown that in noisy acoustic environments, providing binaural signals to the user of an assistive listening device may improve speech intelligibility and spatial awareness. This paper presents a binaural speech enhancement met…

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers

Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, et al. · 2024

Computer science Psychology Physics

Although mask-based beamforming is a powerful speech enhancement approach, it often requires manual parameter tuning to handle moving speakers. Recently, this approach was augmented with an attention-based spatial covariance matrix aggrega…

Speech-dependent Modeling of Own Voice Transfer Characteristics for In-ear Microphones in Hearables

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo · 2024

Computer science Physics

Exploiting an External Microphone for Binaural RTF-Vector-Based Direction of Arrival Estimation for Multiple Speakers

Daniel Fejgin, Simon Doclo · 2024

Computer science Physics Engineering

In hearing aid applications, an important objective is to accurately estimate the direction of arrival (DOA) of multiple speakers in noisy and reverberant environments. Recently, we proposed a binaural DOA estimation method, where the DOAs …

Microphone Subset Selection for the Weighted Prediction Error Algorithm using a Group Sparsity Penalty

Anselm Lohmann, Toon van Waterschoot, Jörg Bitzer, Simon Doclo · 2024

Computer science Physics

Reverberation can severely degrade the quality of speech signals recorded using microphones in an enclosure. In acoustic sensor networks with spatially distributed microphones, a similar dereverberation performance may be achieved using on…

Effect of target signals and delays on spatially selective active noise control for open-fitting hearables

Tong Xiao, Simon Doclo · 2024

Computer science Physics

Spatially selective active noise control (ANC) hearables are designed to reduce unwanted noise from certain directions while preserving desired sounds from other directions. In previous studies, the target signal has been defined either as…

Comparison of Frequency-Fusion Mechanisms for Binaural Direction-of-Arrival Estimation for Multiple Speakers

Daniel Fejgin, Elior Hadad, Sharon Gannot, Zbyněk Koldovský, Simon Doclo · 2024

Computer science Philosophy

To estimate the direction of arrival (DOA) of multiple speakers with methods that use prototype transfer functions, frequency-dependent spatial spectra (SPS) are usually constructed. To make the DOA estimation robust, SPS from different fr…

Modeling of speech-dependent own voice transfer characteristics for hearables with an in-ear microphone

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo · 2024

Computer science Physics Medicine

Many hearables contain an in-ear microphone, which may be used to capture the own voice of its user. However, due to the hearable occluding the ear canal, the in-ear microphone mostly records body-conducted speech, typically suffering from…

Multi-Microphone Noise Data Augmentation for DNN-based Own Voice Reconstruction for Hearables in Noisy Environments

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo · 2023

Computer science Physics

Hearables with integrated microphones may offer communication benefits in noisy working environments, e.g. by transmitting the recorded own voice of the user. Systems aiming at reconstructing the clean and full-bandwidth own voice from noi…

Head Orientation Estimation with Distributed Microphones Using Speech Radiation Patterns

Kaspar Müller, Bilgesu Çakmak, Paul Didier, Simon Doclo, Jan Østergaard, et al. · 2023

Computer science Mathematics Geology

Determining the head orientation of a talker is not only beneficial for various speech signal processing applications, such as source localization or speech enhancement, but also facilitates intuitive voice control and interaction with sma…

Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks

Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, et al. · 2023

Computer science Sociology Philosophy

From hearing aids to augmented and virtual reality devices, binaural speech enhancement algorithms have been established as state-of-the-art techniques to improve speech intelligibility and listening comfort. In this paper, we present an e…
