Binaural Localization Model for Speech in Noise Open

Vikas Tokala, Eric Grinstein, R. L. Brooks, Mike Brookes, Simon Doclo, et al. · 2025

Binaural acoustic source localization is important to human listeners for spatial awareness, communication and safety. In this paper, an end-to-end binaural localization model for speech in noise is presented. A lightweight convolutional r…

SONIVA: Speech recOgNItion Validation in Aphasia Open

Giulia Sanguedolce, Cathy J. Price, Sophie Brook, Dragos C. Gruia, Niamh Parkinson, et al. · 2025

Post-stroke aphasia is a major contributor to language impairment and neuro-disability worldwide, making automated assessment a critical research priority. However, the development of clinically validated automatic speech recognition (ASR)…

Observations and Biogeochemical Modeling Reveal Chlorophyll Diel Cycle With Near‐Sunset Maxima in the Red Sea Open

Yixin Wang, Matthew R. Mazloff, Ariane Verdy, Ivana Cerovečki, Malika Kheireddine, et al. · 2025

Environmental science Geology Biology

The Red Sea is an extremely warm tropical sea hosting diverse ecosystems, with marine organisms operating at the high end of their thermal tolerance. Therefore, in the context of global warming, it is increasingly important to understand t…

Binary Estimator Selection Methods for Hearing Aids With a Remote Microphone Open

Vasudha Sathyapriyan, Michael Syskind Pedersen, Mike Brookes, Jan Østergaard, Patrick A. Naylor, et al. · 2025

Using a high signal-to-noise ratio remote microphone (RM) with hearing aids (HAs) is advantageous for HA users. However, the benefit depends significantly on the properties of the wireless channel. While existing literature often assumes a…

Steered Response Power for Sound Source Localization: a tutorial review Open

Eric Grinstein, Elisa Tengan, Bilgesu Çakmak, Thomas Dietzen, Leonardo Silva Nunes, et al. · 2024

Computer science Physics

In the last three decades, the Steered Response Power (SRP) method has been widely used for the task of Sound Source Localization (SSL), due to its satisfactory localization performance on moderately reverberant and noisy scenarios. Many w…

Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora Open

Francesco Nespoli, Daniel R. Barreda, Patrick A. Naylor · 2024

Computer science Chemistry Philosophy

In recent years, automatic speech recognition (ASR) models greatly improved transcription performance both in clean, low noise, acoustic conditions and in reverberant environments. However, all these systems rely on the availability of hun…

XANE Background Acoustic Embeddings: Ablation and Clustering Analysis Open

Dushyant Sharma, James Fosburgh, Sri Harsha Dumpala, Chandramouli Shama Sastri, Stanislav Yu. Kruchinin, et al. · 2024

Computer science Engineering

We explore the recently proposed explainable acoustic neural embedding (XANE) system that models the background acoustics of a speech signal in a non-intrusive manner. The XANE embeddings are used to estimate specific parameters related to…

XANE: eXplainable Acoustic Neural Embeddings Open

Sri Harsha Dumpala, Dushyant Sharma, Chandramouli Shama Sastri, Stanislav Yu. Kruchinin, James Fosburgh, et al. · 2024

Computer science

We present a novel method for extracting neural embeddings that model the background acoustics of a speech signal. The extracted embeddings are used to estimate specific parameters related to the background acoustic properties of the signa…

The Neural-SRP method for positional sound source localization Open

Eric Grinstein, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor · 2024

Computer science Physics

Steered Response Power (SRP) is a widely used method for the task of sound source localization using microphone arrays, showing satisfactory localization performance on many practical scenarios. However, its performance is diminished under…

Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks Open

Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, et al. · 2024

Computer science Engineering

Studies have shown that in noisy acoustic environments, providing binaural signals to the user of an assistive listening device may improve speech intelligibility and spatial awareness. This paper presents a binaural speech enhancement met…

Uncertainty Quantification in Machine Learning for Joint Speaker Diarization and Identification Open

Simon W. McKnight, Aidan O. T. Hogg, Vincent W. Neo, Patrick A. Naylor · 2023

Computer science

This paper studies modulation spectrum features ($Φ$) and mel-frequency cepstral coefficients ($Ψ$) in joint speaker diarization and identification (JSID). JSID is important as speaker diarization on its own to distinguish speakers is insu…

Group Conversations in Noisy Environments (GiN) – Multimedia Recordings for Location-Aware Speech Enhancement Open

Emilie d'Olne, Alastair H. Moore, Patrick A. Naylor, Jacob Donley, Vladimir Tourbabin, et al. · 2023

Computer science Chemistry

Recent years have seen a growing interest in the use of smart glasses mounted with microphones to solve the cocktail party problem using beamforming techniques or machine learning. Many such approaches could bring substantial advances in h…

Practical utility of a head-mounted gaze-directed beamforming system Open

John F. Culling, Emilie d'Olne, Bryn D. Davies, Niamh Powell, Patrick A. Naylor · 2023

Computer science Psychology Medicine

Assistive auditory devices that enhance signal-to-noise ratio must follow the user's changing attention; errors could lead to the desired source being suppressed as noise. A method for measuring the practical benefit of attention-following…

Subspace Hybrid MVDR Beamforming for Augmented Hearing Open

Sina Hafezi, Alastair H. Moore, Pierre Guiraud, Patrick A. Naylor, Jacob Donley, et al. · 2023

Computer science Philosophy Chemistry

Signal-dependent beamformers are advantageous over signal-independent beamformers when the acoustic scenario - be it real-world or simulated - is straightforward in terms of the number of sound sources, the ambient sound field and their dy…

Polynomial Eigenvalue Decomposition for Multichannel Broadband Signal Processing: A mathematical technique offering new insights and solutions Open

Vincent W. Neo, Soydan Redif, J.G. McWhirter, Jennifer Pestana, Ian K. Proudler, et al. · 2023

Computer science Mathematics

This article is devoted to the polynomial eigenvalue decomposition (PEVD) and its applications in broadband multichannel signal processing, motivated by the optimum solutions provided by the EVD for the narrowband case [1], [2]. In gener…

Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks Open

Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, et al. · 2023

Computer science Sociology Philosophy

From hearing aids to augmented and virtual reality devices, binaural speech enhancement algorithms have been established as state-of-the-art techniques to improve speech intelligibility and listening comfort. In this paper, we present an e…

Dual input neural networks for positional sound source localization Open

Eric Grinstein, Vincent W. Neo, Patrick A. Naylor · 2023

Computer science Art Geography

In many signal processing applications, metadata may be advantageously used in conjunction with a high dimensional signal to produce a desired output. In the case of classical Sound Source Localization (SSL) algorithms, information from a …

Observations and biogeochemical modeling reveal chlorophyll diel cycle with near-sunset maxima in the Red Sea Open

Yixin Wang, Matthew R. Mazloff, Ariane Verdy, Ivana Cerovečki, Patrick A. Naylor, et al. · 2023

Environmental science Biology Geology

The Red Sea is an extremely warm tropical sea that hosts diverse ecosystems; thus, it is important to understand its ecology in the context of global warming. Using a coupled physical–biogeochemical model validated against in situ data, we…

Audio Signal Processing in the 21st Century: The important outcomes of the past 25 years Open

Gaël Richard, Paris Smaragdis, Sharon Gannot, Patrick A. Naylor, Shoji Makino, et al. · 2023

Computer science Engineering Physics

Long-term Conversation Analysis: Exploring Utility and Privacy Open

Francesco Nespoli, Jule Pohlhausen, Patrick A. Naylor, Jörg Bitzer · 2023

Computer science Mathematics Psychology

The analysis of conversations recorded in everyday life requires privacy protection. In this contribution, we explore a privacy-preserving feature extraction method based on input feature dimension reduction, spectral smoothing and the low…

Two-Stage Voice Anonymization for Enhanced Privacy Open

Francesco Nespoli, Daniel R. Barreda, Jörg Bitzer, Patrick A. Naylor · 2023

Computer science Art Economics

In recent years, the need for privacy preservation when manipulating or storing personal data, including speech, has become a major issue. In this paper, we present a system addressing the speaker-level anonymization problem. We propose a…

Graph neural networks for sound source localization on distributed microphone networks Open

Eric Grinstein, Mike Brookes, Patrick A. Naylor · 2023

Computer science Engineering Philosophy

Distributed Microphone Arrays (DMAs) present many challenges with respect to centralized microphone arrays. An important requirement of applications on these arrays is handling a variable number of input channels. We consider the use of Gr…

Subspace Hybrid Beamforming for Head-worn Microphone Arrays Open

Sina Hafezi, Alastair H. Moore, Pierre Guiraud, Patrick A. Naylor, Jacob Donley, et al. · 2023

Computer science Mathematics Philosophy

A two-stage multi-channel speech enhancement method is proposed which consists of a novel adaptive beamformer, Hybrid Minimum Variance Distortionless Response (MVDR), Isotropic-MVDR (Iso), and a novel multi-channel spectral Principal Compo…

Using a single-channel reference with the MBSTOI binaural intelligibility metric Open

Pierre Guiraud, Alastair H. Moore, Rebecca R. Vos, Patrick A. Naylor, Mike Brookes · 2023

Computer science Economics Philosophy

In order to assess the intelligibility of a target signal in a noisy environment, intrusive speech intelligibility metrics are typically used. They require a clean reference signal to be available which can be difficult to obtain especiall…

Signal Compaction Using Polynomial EVD for Spherical Array Processing With Applications Open

Vincent W. Neo, Christine Evers, Stephan Weiss, Patrick A. Naylor · 2023

Computer science Mathematics Physics

Multi-channel signals captured by spatially separated sensors often contain a high level of data redundancy. A compact signal representation enables more efficient storage and processing, which has been exploited for data compression, nois…

Uncovering the Potential for a Weakly Supervised End-to-End Model in Recognising Speech from Patient with Post-Stroke Aphasia Open

Giulia Sanguedolce, Patrick A. Naylor, Fatemeh Geranmayeh · 2023

Psychology Medicine Computer science

Post-stroke speech and language deficits (aphasia) significantly impact patients' quality of life. Many with mild symptoms remain undiagnosed, and the majority do not receive the intensive doses of therapy recommended, due to healthcare co…

The Neural-SRP Method for Universal Robust Multi-source Tracking Open

Eric Grinstein, Christopher Hicks, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor · 2023

Computer science Physics Geography

Neural networks have achieved state-of-the-art performance on the task of acoustic Direction-of-Arrival (DOA) estimation using microphone arrays. Neural models can be classified as end-to-end or hybrid, each class showing advantages and di…

Patrick A. Naylor