Craig Jin
Non-Collaborative User Simulators for Tool Agents
Tool agents interact with users through multi-turn dialogues to accomplish various tasks. Recent studies have adopted user simulation methods to develop these agents in multi-turn settings. However, existing user simulators tend to be agen…
Speech-to-Noise Ratio and Voice-to-Noise Ratio of Voice Databases With Implications for Acoustic Voice Analysis
These databases provided voice samples with variable signal quality, ranging from low to high levels compared with recommended values. The decreased values of SNR and VNR in connected speech tasks in these databases suggested the effects o…
TB-HSU: Hierarchical 3D Scene Understanding with Contextual Affordances
The concept of function and affordance is a critical aspect of 3D scene understanding and supports task-oriented objectives. In this work, we develop a model that learns to structure and vary functional affordance across a 3D hierarchical …
Simi-SFX: A similarity-based conditioning method for controllable sound effect synthesis
Generating sound effects with controllable variations is a challenging task, traditionally addressed using sophisticated physical models that require in-depth knowledge of signal processing parameters and algorithms. In the era of generati…
Listen to Your Map: An Online Representation for Spatial Sonification
Robotic perception is becoming a key technology for navigation aids, especially helping individuals with visual impairments through spatial sonification. This paper introduces a mapping representation that accurately captures scene geometr…
ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
Neural audio synthesis methods can achieve high-fidelity and realistic sound generation by utilizing deep generative models. Such models typically rely on external labels which are often discrete as conditioning information to achieve guid…
Voice disorder recognition using machine learning: a scoping review protocol
Introduction Over the past decade, several machine learning (ML) algorithms have been investigated to assess their efficacy in detecting voice disorders. Literature indicates that ML algorithms can detect voice disorders with high accuracy…
VR/AR and hearing research: current examples and future challenges
A well-known issue in clinical audiology and hearing research is the level of abstraction of traditional experimental assessments and methods, which lack ecological validity and differ significantly from real-life experiences, often result…
HRTF Interpolation Using a Spherical Neural Process Meta-Learner
Publisher Copyright: IEEE
An investigation into the effectiveness of using acoustic touch to assist people who are blind
Wearable smart glasses are an emerging technology gaining popularity in the assistive technologies industry. Smart glasses aids typically leverage computer vision and other sensory information to translate the wearer’s surrounding into com…
HRTF Interpolation using a Spherical Neural Process Meta-Learner
Several individualization methods have recently been proposed to estimate a subject's Head-Related Transfer Function (HRTF) using convenient input modalities such as anthropometric measurements or pinnae photographs. There exists a need fo…
DDSP-SFX: Acoustically-guided sound effects generation with differentiable digital signal processing
Controlling the variations of sound effects using neural audio synthesis models has been a difficult task. Differentiable digital signal processing (DDSP) provides a lightweight solution that achieves high-quality sound synthesis while ena…
Conditional Sound Effects Generation with Regularized WGAN
Over recent years generative models utilizing deep neural networks have demonstrated outstanding capacity in synthesizing high-quality and plausible human speech and music. The majority of research in neural audio synthesis (NAS) targets s…
Acoustic touch: An auditory sensing paradigm to support close reaching for people who are blind
This work explores an auditory sensory augmentation paradigm we call acoustic touch, to assist people who are blind with reaching for close objects. The sensory augmentation system is constructed based on the Nreal augmented-reality glasse…
Improving spatial cues for hearables using a parameterized binaural CDR estimator
We investigate a speech enhancement method based on the binaural coherence-to-diffuse power ratio (CDR), which preserves auditory spatial cues for maskers and a broadside target. Conventional CDR estimators typically rely on a mathematical…
Wireless Signal Representation Techniques for Automatic Modulation Classification
In this paper, we present a comprehensive survey and detailed comparison of techniques that have been applied to the problem of identifying the type of modulation contained within received wireless signals. Known as automatic modulation cl…
A tutorial on immersive three-dimensional sound technologies
There is renewed interest in virtual auditory perception and spatial audio arising from a technological drive toward enhanced perception via mixed-reality systems. Because the various technologies for three-dimensional (3D) sound are so nu…
Perspectives on microphone array processing including sparse recovery, ray space analysis, and neural networks
Hands-free audio services supporting speech communication are playing an increasingly ubiquitous and foundational role in everyday life as services for the home and work become more automated, interactive and robotic. People will speak the…
Nano-Enhanced Drug Delivery and Therapeutic Ultrasound for Cancer Treatment and Beyond
While ultrasound is most widely known for its use in diagnostic imaging, the energy carried by ultrasound waves can be utilized to influence cell function and drug delivery. Consequently, our ability to use ultrasound energy at a given int…
Improved Multipath Time Delay Estimation Using Cepstrum Subtraction
When a motor-powered vessel travels past a fixed hydrophone in a multipath environment, a Lloyd's mirror constructive/destructive interference pattern is observed in the output spectrogram. The power cepstrum detects the periodic structure…
Fully Open-Access Passive Dry Electrodes BIOADC: Open-Electroencephalography (EEG) Re-Invented
The Open-electroencephalography (EEG) framework is a popular platform to enable EEG measurements and general purposes Brain Computer Interface experimentations. However, the current platform is limited by the number of available channels a…
Sound Source Localization in a Multipath Environment Using Convolutional Neural Networks
The propagation of sound in a shallow water environment is characterized by boundary reflections from the sea surface and sea floor. These reflections result in multiple (indirect) sound propagation paths, which can degrade the performance…
Embedded Systems Feel the Beat in New Orleans: Highlights from the IEEE Signal Processing Cup 2017 Student Competition [SP Competitions]
Presents information and highlights from the IEEE Signal Processing Cup 2017 Student Competition.
Convolutional neural networks for passive monitoring of a shallow water environment using a single sensor
A cost effective approach to remote monitoring of protected areas such as marine reserves and restricted naval waters is to use passive sonar to detect, classify, localize, and track marine vessel activity (including small boats and aut…
Design and Evaluation of Agents that Sequence and Juxtapose Short Musical Patterns in Real Time
We present and discuss the Agent Designer, a system that enables users of digital audio workstations to generate novel high-level structures for their compositions based on previous examples. The system uses variable-order Markov models an…