Deep Learning for Audio Signal Processing

Related Concepts

Computer science, Deep learning, Audio signal processing, Speech recognition, Audio signal, Artificial intelligence, Convolutional neural network, Speech processing, Key (lock), Signal processing, Pattern recognition (psychology), Speech coding, Digital signal processing, Computer hardware, Computer security

H.‐G. Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath

2019 · Open Access · DOI: https://doi.org/10.1109/jstsp.2019.2908700 · OA: W2931364255

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side by side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.
