Explanipedia

Mel Frequency Cepstral Coefficient and its Applications: A Review Open

Zrar Kh. Abdul, Abdulbasit K. Al‐Talabani · 2022

Computer science Political science

Feature extraction and representation have a significant impact on the performance of any machine learning method. Mel Frequency Cepstrum Coefficient (MFCC) is designed to model features of audio signals and is widely used in various fields. T…

Classification of Heart Sound Signal Using Multiple Features Open

Yaseen Yaseen, Guiyoung Son, Soonil Kwon · 2018

Computer science Medicine Physics

Cardiac disorders are critical and must be diagnosed in the early stage using routine auscultation examination with high precision. Cardiac auscultation is a technique to analyze and listen to heart sound using electronic stethoscope, an e…

Classifying environmental sounds using image recognition networks Open

Venkatesh Boddapati, Andrej Petef, Jim Rasmusson, Lars Lundberg · 2017

Computer science

Automatic classification of environmental sounds, such as dog barking and glass breaking, is becoming increasingly interesting, especially for mobile devices. Most mobile devices contain both cameras and microphones, and companies that dev…

Audio-Visual Emotion Recognition in Video Clips Open

Fatemeh Noroozi, Marina Marjanović, Angelina Njeguš, Sérgio Escalera, Gholamreza Anbarjafari · 2017

Computer science Philosophy Sociology

This paper presents a multimodal emotion recognition system, which is based on the analysis of audio and visual cues. From the audio channel, Mel-Frequency Cepstral Coefficients, Filter Bank Energies and prosodic features are extracted. Fo…

Wav2Letter: an End-to-End ConvNet-based Speech Recognition System Open

Ronan Collobert, Christian Puhrsch, Gabriel Synnaeve · 2016

Computer science

This paper presents a simple end-to-end model for speech recognition, combining a convolutional network based acoustic model and a graph decoding. It is trained to output letters, with transcribed speech, without the need for force alignme…

Artificial Intelligent System for Automatic Depression Level Analysis Through Visual and Vocal Expressions Open

Asim Jan, Hongying Meng, Yona Falinie binti Abd Gaus, Fan Zhang · 2017

Computer science Philosophy

A human being's cognitive system can be simulated by artificial intelligent systems. Machines and robots equipped with cognitive capability can automatically recognize a human's mental state through their gestures and facial expressions. In…

Target-Speaker Voice Activity Detection: A Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario Open

Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Khokhlov, Mariya Korenevskaya, et al. · 2020

Computer science

Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used clustering-based diarization approaches perform rather poorly in such conditions, mainly due to the limited ability to handle overlapping speec…

Deepfake Audio Detection via MFCC Features Using Machine Learning Open

Ameer Hamza, Abdul Rehman Javed, Farkhund Iqbal, Natalia Kryvinska, Ahmad Almadhor, et al. · 2022

Computer science Physics Geography

Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches to appear real. It can include synthesizing audio, video, images, and text. Deepfakes may now produce natural-looking content, making them h…

Automatic Speech Emotion Recognition Using Machine Learning Open

Leila Kerkeni, Youssef Serrestou, Mohamed Mbarki, Kosai Raoof, Mohamed Ali Mahjoub, et al. · 2019

Computer science Sociology

Speech Emotion Recognition with deep learning Open

Hadhami Aouani, Yassine Ben Ayed · 2020

Computer science Mathematics

This paper proposes an emotion recognition system based on speech signals using a two-stage approach, namely feature extraction and a classification engine. First, two sets of features are investigated: for the first one, we extract a 42-…

Speech Recognition using MFCC Open

Siwat Suksri, Thaweesak Yingthawornsuk · 2015

Computer science Philosophy

This paper describes an approach of speech recognition by using the Mel-Scale Frequency Cepstral Coefficients (MFCC) extracted from speech signal of spoken words. Principal Component Analysis is employed as the supplement in feature dimens…

Real-Time Smart-Digital Stethoscope System for Heart Diseases Monitoring Open

Muhammad E. H. Chowdhury, Amith Khandakar, Khawla Alzoubi, Samar Mansoor, Anas Tahir, et al. · 2019

Computer science Medicine Philosophy

One of the major causes of death all over the world is heart disease or cardiac dysfunction. These diseases could be identified easily with the variations in the sound produced due to the heart activity. These sophisticated auscultations n…

Audiovisual emotion recognition in wild Open

Egils Avots, Tomasz Sapiński, Maie Bachmann, Dorota Kamińska · 2018

Computer science Psychology Physics

People express emotions through different modalities. Using both verbal and nonverbal communication channels makes it possible to create a system in which the emotional state is expressed more clearly and is therefore easier to understand. Expa…

3D CNN-Based Speech Emotion Recognition Using K-Means Clustering and Spectrograms Open

Noushin Hajarolasvadi, Hasan Demirel · 2019

Computer science

Detecting human intentions and emotions helps improve human–robot interactions. Emotion recognition has been a challenging research direction in the past decade. This paper proposes an emotion recognition system based on analysis of speech…

Multimodal Approach of Speech Emotion Recognition Using Multi-Level Multi-Head Fusion Attention-Based Recurrent Neural Network Open

Ngoc-Huynh Ho, Hyung-Jeong Yang, Soo-Hyung Kim, Guee-Sang Lee · 2020

Computer science

Speech emotion recognition is a challenging but important task in human computer interaction (HCI). As technology and understanding of emotion are progressing, it is necessary to design robust and reliable emotion recognition systems that …

Multimodal Feature-Based Surface Material Classification Open

Matti Strese, Clemens Schuwerk, Albert Iepure, Eckehard Steinbach · 2016

Computer science Engineering Materials science

When a tool is tapped on or dragged over an object surface, vibrations are induced in the tool, which can be captured using acceleration sensors. The tool-surface interaction additionally creates audible sound waves, which can be recorded …

Deep Learning Methods for Underwater Target Feature Extraction and Recognition Open

Gang Hu, Kejun Wang, Peng Yuan, Qiu Mengran, Jianfei Shi, et al. · 2018

Computer science Geology

The classification and recognition of underwater acoustic signals have long been an important research topic in the field of underwater acoustic signal processing. Currently, wavelet transform, Hilbert-Huang transform, and Mel fre…

Speech Emotion Recognition with Dual-Sequence LSTM Architecture Open

Jianyou Wang, Michael Xue, Ryan Culhane, Enmao Diao, Jie Ding, et al. · 2020

Computer science Biology

Speech Emotion Recognition (SER) has emerged as a critical component of the next generation human-machine interfacing technologies. In this work, we propose a new dual-level model that predicts emotions based on both MFCC features and mel-…

Hybrid LSTM-Transformer Model for Emotion Recognition From Speech Audio Files Open

Felicia Andayani, Lau Bee Theng, Mark Tee Kit Tsun, Caslon Chua · 2022

Computer science Philosophy Physics

Emotion is a vital component in daily human communication and it helps people understand each other. Emotion recognition plays a crucial role in developing human-computer interaction and computer-based speech emotion recognition. In a nuts…

A Novel Approach for Classification of Speech Emotions Based on Deep and Acoustic Features Open

Mehmet Bilal Er · 2020

Computer science Philosophy

The recognition and classification of emotions in speech is one of the most prominent research topics in human-computer interaction and has gained popularity in recent decades. Having recognized the feelings or emotions in …

Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients Open

Jonathan Rubin, Rui Abreu, Anurag Ganguli, Saigopal Nelaturi, Ion Matei, et al. · 2016

Computer science Medicine Physics

We describe the development of an algorithm for the automatic classification of heart sound phonocardiogram waveforms as normal, abnormal or uncertain. Our approach consists of three major components: 1) Heart sound segmentation, 2) Transfo…

Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN Open

Lianzhang Zhu, Leiming Chen, Dehai Zhao, Jiehan Zhou, Weishan Zhang · 2017

Computer science

Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of t…

Fault Detection and Diagnosis of Railway Point Machines by Sound Analysis Open

Jonguk Lee, Heesu Choi, Daihee Park, Yongwha Chung, Hee‐Young Kim, et al. · 2016

Computer science Engineering Geology

Railway point devices act as actuators that provide different routes to trains by driving switchblades from the current position to the opposite one. Point failure can significantly affect railway operations, with potentially disastrous co…

Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNs Open

Hannah Muckenhirn, Mathew Magimai.-Doss, Sebastien Marcel · 2018

Computer science

Speaker verification systems traditionally extract and model cepstral features or filter bank energies from the speech signal. In this paper, inspired by the success of neural network-based approaches to model directly raw speech signal fo…

Implementation and Comparison of Speech Emotion Recognition System Using Gaussian Mixture Model (GMM) and K- Nearest Neighbor (K-NN) Techniques Open

Rahul B. Lanjewar, Swarup S. Mathurkar, Nilesh Patel · 2015

Computer science

The kinship between man and machines has become a new trend of technology such that machines now have to respond by considering the human emotional levels. The signal processing and machine learning technologies have boosted the machine in…

Deception Detection in Videos Open

Zhe Wu, Bharat Singh, Larry S. Davis, V. S. Subrahmanian · 2018

Computer science Philosophy Economics

We present a system for covert automated deception detection using information available in a video. We study the importance of different modalities like vision, audio and text for this task. On the vision side, our system uses classifiers…

Classification of Indian Classical Music With Time-Series Matching Deep Learning Approach Open

Akhilesh Sharma, Gaurav Aggarwal, Sachit Bhardwaj, Prasun Chakrabarti, Tulika Chakrabarti, et al. · 2021

Computer science Mathematics Art

Music is a heavenly way of expressing feelings about the world. The language of music has vast diversity. For centuries, people have indulged in debates to stratify between Western and Indian Classical Music. But through this paper, an un…

Some Commonly Used Speech Feature Extraction Algorithms Open

Sabur Ajibola Alim, Nahrul Khair Alang Md Rashid · 2018

Computer science Mathematics Philosophy

Speech is a complex naturally acquired human motor ability. It is characterized in adults with the production of about 14 different sounds per second via the harmonized actions of roughly 100 muscles. Speaker recognition is the capability …

Phonocardiogram Signal Processing for Automatic Diagnosis of Congenital Heart Disorders through Fusion of Temporal and Cepstral Features Open

Sumair Aziz, Muhammad Umar Khan, Majed Alhaisoni, Tallha Akram, Muhammad Altaf · 2020

Computer science Medicine

Congenital heart disease (CHD) is a heart disorder associated with the devastating indications that result in increased mortality, increased morbidity, increased healthcare expenditure, and decreased quality of life. Ventricular Septal Def…

Acoustic-Based Emergency Vehicle Detection Using Convolutional Neural Networks Open

Van-Thuan Tran, Wei-Ho Tsai · 2020

Computer science Art Physics

This work investigates how to detect emergency vehicles such as ambulances, fire engines, and police cars based on their siren sounds. Recognizing that car drivers may sometimes be unaware of the siren warnings from the emergency vehicles,…

Mel-frequency cepstrum ≈ Mel-frequency cepstrum