Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments

2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2208.13100

This research work concerns recent developments in speech recognition. It analyses isolated digit recognition at different bit rates and at different noise levels. The experiments were carried out using Audacity and the HTK toolkit, with a Hidden Markov Model (HMM) as the recognition model. The feature extraction techniques used are Mel-frequency cepstral coefficients (MFCC), linear predictive coding (LPC), perceptual linear prediction (PLP), mel spectrum (MELSPEC), and filter bank (FBANK). Three types of noise were considered for testing: random noise, fan noise, and random noise in a real-time environment. This was done to identify the environment best suited to real-time applications. Further, five commonly used bit rates at different sampling rates were considered to find the optimal bit rate.
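The abstract does not list the five bit rates or the sampling rates that were tested. As a rough illustration of how the two quantities relate, the sketch below computes the bit rate of uncompressed PCM audio as sampling rate × bit depth × channels. The function name and the example values (8 kHz and 16 kHz, 16-bit mono) are assumptions chosen for illustration, not settings taken from the paper, and compressed encodings would not follow this formula.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Return the bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


# Illustrative values only; these are not the paper's settings.
for rate in (8000, 16000):
    kbps = pcm_bit_rate(rate, bit_depth=16) // 1000
    print(f"{rate} Hz, 16-bit mono -> {kbps} kbit/s")
```

Halving the sampling rate at a fixed bit depth halves the PCM bit rate, which is why the two parameters have to be reported together when comparing encodings.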

Related Topics

Computer Science

Artificial Intelligence

Concepts

Speech recognition, Computer science, Linear predictive coding, Mel-frequency cepstrum, Hidden Markov model, Noise (video), Pattern recognition (psychology), Coding (social sciences), Linear prediction, Artificial intelligence, Feature (linguistics), Feature extraction, Cepstrum, Speech coding, Statistics, Mathematics, Philosophy, Image (mathematics), Linguistics

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2208.13100
PDF: https://arxiv.org/pdf/2208.13100
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4293790420

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4293790420
Canonical identifier for this work in OpenAlex

DOI: https://doi.org/10.48550/arxiv.2208.13100
Digital Object Identifier

Title: Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments
Work title

Type: preprint
OpenAlex work type

Language: en
Primary language

Publication year: 2022
Year of publication

Publication date: 2022-08-27
Full publication date if available

Authors: Muskan Garg, Naveen Aggarwal
List of authors in order

Landing page: https://arxiv.org/abs/2208.13100
Publisher landing page

PDF URL: https://arxiv.org/pdf/2208.13100
Direct link to full text PDF

Open access: Yes
Whether a free full text is available

OA status: green
Open access status per OpenAlex

OA URL: https://arxiv.org/pdf/2208.13100
Direct OA link when available

Concepts: Speech recognition, Computer science, Linear predictive coding, Mel-frequency cepstrum, Hidden Markov model, Noise (video), Pattern recognition (psychology), Coding (social sciences), Linear prediction, Artificial intelligence, Feature (linguistics), Feature extraction, Cepstrum, Speech coding, Statistics, Mathematics, Philosophy, Image (mathematics), Linguistics
Top concepts (fields/topics) attached by OpenAlex

Cited by: 0
Total citation count in OpenAlex

Related works (count): 10
Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4293790420
doi	https://doi.org/10.48550/arxiv.2208.13100
ids.doi	https://doi.org/10.48550/arxiv.2208.13100
ids.openalex	https://openalex.org/W4293790420
fwci
type	preprint
title	Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T10860
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9965000152587891
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1711
topics[0].subfield.display_name	Signal Processing
topics[0].display_name	Speech and Audio Processing
topics[1].id	https://openalex.org/T10901
topics[1].field.id	https://openalex.org/fields/17
topics[1].field.display_name	Computer Science
topics[1].score	0.9624999761581421
topics[1].domain.id	https://openalex.org/domains/3
topics[1].domain.display_name	Physical Sciences
topics[1].subfield.id	https://openalex.org/subfields/1707
topics[1].subfield.display_name	Computer Vision and Pattern Recognition
topics[1].display_name	Advanced Data Compression Techniques
topics[2].id	https://openalex.org/T11447
topics[2].field.id	https://openalex.org/fields/17
topics[2].field.display_name	Computer Science
topics[2].score	0.9556999802589417
topics[2].domain.id	https://openalex.org/domains/3
topics[2].domain.display_name	Physical Sciences
topics[2].subfield.id	https://openalex.org/subfields/1711
topics[2].subfield.display_name	Signal Processing
topics[2].display_name	Blind Source Separation Techniques
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C28490314
concepts[0].level	1
concepts[0].score	0.7032358646392822
concepts[0].wikidata	https://www.wikidata.org/wiki/Q189436
concepts[0].display_name	Speech recognition
concepts[1].id	https://openalex.org/C41008148
concepts[1].level	0
concepts[1].score	0.6956835985183716
concepts[1].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[1].display_name	Computer science
concepts[2].id	https://openalex.org/C59883199
concepts[2].level	3
concepts[2].score	0.679691731929779
concepts[2].wikidata	https://www.wikidata.org/wiki/Q1826438
concepts[2].display_name	Linear predictive coding
concepts[3].id	https://openalex.org/C151989614
concepts[3].level	3
concepts[3].score	0.6347624063491821
concepts[3].wikidata	https://www.wikidata.org/wiki/Q440370
concepts[3].display_name	Mel-frequency cepstrum
concepts[4].id	https://openalex.org/C23224414
concepts[4].level	2
concepts[4].score	0.5786162614822388
concepts[4].wikidata	https://www.wikidata.org/wiki/Q176769
concepts[4].display_name	Hidden Markov model
concepts[5].id	https://openalex.org/C99498987
concepts[5].level	3
concepts[5].score	0.5764116048812866
concepts[5].wikidata	https://www.wikidata.org/wiki/Q2210247
concepts[5].display_name	Noise (video)
concepts[6].id	https://openalex.org/C153180895
concepts[6].level	2
concepts[6].score	0.5525259971618652
concepts[6].wikidata	https://www.wikidata.org/wiki/Q7148389
concepts[6].display_name	Pattern recognition (psychology)
concepts[7].id	https://openalex.org/C179518139
concepts[7].level	2
concepts[7].score	0.45377475023269653
concepts[7].wikidata	https://www.wikidata.org/wiki/Q5140297
concepts[7].display_name	Coding (social sciences)
concepts[8].id	https://openalex.org/C131109320
concepts[8].level	2
concepts[8].score	0.45296338200569153
concepts[8].wikidata	https://www.wikidata.org/wiki/Q581012
concepts[8].display_name	Linear prediction
concepts[9].id	https://openalex.org/C154945302
concepts[9].level	1
concepts[9].score	0.4390665590763092
concepts[9].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[9].display_name	Artificial intelligence
concepts[10].id	https://openalex.org/C2776401178
concepts[10].level	2
concepts[10].score	0.43666544556617737
concepts[10].wikidata	https://www.wikidata.org/wiki/Q12050496
concepts[10].display_name	Feature (linguistics)
concepts[11].id	https://openalex.org/C52622490
concepts[11].level	2
concepts[11].score	0.4243229925632477
concepts[11].wikidata	https://www.wikidata.org/wiki/Q1026626
concepts[11].display_name	Feature extraction
concepts[12].id	https://openalex.org/C88485024
concepts[12].level	2
concepts[12].score	0.41156163811683655
concepts[12].wikidata	https://www.wikidata.org/wiki/Q1054571
concepts[12].display_name	Cepstrum
concepts[13].id	https://openalex.org/C13895895
concepts[13].level	2
concepts[13].score	0.33854958415031433
concepts[13].wikidata	https://www.wikidata.org/wiki/Q3270773
concepts[13].display_name	Speech coding
concepts[14].id	https://openalex.org/C105795698
concepts[14].level	1
concepts[14].score	0.15372538566589355
concepts[14].wikidata	https://www.wikidata.org/wiki/Q12483
concepts[14].display_name	Statistics
concepts[15].id	https://openalex.org/C33923547
concepts[15].level	0
concepts[15].score	0.14616790413856506
concepts[15].wikidata	https://www.wikidata.org/wiki/Q395
concepts[15].display_name	Mathematics
concepts[16].id	https://openalex.org/C138885662
concepts[16].level	0
concepts[16].score	0.0
concepts[16].wikidata	https://www.wikidata.org/wiki/Q5891
concepts[16].display_name	Philosophy
concepts[17].id	https://openalex.org/C115961682
concepts[17].level	2
concepts[17].score	0.0
concepts[17].wikidata	https://www.wikidata.org/wiki/Q860623
concepts[17].display_name	Image (mathematics)
concepts[18].id	https://openalex.org/C41895202
concepts[18].level	1
concepts[18].score	0.0
concepts[18].wikidata	https://www.wikidata.org/wiki/Q8162
concepts[18].display_name	Linguistics
keywords[0].id	https://openalex.org/keywords/speech-recognition
keywords[0].score	0.7032358646392822
keywords[0].display_name	Speech recognition
keywords[1].id	https://openalex.org/keywords/computer-science
keywords[1].score	0.6956835985183716
keywords[1].display_name	Computer science
keywords[2].id	https://openalex.org/keywords/linear-predictive-coding
keywords[2].score	0.679691731929779
keywords[2].display_name	Linear predictive coding
keywords[3].id	https://openalex.org/keywords/mel-frequency-cepstrum
keywords[3].score	0.6347624063491821
keywords[3].display_name	Mel-frequency cepstrum
keywords[4].id	https://openalex.org/keywords/hidden-markov-model
keywords[4].score	0.5786162614822388
keywords[4].display_name	Hidden Markov model
keywords[5].id	https://openalex.org/keywords/noise
keywords[5].score	0.5764116048812866
keywords[5].display_name	Noise (video)
keywords[6].id	https://openalex.org/keywords/pattern-recognition
keywords[6].score	0.5525259971618652
keywords[6].display_name	Pattern recognition (psychology)
keywords[7].id	https://openalex.org/keywords/coding
keywords[7].score	0.45377475023269653
keywords[7].display_name	Coding (social sciences)
keywords[8].id	https://openalex.org/keywords/linear-prediction
keywords[8].score	0.45296338200569153
keywords[8].display_name	Linear prediction
keywords[9].id	https://openalex.org/keywords/artificial-intelligence
keywords[9].score	0.4390665590763092
keywords[9].display_name	Artificial intelligence
keywords[10].id	https://openalex.org/keywords/feature
keywords[10].score	0.43666544556617737
keywords[10].display_name	Feature (linguistics)
keywords[11].id	https://openalex.org/keywords/feature-extraction
keywords[11].score	0.4243229925632477
keywords[11].display_name	Feature extraction
keywords[12].id	https://openalex.org/keywords/cepstrum
keywords[12].score	0.41156163811683655
keywords[12].display_name	Cepstrum
keywords[13].id	https://openalex.org/keywords/speech-coding
keywords[13].score	0.33854958415031433
keywords[13].display_name	Speech coding
keywords[14].id	https://openalex.org/keywords/statistics
keywords[14].score	0.15372538566589355
keywords[14].display_name	Statistics
keywords[15].id	https://openalex.org/keywords/mathematics
keywords[15].score	0.14616790413856506
keywords[15].display_name	Mathematics
language	en
locations[0].id	pmh:oai:arXiv.org:2208.13100
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2208.13100
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2208.13100
locations[1].id	doi:10.48550/arxiv.2208.13100
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license	cc-by
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id	https://openalex.org/licenses/cc-by
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2208.13100
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5081545838
authorships[0].author.orcid
authorships[0].author.display_name	Muskan Garg
authorships[0].author_position	first
authorships[0].raw_author_name	Garg, Muskan
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5054152952
authorships[1].author.orcid	https://orcid.org/0000-0003-1549-531X
authorships[1].author.display_name	Naveen Aggarwal
authorships[1].author_position	last
authorships[1].raw_author_name	Aggarwal, Naveen
authorships[1].is_corresponding	False
has_content.pdf	False
has_content.grobid_xml	False
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2208.13100
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2022-08-31T00:00:00
display_name	Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments
has_fulltext	False
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T10860
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9965000152587891
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1711
primary_topic.subfield.display_name	Signal Processing
primary_topic.display_name	Speech and Audio Processing
related_works	https://openalex.org/W2363056088, https://openalex.org/W2363301696, https://openalex.org/W2808395304, https://openalex.org/W4312036005, https://openalex.org/W1921152853, https://openalex.org/W2383072803, https://openalex.org/W1994313308, https://openalex.org/W2352223112, https://openalex.org/W1570840316, https://openalex.org/W1949563597
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2208.13100
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2208.13100
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2208.13100
primary_location.id	pmh:oai:arXiv.org:2208.13100
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2208.13100
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2208.13100
publication_date	2022-08-27
publication_year	2022
referenced_works_count	0
abstract_inverted_index.In	11
abstract_inverted_index.at	28, 139
abstract_inverted_index.in	8, 20, 111
abstract_inverted_index.is	3, 50
abstract_inverted_index.of	16, 23, 90, 100, 134
abstract_inverted_index.to	57, 118, 145
abstract_inverted_index.HTK	44
abstract_inverted_index.Mel	67
abstract_inverted_index.The	61
abstract_inverted_index.and	27, 43, 108
abstract_inverted_index.are	66
abstract_inverted_index.bit	25, 137, 151
abstract_inverted_index.can	124
abstract_inverted_index.fan	106
abstract_inverted_index.for	98, 126
abstract_inverted_index.has	32, 38
abstract_inverted_index.mel	80
abstract_inverted_index.out	147
abstract_inverted_index.the	21, 51, 120, 148
abstract_inverted_index.was	55, 116
abstract_inverted_index.This	0, 35, 115
abstract_inverted_index.bank	84
abstract_inverted_index.been	33, 39, 96
abstract_inverted_index.best	121
abstract_inverted_index.done	117
abstract_inverted_index.find	146
abstract_inverted_index.five	131
abstract_inverted_index.have	95
abstract_inverted_index.made	7
abstract_inverted_index.most	149
abstract_inverted_index.real	112, 127
abstract_inverted_index.this	12, 59
abstract_inverted_index.time	113, 128
abstract_inverted_index.used	56, 65, 125, 136
abstract_inverted_index.were	87, 143
abstract_inverted_index.work	2, 37
abstract_inverted_index.(HMM)	49
abstract_inverted_index.Model	48
abstract_inverted_index.There	86
abstract_inverted_index.These	102
abstract_inverted_index.about	4
abstract_inverted_index.data.	101
abstract_inverted_index.digit	18
abstract_inverted_index.model	53
abstract_inverted_index.noise	30, 92, 107, 110
abstract_inverted_index.rate.	152
abstract_inverted_index.rates	26, 138, 142
abstract_inverted_index.three	88
abstract_inverted_index.types	89, 133
abstract_inverted_index.using	41
abstract_inverted_index.which	54, 94, 123
abstract_inverted_index.work,	14
abstract_inverted_index.(LPC),	75
abstract_inverted_index.(PLP),	79
abstract_inverted_index.Coding	74
abstract_inverted_index.Hidden	46
abstract_inverted_index.Linear	72
abstract_inverted_index.Markov	47
abstract_inverted_index.filter	83
abstract_inverted_index.levels	31, 93
abstract_inverted_index.linear	77
abstract_inverted_index.noise,	105
abstract_inverted_index.random	104, 109
abstract_inverted_index.recent	5
abstract_inverted_index.speech	9
abstract_inverted_index.(MFCC),	71
abstract_inverted_index.analyse	119
abstract_inverted_index.carried	40
abstract_inverted_index.feature	62
abstract_inverted_index.include	103
abstract_inverted_index.optimum	150
abstract_inverted_index.perform	58
abstract_inverted_index.testing	99
abstract_inverted_index.(FBANK).	85
abstract_inverted_index.Cepstrum	69
abstract_inverted_index.Further,	130
abstract_inverted_index.analysis	15
abstract_inverted_index.audacity	42
abstract_inverted_index.commonly	135
abstract_inverted_index.isolated	17
abstract_inverted_index.presence	22
abstract_inverted_index.research	1, 13, 36
abstract_inverted_index.sampling	141
abstract_inverted_index.spectrum	81
abstract_inverted_index.toolkit.	45
abstract_inverted_index.Frequency	68
abstract_inverted_index.different	24, 29, 91, 132, 140
abstract_inverted_index.(MELSPEC),	82
abstract_inverted_index.Predictive	73
abstract_inverted_index.considered	97, 144
abstract_inverted_index.extraction	63
abstract_inverted_index.perceptual	76
abstract_inverted_index.performed.	34
abstract_inverted_index.predictive	78
abstract_inverted_index.techniques	64
abstract_inverted_index.coefficient	70
abstract_inverted_index.development	6
abstract_inverted_index.environment	122
abstract_inverted_index.experiment.	60
abstract_inverted_index.recognition	19, 52
abstract_inverted_index.environment.	114
abstract_inverted_index.recognition.	10
abstract_inverted_index.applications.	129
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	2
citation_normalized_percentile
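
The abstract_inverted_index entries in the payload map each token to the word positions at which it occurs in the abstract. A minimal sketch of how the text could be rebuilt from such an index follows, assuming the index has already been loaded into a Python dict. The function name rebuild_abstract is hypothetical (it is not part of any OpenAlex client), and the sample keeps only positions 0 to 6 from the payload.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> list-of-positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Trimmed sample: only the first seven positions from the payload.
sample = {
    "This": [0],
    "research": [1],
    "work": [2],
    "is": [3],
    "about": [4],
    "recent": [5],
    "development": [6],
}
print(rebuild_abstract(sample))  # This research work is about recent development
```

Sorting by position rather than by key matters because a token such as "research" appears at several positions and has to be emitted once for each of them.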