A Conformer-Based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation
2021 · Open Access · DOI: https://doi.org/10.1109/asru51503.2021.9687942
We present a frontend for improving the robustness of automatic speech recognition (ASR) that jointly implements three modules within a single model: acoustic echo cancellation, speech enhancement, and speech separation. This is achieved by using a contextual enhancement neural network that can optionally make use of different types of side inputs: (1) a reference signal of the playback audio, which is necessary for echo cancellation; (2) a noise context, which is useful for speech enhancement; and (3) an embedding vector representing the voice characteristics of the target speaker of interest, which is not only critical in speech separation but also helpful for echo cancellation and speech enhancement. We present detailed evaluations to show that the joint model performs almost as well as the task-specific models and significantly reduces word error rate in noisy conditions, even when using a large-scale state-of-the-art ASR model. Compared to the noisy baseline, the joint model reduces the word error rate in low signal-to-noise ratio conditions by at least 71% on our echo cancellation dataset, 10% on our noisy dataset, and 26% on our multi-speaker dataset. Compared to task-specific models, the joint model performs within 10% on our echo cancellation dataset, 2% on the noisy dataset, and 3% on the multi-speaker dataset.
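The percentages quoted in the abstract read most naturally as relative changes in word error rate (WER), although the abstract does not say so explicitly. The following is a minimal sketch of that calculation under this assumption. The function name `relative_wer_reduction` and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER.

    Assumes both values use the same unit (e.g. percent) and that the
    baseline is strictly positive.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical values chosen only to illustrate the arithmetic.
baseline = 50.0   # WER of the noisy baseline, in percent (made up)
frontend = 14.5   # WER with the frontend applied, in percent (made up)

reduction = relative_wer_reduction(baseline, frontend)
print(f"relative WER reduction: {reduction:.0%}")
```

Under this reading, a figure such as "within 10%" of a task-specific model would be computed the same way, with the task-specific model's WER in place of the baseline.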
Related Topics
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.1109/asru51503.2021.9687942
- OA Status: green
- Cited By: 1
- References: 72
- OpenAlex ID: https://openalex.org/W3215167351
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3215167351 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1109/asru51503.2021.9687942 (Digital Object Identifier)
- Title: A Conformer-Based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2021 (year of publication)
- Publication date: 2021-12-13 (full publication date if available)
- Authors: Tom O’Malley, Arun Narayanan, Quan Wang, Alex Park, James S. Walker, N. T. Howard (list of authors in order)
- Landing page: https://doi.org/10.1109/asru51503.2021.9687942 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2111.09935 (direct OA link when available)
- Concepts: Computer science, Speech recognition, Robustness (evolution), Speech enhancement, Word error rate, Joint (building), Echo (communications protocol), Noise (video), Artificial intelligence, Noise reduction, Architectural engineering, Gene, Image (mathematics), Engineering, Biochemistry, Chemistry, Computer network (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2021: 1 (per-year citation counts, last 5 years)
- References (count): 72 (number of works referenced by this work)
Full payload
| id | https://openalex.org/W3215167351 |
|---|---|
| doi | https://doi.org/10.1109/asru51503.2021.9687942 |
| ids.doi | https://doi.org/10.1109/asru51503.2021.9687942 |
| ids.mag | 3215167351 |
| ids.openalex | https://openalex.org/W3215167351 |
| fwci | 0.17122083 |
| type | preprint |
| title | A Conformer-Based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | 311 |
| biblio.first_page | 304 |
| topics[0].id | https://openalex.org/T10860 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1711 |
| topics[0].subfield.display_name | Signal Processing |
| topics[0].display_name | Speech and Audio Processing |
| topics[1].id | https://openalex.org/T10201 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9997000098228455 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Speech Recognition and Synthesis |
| topics[2].id | https://openalex.org/T11309 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9979000091552734 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1711 |
| topics[2].subfield.display_name | Signal Processing |
| topics[2].display_name | Music and Audio Processing |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7448105812072754 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C28490314 |
| concepts[1].level | 1 |
| concepts[1].score | 0.7443621158599854 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[1].display_name | Speech recognition |
| concepts[2].id | https://openalex.org/C63479239 |
| concepts[2].level | 3 |
| concepts[2].score | 0.6022388935089111 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q7353546 |
| concepts[2].display_name | Robustness (evolution) |
| concepts[3].id | https://openalex.org/C2776182073 |
| concepts[3].level | 3 |
| concepts[3].score | 0.5631170868873596 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q7575395 |
| concepts[3].display_name | Speech enhancement |
| concepts[4].id | https://openalex.org/C40969351 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5362638235092163 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q3516228 |
| concepts[4].display_name | Word error rate |
| concepts[5].id | https://openalex.org/C18555067 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4908347725868225 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q8375051 |
| concepts[5].display_name | Joint (building) |
| concepts[6].id | https://openalex.org/C2779426996 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4480116367340088 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q18389128 |
| concepts[6].display_name | Echo (communications protocol) |
| concepts[7].id | https://openalex.org/C99498987 |
| concepts[7].level | 3 |
| concepts[7].score | 0.43694978952407837 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q2210247 |
| concepts[7].display_name | Noise (video) |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.32151180505752563 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C163294075 |
| concepts[9].level | 2 |
| concepts[9].score | 0.31037667393684387 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q581861 |
| concepts[9].display_name | Noise reduction |
| concepts[10].id | https://openalex.org/C170154142 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q150737 |
| concepts[10].display_name | Architectural engineering |
| concepts[11].id | https://openalex.org/C104317684 |
| concepts[11].level | 2 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[11].display_name | Gene |
| concepts[12].id | https://openalex.org/C115961682 |
| concepts[12].level | 2 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[12].display_name | Image (mathematics) |
| concepts[13].id | https://openalex.org/C127413603 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[13].display_name | Engineering |
| concepts[14].id | https://openalex.org/C55493867 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[14].display_name | Biochemistry |
| concepts[15].id | https://openalex.org/C185592680 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[15].display_name | Chemistry |
| concepts[16].id | https://openalex.org/C31258907 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[16].display_name | Computer network |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7448105812072754 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/speech-recognition |
| keywords[1].score | 0.7443621158599854 |
| keywords[1].display_name | Speech recognition |
| keywords[2].id | https://openalex.org/keywords/robustness |
| keywords[2].score | 0.6022388935089111 |
| keywords[2].display_name | Robustness (evolution) |
| keywords[3].id | https://openalex.org/keywords/speech-enhancement |
| keywords[3].score | 0.5631170868873596 |
| keywords[3].display_name | Speech enhancement |
| keywords[4].id | https://openalex.org/keywords/word-error-rate |
| keywords[4].score | 0.5362638235092163 |
| keywords[4].display_name | Word error rate |
| keywords[5].id | https://openalex.org/keywords/joint |
| keywords[5].score | 0.4908347725868225 |
| keywords[5].display_name | Joint (building) |
| keywords[6].id | https://openalex.org/keywords/echo |
| keywords[6].score | 0.4480116367340088 |
| keywords[6].display_name | Echo (communications protocol) |
| keywords[7].id | https://openalex.org/keywords/noise |
| keywords[7].score | 0.43694978952407837 |
| keywords[7].display_name | Noise (video) |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.32151180505752563 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/noise-reduction |
| keywords[9].score | 0.31037667393684387 |
| keywords[9].display_name | Noise reduction |
| language | en |
| locations[0].id | doi:10.1109/asru51503.2021.9687942 |
| locations[0].is_oa | False |
| locations[0].source.id | https://openalex.org/S4363606113 |
| locations[0].source.issn | |
| locations[0].source.type | conference |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].source.host_organization_lineage | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
| locations[0].landing_page_url | https://doi.org/10.1109/asru51503.2021.9687942 |
| locations[1].id | pmh:oai:arXiv.org:2111.09935 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | public-domain |
| locations[1].pdf_url | https://arxiv.org/pdf/2111.09935 |
| locations[1].version | submittedVersion |
| locations[1].raw_type | |
| locations[1].license_id | https://openalex.org/licenses/public-domain |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | http://arxiv.org/abs/2111.09935 |
| locations[2].id | mag:3215167351 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400194 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | True |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | arXiv (Cornell University) |
| locations[2].source.host_organization | https://openalex.org/I205783295 |
| locations[2].source.host_organization_name | Cornell University |
| locations[2].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | arXiv (Cornell University) |
| locations[2].landing_page_url | https://arxiv.org/pdf/2111.09935 |
| locations[3].id | doi:10.48550/arxiv.2111.09935 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S4306400194 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | True |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | arXiv (Cornell University) |
| locations[3].source.host_organization | https://openalex.org/I205783295 |
| locations[3].source.host_organization_name | Cornell University |
| locations[3].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[3].license | |
| locations[3].pdf_url | |
| locations[3].version | |
| locations[3].raw_type | article |
| locations[3].license_id | |
| locations[3].is_accepted | False |
| locations[3].is_published | |
| locations[3].raw_source_name | |
| locations[3].landing_page_url | https://doi.org/10.48550/arxiv.2111.09935 |
| indexed_in | arxiv, crossref, datacite |
| authorships[0].author.id | https://openalex.org/A5042631153 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Tom O’Malley |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Tom O'Malley |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5000078382 |
| authorships[1].author.orcid | https://orcid.org/0009-0008-3325-8928 |
| authorships[1].author.display_name | Arun Narayanan |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Arun Narayanan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5108047863 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-5483-0243 |
| authorships[2].author.display_name | Quan Wang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Quan Wang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5111429885 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Alex Park |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Alex Park |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101928782 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7184-584X |
| authorships[4].author.display_name | James S. Walker |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | James Walker |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5060708515 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-8787-6309 |
| authorships[5].author.display_name | N. T. Howard |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Nathan Howard |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2111.09935 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2021-12-06T00:00:00 |
| display_name | A Conformer-Based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10860 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1711 |
| primary_topic.subfield.display_name | Signal Processing |
| primary_topic.display_name | Speech and Audio Processing |
| cited_by_count | 1 |
| counts_by_year[0].year | 2021 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 4 |
| best_oa_location.id | pmh:oai:arXiv.org:2111.09935 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | public-domain |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2111.09935 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | https://openalex.org/licenses/public-domain |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2111.09935 |
| primary_location.id | doi:10.1109/asru51503.2021.9687942 |
| primary_location.is_oa | False |
| primary_location.source.id | https://openalex.org/S4363606113 |
| primary_location.source.issn | |
| primary_location.source.type | conference |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.source.host_organization_lineage | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
| primary_location.landing_page_url | https://doi.org/10.1109/asru51503.2021.9687942 |
| publication_date | 2021-12-13 |
| publication_year | 2021 |
| referenced_works_count | 72 |
| abstract_inverted_index.a | 2, 18, 34, 51, 65, 136 |
| abstract_inverted_index.2% | 194 |
| abstract_inverted_index.3% | 200 |
| abstract_inverted_index.We | 0, 106 |
| abstract_inverted_index.an | 76 |
| abstract_inverted_index.as | 118, 120 |
| abstract_inverted_index.at | 160 |
| abstract_inverted_index.by | 32, 159 |
| abstract_inverted_index.in | 94, 130, 154 |
| abstract_inverted_index.is | 30, 59, 69, 90 |
| abstract_inverted_index.of | 7, 44, 47, 54, 83, 87 |
| abstract_inverted_index.on | 163, 169, 175, 189, 195, 201 |
| abstract_inverted_index.to | 110, 142, 180 |
| abstract_inverted_index.(1) | 50 |
| abstract_inverted_index.(2) | 64 |
| abstract_inverted_index.(3) | 75 |
| abstract_inverted_index.10% | 168, 188 |
| abstract_inverted_index.26% | 174 |
| abstract_inverted_index.71% | 162 |
| abstract_inverted_index.ASR | 139 |
| abstract_inverted_index.and | 26, 74, 103, 124, 173, 199 |
| abstract_inverted_index.but | 97 |
| abstract_inverted_index.can | 40 |
| abstract_inverted_index.for | 4, 61, 71, 100 |
| abstract_inverted_index.low | 155 |
| abstract_inverted_index.not | 91 |
| abstract_inverted_index.our | 164, 170, 176, 190 |
| abstract_inverted_index.the | 55, 80, 84, 113, 121, 143, 146, 150, 183, 196, 202 |
| abstract_inverted_index.use | 43 |
| abstract_inverted_index.This | 29 |
| abstract_inverted_index.also | 98 |
| abstract_inverted_index.echo | 22, 62, 101, 165, 191 |
| abstract_inverted_index.even | 133 |
| abstract_inverted_index.make | 42 |
| abstract_inverted_index.only | 92 |
| abstract_inverted_index.rate | 129, 153 |
| abstract_inverted_index.show | 111 |
| abstract_inverted_index.side | 48 |
| abstract_inverted_index.that | 12, 39, 112 |
| abstract_inverted_index.well | 119 |
| abstract_inverted_index.when | 134 |
| abstract_inverted_index.word | 127, 151 |
| abstract_inverted_index.error | 128, 152 |
| abstract_inverted_index.joint | 114, 147, 184 |
| abstract_inverted_index.least | 161 |
| abstract_inverted_index.model | 115, 148, 185 |
| abstract_inverted_index.noise | 66 |
| abstract_inverted_index.noisy | 131, 144, 171, 197 |
| abstract_inverted_index.ratio | 157 |
| abstract_inverted_index.three | 15 |
| abstract_inverted_index.types | 46 |
| abstract_inverted_index.using | 33, 135 |
| abstract_inverted_index.voice | 81 |
| abstract_inverted_index.which | 58, 68, 89 |
| abstract_inverted_index.(ASR), | 11 |
| abstract_inverted_index.almost | 117 |
| abstract_inverted_index.audio, | 57 |
| abstract_inverted_index.model. | 140 |
| abstract_inverted_index.model: | 20 |
| abstract_inverted_index.neural | 37 |
| abstract_inverted_index.signal | 53 |
| abstract_inverted_index.single | 19 |
| abstract_inverted_index.speech | 9, 24, 27, 72, 95, 104 |
| abstract_inverted_index.target | 85 |
| abstract_inverted_index.useful | 70 |
| abstract_inverted_index.vector | 78 |
| abstract_inverted_index.within | 17, 187 |
| abstract_inverted_index.helpful | 99 |
| abstract_inverted_index.inputs: | 49 |
| abstract_inverted_index.jointly | 13 |
| abstract_inverted_index.models, | 123, 182 |
| abstract_inverted_index.modules | 16 |
| abstract_inverted_index.network | 38 |
| abstract_inverted_index.present | 1, 107 |
| abstract_inverted_index.reduces | 126, 149 |
| abstract_inverted_index.speaker | 86 |
| abstract_inverted_index.Compared | 141, 179 |
| abstract_inverted_index.achieved | 31 |
| abstract_inverted_index.acoustic | 21 |
| abstract_inverted_index.context, | 67 |
| abstract_inverted_index.critical | 93 |
| abstract_inverted_index.dataset, | 167, 172, 193, 198 |
| abstract_inverted_index.dataset. | 178, 204 |
| abstract_inverted_index.detailed | 108 |
| abstract_inverted_index.frontend | 3 |
| abstract_inverted_index.performs | 116, 186 |
| abstract_inverted_index.playback | 56 |
| abstract_inverted_index.automatic | 8 |
| abstract_inverted_index.baseline, | 145 |
| abstract_inverted_index.different | 45 |
| abstract_inverted_index.embedding | 77 |
| abstract_inverted_index.improving | 5 |
| abstract_inverted_index.interest, | 88 |
| abstract_inverted_index.necessary | 60 |
| abstract_inverted_index.reference | 52 |
| abstract_inverted_index.conditions | 132, 158 |
| abstract_inverted_index.contextual | 35 |
| abstract_inverted_index.implements | 14 |
| abstract_inverted_index.optionally | 41 |
| abstract_inverted_index.robustness | 6 |
| abstract_inverted_index.enhancement | 36 |
| abstract_inverted_index.evaluations | 109 |
| abstract_inverted_index.large-scale | 137 |
| abstract_inverted_index.recognition | 10 |
| abstract_inverted_index.separation, | 96 |
| abstract_inverted_index.separation. | 28 |
| abstract_inverted_index.cancellation | 102, 166, 192 |
| abstract_inverted_index.enhancement, | 25 |
| abstract_inverted_index.enhancement. | 105 |
| abstract_inverted_index.enhancement; | 73 |
| abstract_inverted_index.representing | 79 |
| abstract_inverted_index.cancellation, | 23 |
| abstract_inverted_index.cancellation; | 63 |
| abstract_inverted_index.multi-speaker | 177, 203 |
| abstract_inverted_index.significantly | 125 |
| abstract_inverted_index.task-specific | 122, 181 |
| abstract_inverted_index.characteristic | 82 |
| abstract_inverted_index.signal-to-noise | 156 |
| abstract_inverted_index.state-of-the-art | 138 |
| cited_by_percentile_year.max | 93 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.6100000143051147 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile.value | 0.36611296 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
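
The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each token to the positions where it occurs. A minimal sketch of turning such a mapping back into running text follows. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary is an excerpt limited to the first eight positions of the record (the full rows list further positions for some tokens).

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions mapping.

    Each position is assumed to be occupied by exactly one token;
    gaps in the position sequence are simply skipped.
    """
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the table rows, truncated to positions 0..7 for illustration.
sample_index = {
    "a": [2],
    "We": [0],
    "of": [7],
    "for": [4],
    "present": [1],
    "frontend": [3],
    "improving": [5],
    "robustness": [6],
}

print(rebuild_abstract(sample_index))
```

Applied to the complete set of rows, the same procedure should reproduce the abstract token by token, with punctuation attached to the tokens as stored.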