End-to-End Speaker-Dependent Voice Activity Detection
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2009.09906
Voice activity detection (VAD) is an essential pre-processing step for tasks such as automatic speech recognition (ASR) and speaker recognition. Its basic goal is to remove silent segments from an audio recording, while a more general VAD system could remove all irrelevant segments, such as noise and even unwanted speech from non-target speakers. We define the task of detecting only the speech of the target speaker as speaker-dependent voice activity detection (SDVAD). This task is quite common in real applications and is usually implemented by performing speaker verification (SV) on audio segments extracted by VAD. In this paper, we propose an end-to-end neural-network-based approach to this problem that explicitly incorporates the speaker identity into the modeling process. Moreover, inference can be performed in an online fashion, which leads to low system latency. Experiments are carried out on a conversational telephone dataset generated from the Switchboard corpus. Results show that our proposed online approach achieves significantly better performance than the usual VAD/SV system in terms of both frame accuracy and F-score. We also use our previously proposed segment-level metric for a more comprehensive analysis.
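The abstract reports results as frame accuracy and F-score but this record does not give their exact definitions. The following is a minimal sketch that assumes the standard binary definitions, with target-speaker speech as the positive class and one 0/1 label per frame. The function name `frame_metrics` and the toy label sequences are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def frame_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (frame accuracy, F-score) for binary per-frame labels.

    Assumption: 1 marks a frame of target-speaker speech, 0 marks anything
    else (silence, noise, or non-target speech).
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and equally long")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # target speech, detected
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # non-target, rejected
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # non-target, wrongly kept
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # target speech, missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f_score


# Toy example: six frames of hypothetical reference and predicted labels.
reference = [0, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 0, 0, 1]
accuracy, f_score = frame_metrics(reference, predicted)
print(f"frame accuracy = {accuracy:.3f}, F-score = {f_score:.3f}")
```

Frame accuracy alone can look good when target speech is rare, which is one plausible reason to report an F-score next to it.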
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2009.09906 (PDF: https://arxiv.org/pdf/2009.09906)
- OA Status: green
- Cited By: 2
- References: 20
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3087422378
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3087422378
- DOI: https://doi.org/10.48550/arxiv.2009.09906
- Title: End-to-End Speaker-Dependent Voice Activity Detection
- Type: preprint
- Language: en
- Publication year: 2020
- Publication date: 2020-09-21
- Authors (in order): Yefei Chen, Shuai Wang, Yanmin Qian, Kai Yu
- Landing page: https://arxiv.org/abs/2009.09906
- PDF URL: https://arxiv.org/pdf/2009.09906
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2009.09906
- Concepts (attached by OpenAlex): Speech recognition, Computer science, Voice activity detection, Speaker diarisation, Speaker recognition, Latency (audio), Task (project management), Inference, Speech processing, Frame (networking), Process (computing), Artificial intelligence, Telecommunications, Economics, Operating system, Management
- Cited by: 2
- Citations by year (recent): 2021: 2
- References (count): 20
- Related works (count): 10
Full payload
| id | https://openalex.org/W3087422378 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2009.09906 |
| ids.doi | https://doi.org/10.48550/arxiv.2009.09906 |
| ids.mag | 3087422378 |
| ids.openalex | https://openalex.org/W3087422378 |
| fwci | |
| type | preprint |
| title | End-to-End Speaker-Dependent Voice Activity Detection |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10201 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9998999834060669 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Speech Recognition and Synthesis |
| topics[1].id | https://openalex.org/T10860 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9998999834060669 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1711 |
| topics[1].subfield.display_name | Signal Processing |
| topics[1].display_name | Speech and Audio Processing |
| topics[2].id | https://openalex.org/T11309 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9980000257492065 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1711 |
| topics[2].subfield.display_name | Signal Processing |
| topics[2].display_name | Music and Audio Processing |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C28490314 |
| concepts[0].level | 1 |
| concepts[0].score | 0.820976972579956 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[0].display_name | Speech recognition |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.815491795539856 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204201278 |
| concepts[2].level | 3 |
| concepts[2].score | 0.7618936896324158 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1332614 |
| concepts[2].display_name | Voice activity detection |
| concepts[3].id | https://openalex.org/C149838564 |
| concepts[3].level | 3 |
| concepts[3].score | 0.7297342419624329 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q7574248 |
| concepts[3].display_name | Speaker diarisation |
| concepts[4].id | https://openalex.org/C133892786 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6609026193618774 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q1145189 |
| concepts[4].display_name | Speaker recognition |
| concepts[5].id | https://openalex.org/C82876162 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5993220806121826 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q17096504 |
| concepts[5].display_name | Latency (audio) |
| concepts[6].id | https://openalex.org/C2780451532 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5513314604759216 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[6].display_name | Task (project management) |
| concepts[7].id | https://openalex.org/C2776214188 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5020081996917725 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[7].display_name | Inference |
| concepts[8].id | https://openalex.org/C61328038 |
| concepts[8].level | 2 |
| concepts[8].score | 0.443744957447052 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q3358061 |
| concepts[8].display_name | Speech processing |
| concepts[9].id | https://openalex.org/C126042441 |
| concepts[9].level | 2 |
| concepts[9].score | 0.42494115233421326 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q1324888 |
| concepts[9].display_name | Frame (networking) |
| concepts[10].id | https://openalex.org/C98045186 |
| concepts[10].level | 2 |
| concepts[10].score | 0.41453057527542114 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q205663 |
| concepts[10].display_name | Process (computing) |
| concepts[11].id | https://openalex.org/C154945302 |
| concepts[11].level | 1 |
| concepts[11].score | 0.34192168712615967 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[11].display_name | Artificial intelligence |
| concepts[12].id | https://openalex.org/C76155785 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q418 |
| concepts[12].display_name | Telecommunications |
| concepts[13].id | https://openalex.org/C162324750 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[13].display_name | Economics |
| concepts[14].id | https://openalex.org/C111919701 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[14].display_name | Operating system |
| concepts[15].id | https://openalex.org/C187736073 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q2920921 |
| concepts[15].display_name | Management |
| keywords[0].id | https://openalex.org/keywords/speech-recognition |
| keywords[0].score | 0.820976972579956 |
| keywords[0].display_name | Speech recognition |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.815491795539856 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/voice-activity-detection |
| keywords[2].score | 0.7618936896324158 |
| keywords[2].display_name | Voice activity detection |
| keywords[3].id | https://openalex.org/keywords/speaker-diarisation |
| keywords[3].score | 0.7297342419624329 |
| keywords[3].display_name | Speaker diarisation |
| keywords[4].id | https://openalex.org/keywords/speaker-recognition |
| keywords[4].score | 0.6609026193618774 |
| keywords[4].display_name | Speaker recognition |
| keywords[5].id | https://openalex.org/keywords/latency |
| keywords[5].score | 0.5993220806121826 |
| keywords[5].display_name | Latency (audio) |
| keywords[6].id | https://openalex.org/keywords/task |
| keywords[6].score | 0.5513314604759216 |
| keywords[6].display_name | Task (project management) |
| keywords[7].id | https://openalex.org/keywords/inference |
| keywords[7].score | 0.5020081996917725 |
| keywords[7].display_name | Inference |
| keywords[8].id | https://openalex.org/keywords/speech-processing |
| keywords[8].score | 0.443744957447052 |
| keywords[8].display_name | Speech processing |
| keywords[9].id | https://openalex.org/keywords/frame |
| keywords[9].score | 0.42494115233421326 |
| keywords[9].display_name | Frame (networking) |
| keywords[10].id | https://openalex.org/keywords/process |
| keywords[10].score | 0.41453057527542114 |
| keywords[10].display_name | Process (computing) |
| keywords[11].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[11].score | 0.34192168712615967 |
| keywords[11].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2009.09906 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2009.09906 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2009.09906 |
| locations[1].id | doi:10.48550/arxiv.2009.09906 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2009.09906 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5012756702 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1414-2045 |
| authorships[0].author.display_name | Yefei Chen |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yefei Chen |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100328312 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-7897-2024 |
| authorships[1].author.display_name | Shuai Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Shuai Wang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100341993 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0314-3790 |
| authorships[2].author.display_name | Yanmin Qian |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yanmin Qian |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5043098653 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-7102-9826 |
| authorships[3].author.display_name | Kai Yu |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Kai Yu |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2009.09906 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | End-to-End Speaker-Dependent Voice Activity Detection |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10201 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9998999834060669 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Speech Recognition and Synthesis |
| related_works | https://openalex.org/W2206035908, https://openalex.org/W2162158162, https://openalex.org/W4247736853, https://openalex.org/W1493012537, https://openalex.org/W1999004162, https://openalex.org/W2175373321, https://openalex.org/W2125642021, https://openalex.org/W1521049138, https://openalex.org/W2938358845, https://openalex.org/W2997340161 |
| cited_by_count | 2 |
| counts_by_year[0].year | 2021 |
| counts_by_year[0].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2009.09906 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2009.09906 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2009.09906 |
| primary_location.id | pmh:oai:arXiv.org:2009.09906 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2009.09906 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2009.09906 |
| publication_date | 2020-09-21 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W2069095950, https://openalex.org/W2106214098, https://openalex.org/W1999454387, https://openalex.org/W2079623482, https://openalex.org/W2032474878, https://openalex.org/W1991899119, https://openalex.org/W296042737, https://openalex.org/W2401833940, https://openalex.org/W2408468399, https://openalex.org/W2032362923, https://openalex.org/W2048497537, https://openalex.org/W2395750323, https://openalex.org/W2283417180, https://openalex.org/W2150769028, https://openalex.org/W2623155250, https://openalex.org/W2059203007, https://openalex.org/W2406262283, https://openalex.org/W2130426352, https://openalex.org/W2651834199, https://openalex.org/W2403186097 |
| referenced_works_count | 20 |
| abstract_inverted_index.A | 20 |
| abstract_inverted_index.a | 32, 139, 181 |
| abstract_inverted_index.In | 94 |
| abstract_inverted_index.We | 53, 172 |
| abstract_inverted_index.an | 5, 29, 99, 125 |
| abstract_inverted_index.as | 12, 44, 66 |
| abstract_inverted_index.be | 122 |
| abstract_inverted_index.by | 83 |
| abstract_inverted_index.in | 77, 124, 164 |
| abstract_inverted_index.is | 4, 23, 74 |
| abstract_inverted_index.of | 166 |
| abstract_inverted_index.on | 88, 138 |
| abstract_inverted_index.to | 24, 105, 130 |
| abstract_inverted_index.we | 97 |
| abstract_inverted_index.VAD | 35 |
| abstract_inverted_index.all | 39 |
| abstract_inverted_index.and | 17, 46, 80, 170 |
| abstract_inverted_index.are | 135 |
| abstract_inverted_index.can | 121 |
| abstract_inverted_index.for | 9, 180 |
| abstract_inverted_index.low | 131 |
| abstract_inverted_index.our | 151, 175 |
| abstract_inverted_index.out | 137 |
| abstract_inverted_index.the | 40, 55, 60, 63, 112, 116, 145, 160 |
| abstract_inverted_index.(SV) | 87 |
| abstract_inverted_index.This | 72 |
| abstract_inverted_index.VAD. | 93 |
| abstract_inverted_index.also | 173 |
| abstract_inverted_index.both | 167 |
| abstract_inverted_index.even | 47 |
| abstract_inverted_index.from | 50, 62, 92, 144 |
| abstract_inverted_index.goal | 22 |
| abstract_inverted_index.into | 115 |
| abstract_inverted_index.more | 33, 182 |
| abstract_inverted_index.only | 58 |
| abstract_inverted_index.real | 78 |
| abstract_inverted_index.show | 149 |
| abstract_inverted_index.step | 8 |
| abstract_inverted_index.such | 11, 43 |
| abstract_inverted_index.task | 73 |
| abstract_inverted_index.than | 159 |
| abstract_inverted_index.that | 150 |
| abstract_inverted_index.this | 95, 107 |
| abstract_inverted_index.used | 174 |
| abstract_inverted_index.(ASR) | 16 |
| abstract_inverted_index.(VAD) | 3 |
| abstract_inverted_index.Voice | 0 |
| abstract_inverted_index.audio | 89 |
| abstract_inverted_index.based | 103 |
| abstract_inverted_index.basic | 21 |
| abstract_inverted_index.could | 37 |
| abstract_inverted_index.frame | 168 |
| abstract_inverted_index.leads | 129 |
| abstract_inverted_index.noise | 45 |
| abstract_inverted_index.quite | 75 |
| abstract_inverted_index.takes | 111 |
| abstract_inverted_index.task, | 56 |
| abstract_inverted_index.tasks | 10 |
| abstract_inverted_index.terms | 165 |
| abstract_inverted_index.usual | 161 |
| abstract_inverted_index.voice | 68 |
| abstract_inverted_index.which | 57, 109, 128 |
| abstract_inverted_index.while | 31 |
| abstract_inverted_index.VAD/SV | 162 |
| abstract_inverted_index.audio, | 30 |
| abstract_inverted_index.better | 157 |
| abstract_inverted_index.common | 76 |
| abstract_inverted_index.define | 54 |
| abstract_inverted_index.metric | 179 |
| abstract_inverted_index.neural | 101 |
| abstract_inverted_index.online | 126, 153 |
| abstract_inverted_index.paper, | 96 |
| abstract_inverted_index.remove | 25, 38 |
| abstract_inverted_index.silent | 26 |
| abstract_inverted_index.speech | 14, 49, 61 |
| abstract_inverted_index.system | 36, 132, 163 |
| abstract_inverted_index.target | 64 |
| abstract_inverted_index.within | 28 |
| abstract_inverted_index.Results | 148 |
| abstract_inverted_index.address | 106 |
| abstract_inverted_index.carried | 136 |
| abstract_inverted_index.corpus. | 147 |
| abstract_inverted_index.dataset | 142 |
| abstract_inverted_index.detects | 59 |
| abstract_inverted_index.general | 34 |
| abstract_inverted_index.network | 102 |
| abstract_inverted_index.propose | 98 |
| abstract_inverted_index.speaker | 18, 85, 113 |
| abstract_inverted_index.usually | 81 |
| abstract_inverted_index.(SDVAD). | 71 |
| abstract_inverted_index.F-score. | 171 |
| abstract_inverted_index.accuracy | 169 |
| abstract_inverted_index.achieves | 155 |
| abstract_inverted_index.activity | 1, 69 |
| abstract_inverted_index.approach | 104, 154 |
| abstract_inverted_index.fashion, | 127 |
| abstract_inverted_index.identity | 114 |
| abstract_inverted_index.latency. | 133 |
| abstract_inverted_index.modeling | 117 |
| abstract_inverted_index.problem, | 108 |
| abstract_inverted_index.process. | 118 |
| abstract_inverted_index.proposed | 152, 177 |
| abstract_inverted_index.segments | 27, 42, 90 |
| abstract_inverted_index.speaker, | 65 |
| abstract_inverted_index.unwanted | 48 |
| abstract_inverted_index.Moreover, | 119 |
| abstract_inverted_index.analysis. | 184 |
| abstract_inverted_index.automatic | 13 |
| abstract_inverted_index.detection | 2, 70 |
| abstract_inverted_index.essential | 6 |
| abstract_inverted_index.extracted | 91 |
| abstract_inverted_index.generated | 143 |
| abstract_inverted_index.inference | 120 |
| abstract_inverted_index.performed | 123 |
| abstract_inverted_index.speakers. | 52 |
| abstract_inverted_index.telephone | 141 |
| abstract_inverted_index.end-to-end | 100 |
| abstract_inverted_index.explicitly | 110 |
| abstract_inverted_index.irrelevant | 41 |
| abstract_inverted_index.non-target | 51 |
| abstract_inverted_index.performing | 84 |
| abstract_inverted_index.previously | 176 |
| abstract_inverted_index.Experiments | 134 |
| abstract_inverted_index.Switchboard | 146 |
| abstract_inverted_index.implemented | 82 |
| abstract_inverted_index.performance | 158 |
| abstract_inverted_index.recognition | 15 |
| abstract_inverted_index.applications | 79 |
| abstract_inverted_index.recognition. | 19 |
| abstract_inverted_index.verification | 86 |
| abstract_inverted_index.comprehensive | 183 |
| abstract_inverted_index.segment-level | 178 |
| abstract_inverted_index.significantly | 156 |
| abstract_inverted_index.conversational | 140 |
| abstract_inverted_index.pre-processing | 7 |
| abstract_inverted_index.speaker-dependent | 67 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.5400000214576721 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile |
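
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the zero-based positions where it occurs. Below is a minimal sketch of how such an index can be turned back into running text, assuming the payload has already been loaded into a Python dict. The function name `rebuild_abstract` is hypothetical, and the excerpt keeps only the first seven tokens with their first positions as listed in the table.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions map by sorting on position."""
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    # Join tokens in ascending position order; gaps in the index are skipped.
    return " ".join(by_position[i] for i in sorted(by_position))


# Excerpt of the index from the table (first occurrence of each token only).
excerpt = {
    "Voice": [0],
    "activity": [1],
    "detection": [2],
    "(VAD)": [3],
    "is": [4],
    "an": [5],
    "essential": [6],
}
print(rebuild_abstract(excerpt))
```

Because punctuation is stored as part of each token (for example `task,` and `VAD.`), joining on single spaces is enough to recover readable text.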