StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models
2023 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2311.16509
We propose StyleCap, a method to generate natural language descriptions of speaking styles appearing in speech. Although most conventional techniques for para-/non-linguistic information recognition focus on category classification or intensity estimation of pre-defined labels, they cannot explain the reasoning behind a recognition result in an interpretable manner. StyleCap is a first step towards an end-to-end method for generating speaking-style prompts from speech, i.e., automatic speaking-style captioning. StyleCap is trained with paired data of speech and natural language descriptions. We train neural networks that convert a speech representation vector into prefix vectors that are fed into a large language model (LLM)-based text decoder. We explore which text decoder and speech feature representation are suitable for this new task. The experimental results demonstrate that StyleCap, when it leverages richer LLMs for the text decoder, speech self-supervised learning (SSL) features, and sentence rephrasing augmentation, improves the accuracy and diversity of the generated speaking-style captions. Samples of speaking-style captions generated by StyleCap are publicly available.
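The prefix mechanism described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the mapping network's architecture, so a single untrained linear layer stands in for it, and the names `make_linear`, `speech_to_prefix`, `prepend_prefix` and all dimensions are hypothetical. The sketch only shows the data flow: one speech representation vector is projected and reshaped into several prefix vectors, which are placed in front of the caption token embeddings before they reach a text decoder.

```python
import random


def make_linear(in_dim, out_dim, seed=0):
    """Return an untrained (random) weight matrix and zero bias.

    Assumption: a single linear layer stands in for the trained mapping
    network; the real architecture is not given in the abstract.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def speech_to_prefix(speech_vec, weights, bias, prefix_len, embed_dim):
    """Project one speech vector into prefix_len vectors of size embed_dim."""
    if len(weights) != prefix_len * embed_dim:
        raise ValueError("weight rows must equal prefix_len * embed_dim")
    # Affine projection: one output value per weight row.
    flat = [sum(w * x for w, x in zip(row, speech_vec)) + b
            for row, b in zip(weights, bias)]
    # Reshape the flat output into prefix_len separate vectors.
    return [flat[i * embed_dim:(i + 1) * embed_dim]
            for i in range(prefix_len)]


def prepend_prefix(prefix_vectors, token_embeddings):
    """Place the prefix vectors in front of the caption token embeddings."""
    return prefix_vectors + token_embeddings


# Hypothetical sizes, chosen only to keep the example small.
SPEECH_DIM, PREFIX_LEN, EMBED_DIM = 8, 4, 6

weights, bias = make_linear(SPEECH_DIM, PREFIX_LEN * EMBED_DIM)
speech_vec = [0.5] * SPEECH_DIM                       # placeholder speech feature
prefix = speech_to_prefix(speech_vec, weights, bias, PREFIX_LEN, EMBED_DIM)

caption_embeddings = [[0.0] * EMBED_DIM for _ in range(3)]  # 3 dummy tokens
decoder_input = prepend_prefix(prefix, caption_embeddings)

print(len(prefix), len(prefix[0]), len(decoder_input))
```

In a real system the projection would be trained on the paired speech and description data, and the embedding size would have to match the chosen text decoder.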
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2311.16509, https://arxiv.org/pdf/2311.16509
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4389156698
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4389156698 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2311.16509 (Digital Object Identifier)
- Title: StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-11-28 (full publication date if available)
- Authors: Kazuki Yamauchi, Yusuke Ijima, Yuki Saito (list of authors in order)
- Landing page: https://arxiv.org/abs/2311.16509 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2311.16509 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2311.16509 (direct OA link when available)
- Concepts: Computer science, Closed captioning, Natural language processing, Speech recognition, Sentence, Artificial intelligence, Style (visual arts), Focus (optics), Natural language, Representation (politics), Natural language generation, Feature (linguistics), Linguistics, Philosophy, Image (mathematics), History, Physics, Politics, Optics, Political science, Law, Archaeology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4389156698 |
| doi | https://doi.org/10.48550/arxiv.2311.16509 |
| ids.doi | https://doi.org/10.48550/arxiv.2311.16509 |
| ids.openalex | https://openalex.org/W4389156698 |
| fwci | |
| type | preprint |
| title | StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9983000159263611 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T12031 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9972000122070312 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Speech and dialogue systems |
| topics[2].id | https://openalex.org/T10028 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9950000047683716 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Topic Modeling |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.8147604465484619 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C157657479 |
| concepts[1].level | 3 |
| concepts[1].score | 0.7064263820648193 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q2367247 |
| concepts[1].display_name | Closed captioning |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6739928722381592 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C28490314 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6077100038528442 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[3].display_name | Speech recognition |
| concepts[4].id | https://openalex.org/C2777530160 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5815175175666809 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q41796 |
| concepts[4].display_name | Sentence |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.5727705955505371 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C2776445246 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5426627993583679 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1792644 |
| concepts[6].display_name | Style (visual arts) |
| concepts[7].id | https://openalex.org/C192209626 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5323460698127747 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q190909 |
| concepts[7].display_name | Focus (optics) |
| concepts[8].id | https://openalex.org/C195324797 |
| concepts[8].level | 2 |
| concepts[8].score | 0.5276447534561157 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q33742 |
| concepts[8].display_name | Natural language |
| concepts[9].id | https://openalex.org/C2776359362 |
| concepts[9].level | 3 |
| concepts[9].score | 0.43999525904655457 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q2145286 |
| concepts[9].display_name | Representation (politics) |
| concepts[10].id | https://openalex.org/C2776187449 |
| concepts[10].level | 3 |
| concepts[10].score | 0.4383416175842285 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1513879 |
| concepts[10].display_name | Natural language generation |
| concepts[11].id | https://openalex.org/C2776401178 |
| concepts[11].level | 2 |
| concepts[11].score | 0.4251898229122162 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q12050496 |
| concepts[11].display_name | Feature (linguistics) |
| concepts[12].id | https://openalex.org/C41895202 |
| concepts[12].level | 1 |
| concepts[12].score | 0.2886231541633606 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[12].display_name | Linguistics |
| concepts[13].id | https://openalex.org/C138885662 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[13].display_name | Philosophy |
| concepts[14].id | https://openalex.org/C115961682 |
| concepts[14].level | 2 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[14].display_name | Image (mathematics) |
| concepts[15].id | https://openalex.org/C95457728 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q309 |
| concepts[15].display_name | History |
| concepts[16].id | https://openalex.org/C121332964 |
| concepts[16].level | 0 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[16].display_name | Physics |
| concepts[17].id | https://openalex.org/C94625758 |
| concepts[17].level | 2 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q7163 |
| concepts[17].display_name | Politics |
| concepts[18].id | https://openalex.org/C120665830 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q14620 |
| concepts[18].display_name | Optics |
| concepts[19].id | https://openalex.org/C17744445 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[19].display_name | Political science |
| concepts[20].id | https://openalex.org/C199539241 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q7748 |
| concepts[20].display_name | Law |
| concepts[21].id | https://openalex.org/C166957645 |
| concepts[21].level | 1 |
| concepts[21].score | 0.0 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q23498 |
| concepts[21].display_name | Archaeology |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.8147604465484619 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/closed-captioning |
| keywords[1].score | 0.7064263820648193 |
| keywords[1].display_name | Closed captioning |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.6739928722381592 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/speech-recognition |
| keywords[3].score | 0.6077100038528442 |
| keywords[3].display_name | Speech recognition |
| keywords[4].id | https://openalex.org/keywords/sentence |
| keywords[4].score | 0.5815175175666809 |
| keywords[4].display_name | Sentence |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.5727705955505371 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/style |
| keywords[6].score | 0.5426627993583679 |
| keywords[6].display_name | Style (visual arts) |
| keywords[7].id | https://openalex.org/keywords/focus |
| keywords[7].score | 0.5323460698127747 |
| keywords[7].display_name | Focus (optics) |
| keywords[8].id | https://openalex.org/keywords/natural-language |
| keywords[8].score | 0.5276447534561157 |
| keywords[8].display_name | Natural language |
| keywords[9].id | https://openalex.org/keywords/representation |
| keywords[9].score | 0.43999525904655457 |
| keywords[9].display_name | Representation (politics) |
| keywords[10].id | https://openalex.org/keywords/natural-language-generation |
| keywords[10].score | 0.4383416175842285 |
| keywords[10].display_name | Natural language generation |
| keywords[11].id | https://openalex.org/keywords/feature |
| keywords[11].score | 0.4251898229122162 |
| keywords[11].display_name | Feature (linguistics) |
| keywords[12].id | https://openalex.org/keywords/linguistics |
| keywords[12].score | 0.2886231541633606 |
| keywords[12].display_name | Linguistics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2311.16509 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by-sa |
| locations[0].pdf_url | https://arxiv.org/pdf/2311.16509 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by-sa |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2311.16509 |
| locations[1].id | doi:10.48550/arxiv.2311.16509 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2311.16509 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5108574971 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Kazuki Yamauchi |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yamauchi, Kazuki |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5068604686 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yusuke Ijima |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ijima, Yusuke |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5083394213 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-7967-2613 |
| authorships[2].author.display_name | Yuki Saito |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Saito, Yuki |
| authorships[2].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2311.16509 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2023-11-30T00:00:00 |
| display_name | StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9983000159263611 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W2050523636, https://openalex.org/W3009270862, https://openalex.org/W2152921782, https://openalex.org/W382594479, https://openalex.org/W2470045054, https://openalex.org/W2575772232, https://openalex.org/W2151245229, https://openalex.org/W2140902089, https://openalex.org/W2030298461, https://openalex.org/W1510553545 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2311.16509 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by-sa |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2311.16509 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-sa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2311.16509 |
| primary_location.id | pmh:oai:arXiv.org:2311.16509 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by-sa |
| primary_location.pdf_url | https://arxiv.org/pdf/2311.16509 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by-sa |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2311.16509 |
| publication_date | 2023-11-28 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 52, 87, 98 |
| abstract_inverted_index.We | 0, 81, 105 |
| abstract_inverted_index.an | 47, 56, 107 |
| abstract_inverted_index.by | 157 |
| abstract_inverted_index.in | 14, 46 |
| abstract_inverted_index.is | 51, 70 |
| abstract_inverted_index.of | 10, 18, 34, 42, 75, 148, 153 |
| abstract_inverted_index.on | 26 |
| abstract_inverted_index.or | 30 |
| abstract_inverted_index.to | 5 |
| abstract_inverted_index.The | 120 |
| abstract_inverted_index.and | 77, 111, 139, 146 |
| abstract_inverted_index.are | 95, 160 |
| abstract_inverted_index.fed | 96 |
| abstract_inverted_index.for | 21, 59, 116, 130 |
| abstract_inverted_index.new | 118 |
| abstract_inverted_index.our | 125, 158 |
| abstract_inverted_index.the | 27, 31, 40, 43, 131, 144 |
| abstract_inverted_index.LLMs | 129 |
| abstract_inverted_index.data | 74 |
| abstract_inverted_index.from | 63 |
| abstract_inverted_index.into | 91, 97 |
| abstract_inverted_index.most | 17 |
| abstract_inverted_index.step | 54 |
| abstract_inverted_index.text | 103, 109, 132 |
| abstract_inverted_index.that | 85, 94, 124 |
| abstract_inverted_index.they | 37 |
| abstract_inverted_index.this | 117 |
| abstract_inverted_index.with | 72 |
| abstract_inverted_index.(SSL) | 137 |
| abstract_inverted_index.first | 53 |
| abstract_inverted_index.focus | 25 |
| abstract_inverted_index.i.e., | 65 |
| abstract_inverted_index.large | 99 |
| abstract_inverted_index.model | 101 |
| abstract_inverted_index.task. | 119 |
| abstract_inverted_index.train | 82 |
| abstract_inverted_index.cannot | 38 |
| abstract_inverted_index.method | 4, 58 |
| abstract_inverted_index.neural | 83 |
| abstract_inverted_index.paired | 73 |
| abstract_inverted_index.prefix | 92 |
| abstract_inverted_index.result | 45 |
| abstract_inverted_index.richer | 128 |
| abstract_inverted_index.speech | 76, 88, 112, 134 |
| abstract_inverted_index.styles | 12 |
| abstract_inverted_index.vector | 90 |
| abstract_inverted_index.Samples | 152 |
| abstract_inverted_index.convert | 86 |
| abstract_inverted_index.decoder | 110 |
| abstract_inverted_index.explore | 106 |
| abstract_inverted_index.feature | 113 |
| abstract_inverted_index.labels, | 36 |
| abstract_inverted_index.manner. | 49 |
| abstract_inverted_index.natural | 7, 78 |
| abstract_inverted_index.prompts | 62 |
| abstract_inverted_index.propose | 1 |
| abstract_inverted_index.provide | 39 |
| abstract_inverted_index.results | 122 |
| abstract_inverted_index.speech, | 64 |
| abstract_inverted_index.speech. | 15 |
| abstract_inverted_index.towards | 55 |
| abstract_inverted_index.trained | 71 |
| abstract_inverted_index.vectors | 93 |
| abstract_inverted_index.Although | 16 |
| abstract_inverted_index.StyleCap | 50, 69, 126, 159 |
| abstract_inverted_index.accuracy | 145 |
| abstract_inverted_index.captions | 155 |
| abstract_inverted_index.category | 28 |
| abstract_inverted_index.decoder, | 133 |
| abstract_inverted_index.decoder. | 104 |
| abstract_inverted_index.generate | 6 |
| abstract_inverted_index.improves | 143 |
| abstract_inverted_index.language | 8, 79, 100 |
| abstract_inverted_index.learning | 136 |
| abstract_inverted_index.networks | 84 |
| abstract_inverted_index.publicly | 161 |
| abstract_inverted_index.sentence | 140 |
| abstract_inverted_index.speaking | 11 |
| abstract_inverted_index.suitable | 115 |
| abstract_inverted_index.StyleCap, | 2 |
| abstract_inverted_index.appearing | 13 |
| abstract_inverted_index.automatic | 66 |
| abstract_inverted_index.captions. | 151 |
| abstract_inverted_index.diversity | 147 |
| abstract_inverted_index.features, | 138 |
| abstract_inverted_index.generated | 149, 156 |
| abstract_inverted_index.intensity | 32 |
| abstract_inverted_index.reasoning | 41 |
| abstract_inverted_index.available. | 162 |
| abstract_inverted_index.end-to-end | 57 |
| abstract_inverted_index.estimation | 33 |
| abstract_inverted_index.generating | 60 |
| abstract_inverted_index.leveraging | 127 |
| abstract_inverted_index.rephrasing | 141 |
| abstract_inverted_index.techniques | 20 |
| abstract_inverted_index.(LLM)-based | 102 |
| abstract_inverted_index.appropriate | 108 |
| abstract_inverted_index.captioning. | 68 |
| abstract_inverted_index.demonstrate | 123 |
| abstract_inverted_index.information | 23 |
| abstract_inverted_index.pre-defined | 35 |
| abstract_inverted_index.recognition | 24, 44 |
| abstract_inverted_index.augmentation | 142 |
| abstract_inverted_index.conventional | 19 |
| abstract_inverted_index.descriptions | 9 |
| abstract_inverted_index.experimental | 121 |
| abstract_inverted_index.descriptions. | 80 |
| abstract_inverted_index.interpretable | 48 |
| abstract_inverted_index.classification | 29 |
| abstract_inverted_index.representation | 89, 114 |
| abstract_inverted_index.speaking-style | 61, 67, 150, 154 |
| abstract_inverted_index.self-supervised | 135 |
| abstract_inverted_index.para-/non-linguistic | 22 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.8199999928474426 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile |
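
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the positions where it occurs. The sketch below shows one way to turn such rows back into running text. The helper names `parse_rows` and `rebuild_abstract` are hypothetical, the row format is assumed to match the table above, and only the rows needed for the first sentence (positions 0 to 15) are copied in.

```python
PREFIX = "abstract_inverted_index."


def parse_rows(lines):
    """Parse '| abstract_inverted_index.<word> | p1, p2 |' rows into a dict."""
    index = {}
    for line in lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # not an inverted-index row
        word = cells[0][len(PREFIX):]
        index[word] = [int(p) for p in cells[1].split(",")]
    return index


def rebuild_abstract(inverted_index, max_position=None):
    """Invert the word -> positions map and join words in position order."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            if max_position is None or pos <= max_position:
                by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Rows copied from the payload table; only those covering positions 0-15.
rows = [
    "| abstract_inverted_index.a | 3, 52, 87, 98 |",
    "| abstract_inverted_index.We | 0, 81, 105 |",
    "| abstract_inverted_index.in | 14, 46 |",
    "| abstract_inverted_index.of | 10, 18, 34, 42, 75, 148, 153 |",
    "| abstract_inverted_index.to | 5 |",
    "| abstract_inverted_index.method | 4, 58 |",
    "| abstract_inverted_index.styles | 12 |",
    "| abstract_inverted_index.natural | 7, 78 |",
    "| abstract_inverted_index.propose | 1 |",
    "| abstract_inverted_index.speech. | 15 |",
    "| abstract_inverted_index.generate | 6 |",
    "| abstract_inverted_index.language | 8, 79, 100 |",
    "| abstract_inverted_index.speaking | 11 |",
    "| abstract_inverted_index.StyleCap, | 2 |",
    "| abstract_inverted_index.appearing | 13 |",
    "| abstract_inverted_index.descriptions | 9 |",
]

first_sentence = rebuild_abstract(parse_rows(rows), max_position=15)
print(first_sentence)
```

Passing all of the inverted-index rows and leaving `max_position` unset should yield the full abstract, since every position then has exactly one word.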