The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2308.16884
We present Belebele, a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. Significantly expanding the language coverage of natural language understanding (NLU) benchmarks, this dataset enables the evaluation of text models in high-, medium-, and low-resource languages. Each question is based on a short passage from the Flores-200 dataset and has four multiple-choice answers. The questions were carefully curated to discriminate between models with different levels of general language comprehension. The English dataset on its own proves difficult enough to challenge state-of-the-art language models. Being fully parallel, this dataset enables direct comparison of model performance across all languages. We use this dataset to evaluate the capabilities of multilingual masked language models (MLMs) and large language models (LLMs). We present extensive results and find that despite significant cross-lingual transfer in English-centric LLMs, much smaller MLMs pretrained on balanced multilingual data still understand far more languages. We also observe that larger vocabulary size and conscious vocabulary construction correlate with better performance on low-resource languages. Overall, Belebele opens up new avenues for evaluating and analyzing the multilingual capabilities of NLP systems.
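Because every question appears in every language variant, per-language accuracy is directly comparable. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The `Item` record, its field names, and the language codes `lang_a` and `lang_b` are hypothetical, and the toy rows are invented for illustration. With four options per question, random guessing would score about 0.25.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One hypothetical benchmark row; field names are illustrative only."""
    question_id: str     # shared across languages because the data is parallel
    language: str        # language-variant code (placeholder values below)
    correct_choice: int  # index 0..3 of the right answer among four options


def accuracy_by_language(items, predictions):
    """Return {language: fraction of questions answered correctly}.

    `predictions` maps (question_id, language) to a predicted index 0..3.
    A missing prediction counts as wrong.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item.language] += 1
        key = (item.question_id, item.language)
        if predictions.get(key) == item.correct_choice:
            correct[item.language] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Toy data, invented for illustration; not taken from the benchmark.
items = [
    Item("q1", "lang_a", 2), Item("q2", "lang_a", 0),
    Item("q1", "lang_b", 2), Item("q2", "lang_b", 0),
]
predictions = {
    ("q1", "lang_a"): 2, ("q2", "lang_a"): 0,
    ("q1", "lang_b"): 2, ("q2", "lang_b"): 3,
}
scores = accuracy_by_language(items, predictions)
```

Since the same question IDs occur under both languages, the gap between the two scores reflects the language rather than a different question set.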
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2308.16884, https://arxiv.org/pdf/2308.16884
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4386384919
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4386384919 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2308.16884 (Digital Object Identifier)
- Title: The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-08-31 (full publication date if available)
- Authors: Lucas Bandarkar, Davis Liang, Benjamin Müller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa (list of authors in order)
- Landing page: https://arxiv.org/abs/2308.16884 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2308.16884 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2308.16884 (direct OA link when available)
- Concepts: Computer science, Benchmark (surveying), Comprehension, Natural language processing, Vocabulary, Artificial intelligence, Language model, Reading (process), Reading comprehension, Resource (disambiguation), Linguistics, Programming language, Geography, Geodesy, Computer network, Philosophy (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
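The table below lists the record as flattened key paths such as `topics[0].field.display_name`. A minimal sketch of one way to produce such rows from nested JSON follows. It assumes lists of objects are indexed with `[i]`, lists of scalars are comma-joined, and nulls become empty cells. That matches the rows shown here but is a guess at the convention, not the site's actual code. The `sample` dictionary reuses a few values from the table, and its nesting is inferred from the key paths.

```python
def flatten(value, prefix=""):
    """Yield (key_path, text) pairs for every leaf of a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list) and any(isinstance(v, (dict, list)) for v in value):
        # Lists holding nested structures get one indexed path per element.
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    elif isinstance(value, list):
        # Lists of plain scalars collapse into a single comma-joined cell.
        yield prefix, ", ".join(str(v) for v in value)
    else:
        yield prefix, "" if value is None else str(value)


# A small excerpt; the nesting is inferred from the key paths in the table.
sample = {
    "id": "https://openalex.org/W4386384919",
    "biblio": {"issue": None},
    "topics": [
        {"id": "https://openalex.org/T10181",
         "field": {"display_name": "Computer Science"}},
    ],
    "indexed_in": ["arxiv", "datacite"],
    "abstract_inverted_index": {"a": [3, 44]},
}
rows = list(flatten(sample))
```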
| Field | Value |
|---|---|
| id | https://openalex.org/W4386384919 |
| doi | https://doi.org/10.48550/arxiv.2308.16884 |
| ids.doi | https://doi.org/10.48550/arxiv.2308.16884 |
| ids.openalex | https://openalex.org/W4386384919 |
| fwci | 0.25544289 |
| type | preprint |
| title | The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9994999766349792 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9993000030517578 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| topics[2].id | https://openalex.org/T13629 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9977999925613403 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Text Readability and Simplification |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.8207255601882935 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C185798385 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7365621328353882 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1161707 |
| concepts[1].display_name | Benchmark (surveying) |
| concepts[2].id | https://openalex.org/C511192102 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6963722705841064 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q5156948 |
| concepts[2].display_name | Comprehension |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6501059532165527 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| concepts[4].id | https://openalex.org/C2777601683 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6409082412719727 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q6499736 |
| concepts[4].display_name | Vocabulary |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.6219724416732788 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C137293760 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5189035534858704 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q3621696 |
| concepts[6].display_name | Language model |
| concepts[7].id | https://openalex.org/C554936623 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4581510126590729 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q199657 |
| concepts[7].display_name | Reading (process) |
| concepts[8].id | https://openalex.org/C2778780117 |
| concepts[8].level | 3 |
| concepts[8].score | 0.45468780398368835 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q3269423 |
| concepts[8].display_name | Reading comprehension |
| concepts[9].id | https://openalex.org/C206345919 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4384937882423401 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q20380951 |
| concepts[9].display_name | Resource (disambiguation) |
| concepts[10].id | https://openalex.org/C41895202 |
| concepts[10].level | 1 |
| concepts[10].score | 0.26503026485443115 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[10].display_name | Linguistics |
| concepts[11].id | https://openalex.org/C199360897 |
| concepts[11].level | 1 |
| concepts[11].score | 0.07381138205528259 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[11].display_name | Programming language |
| concepts[12].id | https://openalex.org/C205649164 |
| concepts[12].level | 0 |
| concepts[12].score | 0.06418609619140625 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q1071 |
| concepts[12].display_name | Geography |
| concepts[13].id | https://openalex.org/C13280743 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q131089 |
| concepts[13].display_name | Geodesy |
| concepts[14].id | https://openalex.org/C31258907 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[14].display_name | Computer network |
| concepts[15].id | https://openalex.org/C138885662 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[15].display_name | Philosophy |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.8207255601882935 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/benchmark |
| keywords[1].score | 0.7365621328353882 |
| keywords[1].display_name | Benchmark (surveying) |
| keywords[2].id | https://openalex.org/keywords/comprehension |
| keywords[2].score | 0.6963722705841064 |
| keywords[2].display_name | Comprehension |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.6501059532165527 |
| keywords[3].display_name | Natural language processing |
| keywords[4].id | https://openalex.org/keywords/vocabulary |
| keywords[4].score | 0.6409082412719727 |
| keywords[4].display_name | Vocabulary |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.6219724416732788 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/language-model |
| keywords[6].score | 0.5189035534858704 |
| keywords[6].display_name | Language model |
| keywords[7].id | https://openalex.org/keywords/reading |
| keywords[7].score | 0.4581510126590729 |
| keywords[7].display_name | Reading (process) |
| keywords[8].id | https://openalex.org/keywords/reading-comprehension |
| keywords[8].score | 0.45468780398368835 |
| keywords[8].display_name | Reading comprehension |
| keywords[9].id | https://openalex.org/keywords/resource |
| keywords[9].score | 0.4384937882423401 |
| keywords[9].display_name | Resource (disambiguation) |
| keywords[10].id | https://openalex.org/keywords/linguistics |
| keywords[10].score | 0.26503026485443115 |
| keywords[10].display_name | Linguistics |
| keywords[11].id | https://openalex.org/keywords/programming-language |
| keywords[11].score | 0.07381138205528259 |
| keywords[11].display_name | Programming language |
| keywords[12].id | https://openalex.org/keywords/geography |
| keywords[12].score | 0.06418609619140625 |
| keywords[12].display_name | Geography |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2308.16884 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2308.16884 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2308.16884 |
| locations[1].id | doi:10.48550/arxiv.2308.16884 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article-journal |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2308.16884 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5029975638 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Lucas Bandarkar |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Bandarkar, Lucas |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5112447939 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Davis Liang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Liang, Davis |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5079873734 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4463-2873 |
| authorships[2].author.display_name | Benjamin Müller |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Muller, Benjamin |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5023341622 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Mikel Artetxe |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Artetxe, Mikel |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5065117717 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Satya Narayan Shukla |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Shukla, Satya Narayan |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5092732683 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Donald Husa |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Husa, Donald |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5075834790 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-7565-4303 |
| authorships[6].author.display_name | Naman Goyal |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Goyal, Naman |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5044240506 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Abhinandan Krishnan |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Krishnan, Abhinandan |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5067919401 |
| authorships[8].author.orcid | https://orcid.org/0009-0008-8296-0764 |
| authorships[8].author.display_name | Luke Zettlemoyer |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Zettlemoyer, Luke |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5054253075 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | Madian Khabsa |
| authorships[9].author_position | last |
| authorships[9].raw_author_name | Khabsa, Madian |
| authorships[9].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2308.16884 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9994999766349792 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W2378211422, https://openalex.org/W2745001401, https://openalex.org/W4321353415, https://openalex.org/W2130974462, https://openalex.org/W2028665553, https://openalex.org/W2086519370, https://openalex.org/W2082296339, https://openalex.org/W2161828220, https://openalex.org/W1972348076, https://openalex.org/W2083863157 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2308.16884 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2308.16884 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2308.16884 |
| primary_location.id | pmh:oai:arXiv.org:2308.16884 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2308.16884 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2308.16884 |
| publication_date | 2023-08-31 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 44 |
| abstract_inverted_index.We | 0, 100, 119, 146 |
| abstract_inverted_index.in | 33, 130 |
| abstract_inverted_index.is | 41 |
| abstract_inverted_index.of | 19, 30, 68, 94, 108, 177 |
| abstract_inverted_index.on | 43, 75, 137, 161 |
| abstract_inverted_index.to | 61, 81, 104 |
| abstract_inverted_index.up | 167 |
| abstract_inverted_index.122 | 11 |
| abstract_inverted_index.NLP | 178 |
| abstract_inverted_index.The | 56, 72 |
| abstract_inverted_index.all | 98 |
| abstract_inverted_index.and | 36, 51, 114, 123, 153, 172 |
| abstract_inverted_index.far | 143 |
| abstract_inverted_index.for | 170 |
| abstract_inverted_index.has | 52 |
| abstract_inverted_index.its | 76 |
| abstract_inverted_index.new | 168 |
| abstract_inverted_index.own | 77 |
| abstract_inverted_index.the | 16, 28, 48, 106, 174 |
| abstract_inverted_index.use | 101 |
| abstract_inverted_index.Each | 39 |
| abstract_inverted_index.MLMs | 135 |
| abstract_inverted_index.also | 147 |
| abstract_inverted_index.data | 140 |
| abstract_inverted_index.find | 124 |
| abstract_inverted_index.four | 53 |
| abstract_inverted_index.from | 47 |
| abstract_inverted_index.more | 144 |
| abstract_inverted_index.much | 133 |
| abstract_inverted_index.size | 152 |
| abstract_inverted_index.text | 31 |
| abstract_inverted_index.that | 125, 149 |
| abstract_inverted_index.this | 25, 89, 102 |
| abstract_inverted_index.were | 58 |
| abstract_inverted_index.with | 65, 158 |
| abstract_inverted_index.(MRC) | 8 |
| abstract_inverted_index.(NLU) | 23 |
| abstract_inverted_index.Being | 86 |
| abstract_inverted_index.LLMs, | 132 |
| abstract_inverted_index.based | 42 |
| abstract_inverted_index.fully | 87 |
| abstract_inverted_index.large | 115 |
| abstract_inverted_index.model | 95 |
| abstract_inverted_index.opens | 166 |
| abstract_inverted_index.short | 45 |
| abstract_inverted_index.still | 141 |
| abstract_inverted_index.(MLMs) | 113 |
| abstract_inverted_index.across | 97 |
| abstract_inverted_index.better | 159 |
| abstract_inverted_index.direct | 92 |
| abstract_inverted_index.enough | 80 |
| abstract_inverted_index.high-, | 34 |
| abstract_inverted_index.larger | 150 |
| abstract_inverted_index.levels | 67 |
| abstract_inverted_index.masked | 110 |
| abstract_inverted_index.models | 32, 64, 112, 117 |
| abstract_inverted_index.proves | 78 |
| abstract_inverted_index.(LLMs). | 118 |
| abstract_inverted_index.English | 73 |
| abstract_inverted_index.avenues | 169 |
| abstract_inverted_index.between | 63 |
| abstract_inverted_index.curated | 60 |
| abstract_inverted_index.dataset | 9, 26, 50, 74, 90, 103 |
| abstract_inverted_index.despite | 126 |
| abstract_inverted_index.enables | 27, 91 |
| abstract_inverted_index.general | 69 |
| abstract_inverted_index.machine | 5 |
| abstract_inverted_index.models. | 85 |
| abstract_inverted_index.natural | 20 |
| abstract_inverted_index.observe | 148 |
| abstract_inverted_index.passage | 46 |
| abstract_inverted_index.present | 1, 120 |
| abstract_inverted_index.reading | 6 |
| abstract_inverted_index.results | 122 |
| abstract_inverted_index.smaller | 134 |
| abstract_inverted_index.Belebele | 165 |
| abstract_inverted_index.Overall, | 164 |
| abstract_inverted_index.answers. | 55 |
| abstract_inverted_index.balanced | 138 |
| abstract_inverted_index.coverage | 18 |
| abstract_inverted_index.evaluate | 105 |
| abstract_inverted_index.language | 12, 17, 21, 70, 84, 111, 116 |
| abstract_inverted_index.medium-, | 35 |
| abstract_inverted_index.question | 40 |
| abstract_inverted_index.spanning | 10 |
| abstract_inverted_index.systems. | 179 |
| abstract_inverted_index.transfer | 129 |
| abstract_inverted_index.Belebele, | 2 |
| abstract_inverted_index.analyzing | 173 |
| abstract_inverted_index.carefully | 59 |
| abstract_inverted_index.challenge | 82 |
| abstract_inverted_index.conscious | 154 |
| abstract_inverted_index.correlate | 157 |
| abstract_inverted_index.different | 66 |
| abstract_inverted_index.difficult | 79 |
| abstract_inverted_index.expanding | 15 |
| abstract_inverted_index.extensive | 121 |
| abstract_inverted_index.parallel, | 88 |
| abstract_inverted_index.questions | 57 |
| abstract_inverted_index.variants. | 13 |
| abstract_inverted_index.Flores-200 | 49 |
| abstract_inverted_index.comparison | 93 |
| abstract_inverted_index.evaluating | 171 |
| abstract_inverted_index.evaluation | 29 |
| abstract_inverted_index.languages. | 38, 99, 145, 163 |
| abstract_inverted_index.pretrained | 136 |
| abstract_inverted_index.understand | 142 |
| abstract_inverted_index.vocabulary | 151, 155 |
| abstract_inverted_index.benchmarks, | 24 |
| abstract_inverted_index.performance | 96, 160 |
| abstract_inverted_index.significant | 127 |
| abstract_inverted_index.capabilities | 107, 176 |
| abstract_inverted_index.construction | 156 |
| abstract_inverted_index.discriminate | 62 |
| abstract_inverted_index.low-resource | 37, 162 |
| abstract_inverted_index.multilingual | 109, 139, 175 |
| abstract_inverted_index.Significantly | 14 |
| abstract_inverted_index.comprehension | 7 |
| abstract_inverted_index.cross-lingual | 128 |
| abstract_inverted_index.understanding | 22 |
| abstract_inverted_index.comprehension. | 71 |
| abstract_inverted_index.English-centric | 131 |
| abstract_inverted_index.multiple-choice | 4, 54 |
| abstract_inverted_index.state-of-the-art | 83 |
| cited_by_percentile_year.max | 94 |
| cited_by_percentile_year.min | 90 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 10 |
| citation_normalized_percentile.value | 0.57817245 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
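
The `abstract_inverted_index` rows in the table map each token to the word positions where it occurs, so the abstract text can be recovered by sorting tokens by position. The sketch below shows this under the assumption that positions are zero-based and contiguous, as they appear to be here. The function name `rebuild_text` is illustrative, and the `opening` dictionary copies only positions 0 to 13 from the table.

```python
def rebuild_text(inverted_index):
    """Place every token at each of its positions, then join with spaces."""
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting the (position, token) pairs restores the original word order.
    return " ".join(token for _, token in sorted(positioned))


# Positions 0-13, copied from the abstract_inverted_index rows of the table.
opening = {
    "We": [0], "present": [1], "Belebele,": [2], "a": [3],
    "multiple-choice": [4], "machine": [5], "reading": [6],
    "comprehension": [7], "(MRC)": [8], "dataset": [9],
    "spanning": [10], "122": [11], "language": [12], "variants.": [13],
}
text = rebuild_text(opening)
```

On this excerpt the function returns the first sentence of the abstract. Passing the full index would rebuild the whole abstract, since tokens with several positions are simply emitted once per position.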