None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2503.01550
Multiple-choice exam questions with "None of the above" (NA) options have been extensively studied in educational testing, where existing research suggests that they better assess true knowledge. However, their impact on Large Language Model (LLM) evaluation remains underexplored. Through systematic experiments with 28 LLMs on the MMLU benchmark, we examine how NA options affect model performance and confidence calibration. Our analysis reveals that NA options, when used as the correct answer, lead to a consistent 30-50% performance drop across models regardless of scale, suggesting that LLMs lack the meta-cognitive ability to systematically evaluate and reject all given options when none are correct. This degradation shows strong domain dependence, with minimal impact on mathematical reasoning (14.6% drop) but severe effects on tasks requiring uncertainty handling, such as business ethics (48.1% drop). Our results highlight important implications for benchmark design and raise questions about LLMs' ability to handle uncertainty in real-world applications.
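
The abstract describes questions in which the NA option is the correct answer. As a rough illustration only, the sketch below shows one plausible way to build such an item from an ordinary multiple-choice question: drop the correct option and append "None of the above" as the new correct choice. The names `MCQ`, `make_na_correct`, and `accuracy` are hypothetical, and the construction is an assumption; the abstract does not specify the authors' exact procedure or whether the reported drops are absolute or relative.

```python
from dataclasses import dataclass

NA_TEXT = "None of the above"


@dataclass
class MCQ:
    """A multiple-choice item (hypothetical structure for illustration)."""
    question: str
    options: list[str]
    answer_index: int  # index of the correct option in `options`


def make_na_correct(item: MCQ) -> MCQ:
    """Return a copy where the correct option is removed and NA is appended.

    Assumption: NA is placed last and becomes the correct answer. This is
    one possible construction, not necessarily the one used in the paper.
    """
    distractors = [
        opt for i, opt in enumerate(item.options) if i != item.answer_index
    ]
    options = distractors + [NA_TEXT]
    return MCQ(item.question, options, len(options) - 1)


def accuracy(predictions: list[int], items: list[MCQ]) -> float:
    """Fraction of items whose predicted index matches the answer index."""
    correct = sum(p == it.answer_index for p, it in zip(predictions, items))
    return correct / len(items)


original = MCQ("2 + 2 = ?", ["3", "4", "5", "6"], 1)
na_item = make_na_correct(original)
print(na_item.options)       # ['3', '5', '6', 'None of the above']
print(na_item.answer_index)  # 3
```

Comparing `accuracy` on the original items with `accuracy` on their NA variants would give one simple measure of the degradation discussed above.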
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2503.01550
- PDF: https://arxiv.org/pdf/2503.01550
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415084990
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415084990
- DOI: https://doi.org/10.48550/arxiv.2503.01550
- Title: None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-03-03
- Authors: Zhi Rui Tam, Cheng-Kuang Wu, Chieh-Yen Lin, Yun-Nung Chen
- Landing page: https://arxiv.org/abs/2503.01550
- PDF URL: https://arxiv.org/pdf/2503.01550
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2503.01550
- Cited by: 0
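
The fields listed above come from the OpenAlex work record. As a minimal sketch, the same record can be requested from the public OpenAlex works endpoint using only the Python standard library. The helper names `work_api_url` and `fetch_work` are hypothetical, and the example assumes network access and that the record is still served under this ID.

```python
import json
import urllib.request

API_ROOT = "https://api.openalex.org/works/"


def work_api_url(openalex_id: str) -> str:
    """Build the API URL from a bare ID or a full openalex.org URL."""
    short_id = openalex_id.rstrip("/").rsplit("/", 1)[-1]
    return API_ROOT + short_id


def fetch_work(openalex_id: str) -> dict:
    """Download and decode the JSON record for one work (needs network)."""
    with urllib.request.urlopen(work_api_url(openalex_id), timeout=30) as resp:
        return json.load(resp)


print(work_api_url("https://openalex.org/W4415084990"))
# https://api.openalex.org/works/W4415084990
```

Calling `fetch_work("W4415084990")` should return a dictionary whose keys correspond to the flattened rows of the payload table, for example `publication_date` and `open_access`.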
Full payload
| id | https://openalex.org/W4415084990 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2503.01550 |
| ids.doi | https://doi.org/10.48550/arxiv.2503.01550 |
| ids.openalex | https://openalex.org/W4415084990 |
| fwci | 0.0 |
| type | preprint |
| title | None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9847000241279602 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9772999882698059 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| topics[2].id | https://openalex.org/T13274 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9477999806404114 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Expert finding and Q&A systems |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.01550 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.01550 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.01550 |
| locations[1].id | doi:10.48550/arxiv.2503.01550 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2503.01550 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5028150164 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-9968-2416 |
| authorships[0].author.display_name | Zhi Rui Tam |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Tam, Zhi Rui |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5032589855 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7961-1186 |
| authorships[1].author.display_name | Chunfei Wu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wu, Cheng-Kuang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5012260456 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Chieh-Yen Lin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Lin, Chieh-Yen |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5058211616 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Yun-Nung Chen |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Chen, Yun-Nung |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.01550 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-12T00:00:00 |
| display_name | None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9847000241279602 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.01550 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.01550 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.01550 |
| primary_location.id | pmh:oai:arXiv.org:2503.01550 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.01550 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.01550 |
| publication_date | 2025-03-03 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 74 |
| abstract_inverted_index.28 | 43 |
| abstract_inverted_index.NA | 52, 64 |
| abstract_inverted_index.as | 68 |
| abstract_inverted_index.in | 14, 17, 146 |
| abstract_inverted_index.of | 5, 82 |
| abstract_inverted_index.on | 31, 45, 111, 119 |
| abstract_inverted_index.to | 73, 90, 143 |
| abstract_inverted_index.we | 49 |
| abstract_inverted_index.Our | 60, 129 |
| abstract_inverted_index.all | 95 |
| abstract_inverted_index.and | 57, 93, 137 |
| abstract_inverted_index.are | 100 |
| abstract_inverted_index.but | 116 |
| abstract_inverted_index.for | 134 |
| abstract_inverted_index.how | 51 |
| abstract_inverted_index.the | 6, 46, 69, 87 |
| abstract_inverted_index.(NA) | 8 |
| abstract_inverted_index.LLMs | 44, 85 |
| abstract_inverted_index.MMLU | 47 |
| abstract_inverted_index.This | 102 |
| abstract_inverted_index.been | 11 |
| abstract_inverted_index.drop | 78 |
| abstract_inverted_index.exam | 1 |
| abstract_inverted_index.have | 10 |
| abstract_inverted_index.lack | 86 |
| abstract_inverted_index.lead | 72 |
| abstract_inverted_index.like | 124 |
| abstract_inverted_index.none | 99 |
| abstract_inverted_index.that | 22, 63, 84 |
| abstract_inverted_index.they | 23 |
| abstract_inverted_index.true | 26 |
| abstract_inverted_index.used | 67 |
| abstract_inverted_index.when | 66, 98 |
| abstract_inverted_index.with | 3, 42, 108 |
| abstract_inverted_index."None | 4 |
| abstract_inverted_index.LLMs' | 141 |
| abstract_inverted_index.Large | 32 |
| abstract_inverted_index.about | 140 |
| abstract_inverted_index.drop) | 115 |
| abstract_inverted_index.given | 96 |
| abstract_inverted_index.model | 55 |
| abstract_inverted_index.raise | 138 |
| abstract_inverted_index.shows | 104 |
| abstract_inverted_index.tasks | 120 |
| abstract_inverted_index.their | 29 |
| abstract_inverted_index.which | 18 |
| abstract_inverted_index.(LLMs) | 35 |
| abstract_inverted_index.Models | 34 |
| abstract_inverted_index.above" | 7 |
| abstract_inverted_index.across | 79 |
| abstract_inverted_index.affect | 54 |
| abstract_inverted_index.assess | 25 |
| abstract_inverted_index.better | 24 |
| abstract_inverted_index.design | 136 |
| abstract_inverted_index.domain | 106 |
| abstract_inverted_index.drop). | 128 |
| abstract_inverted_index.ethics | 126 |
| abstract_inverted_index.handle | 144 |
| abstract_inverted_index.impact | 30, 110 |
| abstract_inverted_index.models | 80 |
| abstract_inverted_index.reject | 94 |
| abstract_inverted_index.severe | 117 |
| abstract_inverted_index.strong | 105 |
| abstract_inverted_index.(14.6\% | 114 |
| abstract_inverted_index.(48.1\% | 127 |
| abstract_inverted_index.30-50\% | 76 |
| abstract_inverted_index.Through | 39 |
| abstract_inverted_index.ability | 89, 142 |
| abstract_inverted_index.answer, | 71 |
| abstract_inverted_index.correct | 70 |
| abstract_inverted_index.effects | 118 |
| abstract_inverted_index.examine | 50 |
| abstract_inverted_index.minimal | 109 |
| abstract_inverted_index.options | 9, 53, 97 |
| abstract_inverted_index.remains | 37 |
| abstract_inverted_index.results | 130 |
| abstract_inverted_index.reveals | 62 |
| abstract_inverted_index.studied | 13 |
| abstract_inverted_index.However, | 28 |
| abstract_inverted_index.Language | 33 |
| abstract_inverted_index.analysis | 61 |
| abstract_inverted_index.business | 125 |
| abstract_inverted_index.correct. | 101 |
| abstract_inverted_index.evaluate | 92 |
| abstract_inverted_index.existing | 19 |
| abstract_inverted_index.handling | 123 |
| abstract_inverted_index.options, | 65 |
| abstract_inverted_index.research | 20 |
| abstract_inverted_index.suggests | 21 |
| abstract_inverted_index.testing, | 16 |
| abstract_inverted_index.benchmark | 135 |
| abstract_inverted_index.highlight | 131 |
| abstract_inverted_index.important | 132 |
| abstract_inverted_index.questions | 2, 139 |
| abstract_inverted_index.reasoning | 113 |
| abstract_inverted_index.requiring | 121 |
| abstract_inverted_index.benchmark, | 48 |
| abstract_inverted_index.confidence | 58 |
| abstract_inverted_index.consistent | 75 |
| abstract_inverted_index.evaluation | 36 |
| abstract_inverted_index.knowledge. | 27 |
| abstract_inverted_index.real-world | 147 |
| abstract_inverted_index.regardless | 81 |
| abstract_inverted_index.systematic | 40 |
| abstract_inverted_index.degradation | 103 |
| abstract_inverted_index.dependence, | 107 |
| abstract_inverted_index.educational | 15 |
| abstract_inverted_index.experiments | 41 |
| abstract_inverted_index.extensively | 12 |
| abstract_inverted_index.performance | 56, 77 |
| abstract_inverted_index.uncertainty | 122, 145 |
| abstract_inverted_index.calibration. | 59 |
| abstract_inverted_index.implications | 133 |
| abstract_inverted_index.mathematical | 112 |
| abstract_inverted_index.applications. | 148 |
| abstract_inverted_index.meta-cognitive | 88 |
| abstract_inverted_index.systematically | 91 |
| abstract_inverted_index.underexplored. | 38 |
| abstract_inverted_index.Multiple-choice | 0 |
| abstract_inverted_index.scale--suggesting | 83 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
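
The `abstract_inverted_index.*` rows in the table store the abstract as a mapping from each token to the positions where it occurs. A minimal sketch of turning such a mapping back into running text is shown below. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is trimmed by hand to the first eight positions from the table.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions mapping.

    Each (position, token) pair is collected, sorted by position,
    and the tokens are joined with single spaces.
    """
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    positioned.sort()
    return " ".join(token for _, token in positioned)


# First eight positions only, copied from the rows above.
sample = {
    "Multiple-choice": [0],
    "exam": [1],
    "questions": [2],
    "with": [3],
    '"None': [4],
    "of": [5],
    "the": [6],
    'above"': [7],
}
print(rebuild_abstract(sample))
# Multiple-choice exam questions with "None of the above"
```

Applied to the full index, the same function would be expected to reproduce the abstract token by token, including any escape sequences stored in the tokens.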