InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2403.02889
Despite the many advances of Large Language Models (LLMs) and their unprecedented rapid evolution, their impact and integration into every facet of our daily lives is limited due to various reasons. One critical factor hindering their widespread adoption is the occurrence of hallucinations, where LLMs invent answers that sound realistic, yet drift away from factual truth. In this paper, we present a novel method for detecting hallucinations in large language models, which tackles a critical issue in the adoption of these models in various real-world scenarios. Through extensive evaluations across multiple datasets and LLMs, including Llama-2, we study the hallucination levels of various recent LLMs and demonstrate the effectiveness of our method to automatically detect them. Notably, we observe up to 87% hallucinations for Llama-2 in a specific experiment, where our method achieves a Balanced Accuracy of 81%, all without relying on external knowledge.
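The abstract reports Balanced Accuracy rather than plain accuracy. Balanced accuracy is the mean of the true-positive rate and the true-negative rate, which matters when one class dominates, as with the 87% hallucination rate quoted above. The sketch below is illustrative only: the function name `balanced_accuracy` and the confusion-matrix counts are hypothetical and are not taken from the paper. It shows that a trivial detector that flags every answer as a hallucination would score high plain accuracy on such skewed data but only chance-level balanced accuracy.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of the true-positive rate and the true-negative rate."""
    tpr = tp / (tp + fn)  # share of hallucinated answers that were flagged
    tnr = tn / (tn + fp)  # share of factual answers that were left unflagged
    return (tpr + tnr) / 2


# Hypothetical counts (NOT from the paper): 100 answers, 87 hallucinated,
# scored by a trivial detector that flags every answer as a hallucination.
tp, fn, tn, fp = 87, 0, 0, 13

plain_accuracy = (tp + tn) / (tp + fn + tn + fp)
trivial_balanced = balanced_accuracy(tp, fn, tn, fp)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {trivial_balanced:.2f}")
```

Under these assumed counts the trivial detector looks strong on plain accuracy yet is no better than chance on balanced accuracy, which is why a balanced score is the more informative number when hallucinations are the majority class.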
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2403.02889, https://arxiv.org/pdf/2403.02889
- OA Status: green
- Cited By: 2
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4392538383
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4392538383 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2403.02889 (Digital Object Identifier)
- Title: InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-03-05 (full publication date if available)
- Authors: Yakir Yehuda, Itzik Malkiel, Oren Barkan, Jonathan Weill, Royi Ronen, Noam Koenigstein (list of authors in order)
- Landing page: https://arxiv.org/abs/2403.02889 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2403.02889 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2403.02889 (direct OA link when available)
- Concepts: Interrogation, Psychology, Computer science, Artificial intelligence, Political science, Law (top concepts, i.e. fields or topics, attached by OpenAlex)
- Cited by: 2 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 2 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4392538383 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2403.02889 |
| ids.doi | https://doi.org/10.48550/arxiv.2403.02889 |
| ids.openalex | https://openalex.org/W4392538383 |
| fwci | 2.92221034 |
| type | preprint |
| title | InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12553 |
| topics[0].field.id | https://openalex.org/fields/32 |
| topics[0].field.display_name | Psychology |
| topics[0].score | 0.8682000041007996 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3203 |
| topics[0].subfield.display_name | Clinical Psychology |
| topics[0].display_name | Psychedelics and Drug Studies |
| topics[1].id | https://openalex.org/T10094 |
| topics[1].field.id | https://openalex.org/fields/27 |
| topics[1].field.display_name | Medicine |
| topics[1].score | 0.7910000085830688 |
| topics[1].domain.id | https://openalex.org/domains/4 |
| topics[1].domain.display_name | Health Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2738 |
| topics[1].subfield.display_name | Psychiatry and Mental health |
| topics[1].display_name | Epilepsy research and treatment |
| topics[2].id | https://openalex.org/T11147 |
| topics[2].field.id | https://openalex.org/fields/33 |
| topics[2].field.display_name | Social Sciences |
| topics[2].score | 0.7502999901771545 |
| topics[2].domain.id | https://openalex.org/domains/2 |
| topics[2].domain.display_name | Social Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/3312 |
| topics[2].subfield.display_name | Sociology and Political Science |
| topics[2].display_name | Misinformation and Its Impacts |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2776240099 |
| concepts[0].level | 2 |
| concepts[0].score | 0.9596600532531738 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q327018 |
| concepts[0].display_name | Interrogation |
| concepts[1].id | https://openalex.org/C15744967 |
| concepts[1].level | 0 |
| concepts[1].score | 0.4419773519039154 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[1].display_name | Psychology |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.35453519225120544 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3392983078956604 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C17744445 |
| concepts[4].level | 0 |
| concepts[4].score | 0.21619686484336853 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[4].display_name | Political science |
| concepts[5].id | https://openalex.org/C199539241 |
| concepts[5].level | 1 |
| concepts[5].score | 0.16031530499458313 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q7748 |
| concepts[5].display_name | Law |
| keywords[0].id | https://openalex.org/keywords/interrogation |
| keywords[0].score | 0.9596600532531738 |
| keywords[0].display_name | Interrogation |
| keywords[1].id | https://openalex.org/keywords/psychology |
| keywords[1].score | 0.4419773519039154 |
| keywords[1].display_name | Psychology |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.35453519225120544 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.3392983078956604 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/political-science |
| keywords[4].score | 0.21619686484336853 |
| keywords[4].display_name | Political science |
| keywords[5].id | https://openalex.org/keywords/law |
| keywords[5].score | 0.16031530499458313 |
| keywords[5].display_name | Law |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2403.02889 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by-nc-sa |
| locations[0].pdf_url | https://arxiv.org/pdf/2403.02889 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-sa |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2403.02889 |
| locations[1].id | doi:10.48550/arxiv.2403.02889 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article-journal |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2403.02889 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5014189152 |
| authorships[0].author.orcid | https://orcid.org/0009-0008-1620-037X |
| authorships[0].author.display_name | Yakir Yehuda |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yehuda, Yakir |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5067773841 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-4151-9119 |
| authorships[1].author.display_name | Itzik Malkiel |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Malkiel, Itzik |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5077269072 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5059-0905 |
| authorships[2].author.display_name | Oren Barkan |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Barkan, Oren |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5057343912 |
| authorships[3].author.orcid | https://orcid.org/0009-0009-4582-9406 |
| authorships[3].author.display_name | Jonathan Weill |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Weill, Jonathan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5033389618 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Royi Ronen |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Ronen, Royi |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5083002103 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-8219-4512 |
| authorships[5].author.display_name | Noam Koenigstein |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Koenigstein, Noam |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2403.02889 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-03-07T00:00:00 |
| display_name | InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T12553 |
| primary_topic.field.id | https://openalex.org/fields/32 |
| primary_topic.field.display_name | Psychology |
| primary_topic.score | 0.8682000041007996 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3203 |
| primary_topic.subfield.display_name | Clinical Psychology |
| primary_topic.display_name | Psychedelics and Drug Studies |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2401316817, https://openalex.org/W1589588912, https://openalex.org/W2382804087, https://openalex.org/W2367830717, https://openalex.org/W2104637952, https://openalex.org/W2393800603, https://openalex.org/W2356812289, https://openalex.org/W2348542805, https://openalex.org/W2375055351 |
| cited_by_count | 2 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2403.02889 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by-nc-sa |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2403.02889 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2403.02889 |
| primary_location.id | pmh:oai:arXiv.org:2403.02889 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by-nc-sa |
| primary_location.pdf_url | https://arxiv.org/pdf/2403.02889 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2403.02889 |
| publication_date | 2024-03-05 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 61, 73, 126, 133 |
| abstract_inverted_index.In | 56 |
| abstract_inverted_index.in | 67, 76, 82, 125 |
| abstract_inverted_index.is | 25, 38 |
| abstract_inverted_index.of | 4, 21, 41, 79, 101, 109, 136 |
| abstract_inverted_index.on | 141 |
| abstract_inverted_index.to | 28, 112, 120 |
| abstract_inverted_index.up | 119 |
| abstract_inverted_index.we | 59, 96, 117 |
| abstract_inverted_index.87% | 121 |
| abstract_inverted_index.One | 31 |
| abstract_inverted_index.all | 138 |
| abstract_inverted_index.and | 9, 16, 92, 105 |
| abstract_inverted_index.due | 27 |
| abstract_inverted_index.for | 64, 123 |
| abstract_inverted_index.our | 22, 110, 130 |
| abstract_inverted_index.the | 1, 39, 77, 98, 107 |
| abstract_inverted_index.yet | 50 |
| abstract_inverted_index.81%, | 137 |
| abstract_inverted_index.LLMs | 44, 104 |
| abstract_inverted_index.away | 52 |
| abstract_inverted_index.from | 53 |
| abstract_inverted_index.into | 18 |
| abstract_inverted_index.many | 2 |
| abstract_inverted_index.that | 47 |
| abstract_inverted_index.this | 57 |
| abstract_inverted_index.LLMs, | 93 |
| abstract_inverted_index.Large | 5 |
| abstract_inverted_index.daily | 23 |
| abstract_inverted_index.drift | 51 |
| abstract_inverted_index.every | 19 |
| abstract_inverted_index.facet | 20 |
| abstract_inverted_index.issue | 75 |
| abstract_inverted_index.large | 68 |
| abstract_inverted_index.lives | 24 |
| abstract_inverted_index.novel | 62 |
| abstract_inverted_index.rapid | 12 |
| abstract_inverted_index.sound | 48 |
| abstract_inverted_index.study | 97 |
| abstract_inverted_index.their | 10, 14, 35 |
| abstract_inverted_index.them. | 115 |
| abstract_inverted_index.these | 80 |
| abstract_inverted_index.where | 43, 129 |
| abstract_inverted_index.which | 71 |
| abstract_inverted_index.(LLMs) | 8 |
| abstract_inverted_index.Models | 7 |
| abstract_inverted_index.across | 89 |
| abstract_inverted_index.detect | 114 |
| abstract_inverted_index.factor | 33 |
| abstract_inverted_index.impact | 15 |
| abstract_inverted_index.invent | 45 |
| abstract_inverted_index.levels | 100 |
| abstract_inverted_index.method | 63, 111, 131 |
| abstract_inverted_index.models | 81 |
| abstract_inverted_index.paper, | 58 |
| abstract_inverted_index.recent | 103 |
| abstract_inverted_index.truth. | 55 |
| abstract_inverted_index.Despite | 0 |
| abstract_inverted_index.Llama-2 | 124 |
| abstract_inverted_index.Through | 86 |
| abstract_inverted_index.answers | 46 |
| abstract_inverted_index.factual | 54 |
| abstract_inverted_index.limited | 26 |
| abstract_inverted_index.models, | 70 |
| abstract_inverted_index.observe | 118 |
| abstract_inverted_index.present | 60 |
| abstract_inverted_index.relying | 140 |
| abstract_inverted_index.tackles | 72 |
| abstract_inverted_index.various | 29, 83, 102 |
| abstract_inverted_index.without | 139 |
| abstract_inverted_index.Accuracy | 135 |
| abstract_inverted_index.Balanced | 134 |
| abstract_inverted_index.Language | 6 |
| abstract_inverted_index.Llama-2, | 95 |
| abstract_inverted_index.Notably, | 116 |
| abstract_inverted_index.achieves | 132 |
| abstract_inverted_index.adoption | 37, 78 |
| abstract_inverted_index.advances | 3 |
| abstract_inverted_index.critical | 32, 74 |
| abstract_inverted_index.datasets | 91 |
| abstract_inverted_index.external | 142 |
| abstract_inverted_index.language | 69 |
| abstract_inverted_index.multiple | 90 |
| abstract_inverted_index.reasons. | 30 |
| abstract_inverted_index.specific | 127 |
| abstract_inverted_index.detecting | 65 |
| abstract_inverted_index.extensive | 87 |
| abstract_inverted_index.hindering | 34 |
| abstract_inverted_index.including | 94 |
| abstract_inverted_index.evolution, | 13 |
| abstract_inverted_index.knowledge. | 143 |
| abstract_inverted_index.occurrence | 40 |
| abstract_inverted_index.real-world | 84 |
| abstract_inverted_index.realistic, | 49 |
| abstract_inverted_index.scenarios. | 85 |
| abstract_inverted_index.widespread | 36 |
| abstract_inverted_index.demonstrate | 106 |
| abstract_inverted_index.evaluations | 88 |
| abstract_inverted_index.experiment, | 128 |
| abstract_inverted_index.integration | 17 |
| abstract_inverted_index.automatically | 113 |
| abstract_inverted_index.effectiveness | 108 |
| abstract_inverted_index.hallucination | 99 |
| abstract_inverted_index.unprecedented | 11 |
| abstract_inverted_index.hallucinations | 66, 122 |
| abstract_inverted_index.hallucinations, | 42 |
| cited_by_percentile_year.max | 97 |
| cited_by_percentile_year.min | 95 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile.value | 0.80831045 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
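
The `abstract_inverted_index.*` rows in the payload above store the abstract as a map from each token to the positions where it occurs. A minimal sketch of how such an index can be turned back into running text is shown below. The function name `rebuild_abstract` and the variable `excerpt` are illustrative choices, and `excerpt` holds only the first nine token positions copied from the table rather than the full index.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions map by sorting on position."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order; gaps in the index are skipped.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload: positions 0-8 only. In the full table, "the" and
# "of" also occur at later positions, which are omitted here.
excerpt = {
    "Despite": [0],
    "the": [1],
    "many": [2],
    "advances": [3],
    "of": [4],
    "Large": [5],
    "Language": [6],
    "Models": [7],
    "(LLMs)": [8],
}

print(rebuild_abstract(excerpt))
```

Applied to the complete set of `abstract_inverted_index` rows, the same routine would be expected to reproduce the abstract shown at the top of this record, since every token position from 0 to 143 appears in the table.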