Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2505.02311
The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, yet its pivotal challenge lies in precisely pinpointing the moment of invocation when hallucinations arise in small LMs. Previous optimization efforts primarily focused on post-processing techniques, which were separate from the reasoning process of LMs, resulting in high computational costs and limited effectiveness. In this paper, we propose a practical invocation evaluation metric called AttenHScore, which calculates the accumulation and propagation of hallucinations during the generation process of small LMs, continuously amplifying potential reasoning errors. By dynamically adjusting the detection threshold, we achieve more accurate real-time invocation of large LMs. Additionally, considering the limited reasoning capacity of small LMs, we leverage uncertainty-aware knowledge reorganization to help them better capture critical information from different text chunks. Extensive experiments reveal that our AttenHScore outperforms most baselines in enhancing real-time hallucination detection capabilities across multiple QA datasets, especially when addressing complex queries. Moreover, our strategies eliminate the need for additional model training and display flexibility in adapting to various transformer-based LMs.
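
The invocation idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the AttenHScore formula, so the score below is a stand-in and not the authors' metric: it simply accumulates per-token uncertainty (1 minus token probability) with a decay factor and hands the question to the large LM when the peak exceeds a threshold. The names `accumulated_uncertainty`, `answer`, `small_lm`, `large_lm`, `decay`, and `threshold` are all hypothetical, and the threshold is fixed here rather than dynamically adjusted.

```python
from typing import Callable, Sequence, Tuple


def accumulated_uncertainty(token_probs: Sequence[float], decay: float = 0.9) -> float:
    """Stand-in score (NOT AttenHScore): decayed running sum of 1 - p per token.

    Returns the peak of the running sum, so a burst of low-confidence
    tokens anywhere in the generation raises the score.
    """
    score = 0.0
    peak = 0.0
    for p in token_probs:
        score = decay * score + (1.0 - p)  # carry earlier uncertainty forward
        peak = max(peak, score)
    return peak


def answer(
    question: str,
    small_lm: Callable[[str], Tuple[str, Sequence[float]]],
    large_lm: Callable[[str], str],
    threshold: float,
) -> Tuple[str, str]:
    """Answer with the small LM; fall back to the large LM only if the score is high.

    `small_lm` is assumed to return (text, per-token probabilities);
    `large_lm` is assumed to return text only.
    """
    draft, probs = small_lm(question)
    if accumulated_uncertainty(probs) > threshold:
        return large_lm(question), "large"
    return draft, "small"
```

In this sketch the large model is called only for generations whose accumulated uncertainty crosses the threshold, which mirrors the cost-saving motivation stated in the abstract without claiming to reproduce its method.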
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.02311, https://arxiv.org/pdf/2505.02311
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415031204
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415031204 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.02311 (Digital Object Identifier)
- Title: Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-05-05 (Full publication date if available)
- Authors: J. Zhao, Chunlai Zhou, Biao Qin (List of authors in order)
- Landing page: https://arxiv.org/abs/2505.02311 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.02311 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.02311 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415031204 |
| doi | https://doi.org/10.48550/arxiv.2505.02311 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.02311 |
| ids.openalex | https://openalex.org/W4415031204 |
| fwci | |
| type | preprint |
| title | Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9954000115394592 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T12031 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9532999992370605 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Speech and dialogue systems |
| topics[2].id | https://openalex.org/T10181 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9337999820709229 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.02311 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.02311 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.02311 |
| locations[1].id | doi:10.48550/arxiv.2505.02311 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.02311 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5070851446 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6408-2314 |
| authorships[0].author.display_name | J. Zhao |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhao, Jihao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5027514426 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5687-2196 |
| authorships[1].author.display_name | Chunlai Zhou |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhou, Chunlai |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5111326441 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Biao Qin |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Qin, Biao |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.02311 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9954000115394592 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.02311 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.02311 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.02311 |
| primary_location.id | pmh:oai:arXiv.org:2505.02311 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.02311 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.02311 |
| publication_date | 2025-05-05 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 63 |
| abstract_inverted_index.By | 90 |
| abstract_inverted_index.In | 58 |
| abstract_inverted_index.QA | 147 |
| abstract_inverted_index.in | 20, 30, 51, 139, 167 |
| abstract_inverted_index.of | 3, 25, 48, 76, 82, 102, 111 |
| abstract_inverted_index.on | 38 |
| abstract_inverted_index.to | 119, 169 |
| abstract_inverted_index.we | 61, 96, 114 |
| abstract_inverted_index.The | 0 |
| abstract_inverted_index.and | 5, 13, 55, 74, 164 |
| abstract_inverted_index.for | 160 |
| abstract_inverted_index.its | 16 |
| abstract_inverted_index.our | 134, 155 |
| abstract_inverted_index.the | 23, 45, 72, 79, 93, 107, 158 |
| abstract_inverted_index.yet | 15 |
| abstract_inverted_index.LMs, | 49, 84, 113 |
| abstract_inverted_index.LMs. | 32, 104, 172 |
| abstract_inverted_index.from | 44, 126 |
| abstract_inverted_index.high | 52 |
| abstract_inverted_index.lies | 19 |
| abstract_inverted_index.more | 98 |
| abstract_inverted_index.most | 137 |
| abstract_inverted_index.need | 159 |
| abstract_inverted_index.text | 128 |
| abstract_inverted_index.that | 133 |
| abstract_inverted_index.them | 121 |
| abstract_inverted_index.this | 59 |
| abstract_inverted_index.were | 42 |
| abstract_inverted_index.when | 27, 150 |
| abstract_inverted_index.(LMs) | 9 |
| abstract_inverted_index.arise | 29 |
| abstract_inverted_index.cost, | 14 |
| abstract_inverted_index.costs | 54 |
| abstract_inverted_index.large | 4, 103 |
| abstract_inverted_index.model | 162 |
| abstract_inverted_index.small | 6, 31, 83, 112 |
| abstract_inverted_index.which | 41, 70 |
| abstract_inverted_index.across | 145 |
| abstract_inverted_index.assist | 120 |
| abstract_inverted_index.better | 122 |
| abstract_inverted_index.called | 68 |
| abstract_inverted_index.during | 78 |
| abstract_inverted_index.metric | 67 |
| abstract_inverted_index.models | 8 |
| abstract_inverted_index.moment | 24 |
| abstract_inverted_index.paper, | 60 |
| abstract_inverted_index.reveal | 132 |
| abstract_inverted_index.achieve | 97 |
| abstract_inverted_index.capture | 123 |
| abstract_inverted_index.chunks. | 129 |
| abstract_inverted_index.complex | 152 |
| abstract_inverted_index.display | 165 |
| abstract_inverted_index.efforts | 35 |
| abstract_inverted_index.errors. | 89 |
| abstract_inverted_index.focused | 37 |
| abstract_inverted_index.limited | 56, 108 |
| abstract_inverted_index.pivotal | 17 |
| abstract_inverted_index.process | 47, 81 |
| abstract_inverted_index.propose | 62 |
| abstract_inverted_index.various | 170 |
| abstract_inverted_index.Previous | 33 |
| abstract_inverted_index.accurate | 99 |
| abstract_inverted_index.adapting | 168 |
| abstract_inverted_index.balances | 11 |
| abstract_inverted_index.capacity | 110 |
| abstract_inverted_index.critical | 124 |
| abstract_inverted_index.language | 7 |
| abstract_inverted_index.leverage | 115 |
| abstract_inverted_index.multiple | 146 |
| abstract_inverted_index.paradigm | 2 |
| abstract_inverted_index.queries. | 153 |
| abstract_inverted_index.separate | 43 |
| abstract_inverted_index.training | 163 |
| abstract_inverted_index.Extensive | 130 |
| abstract_inverted_index.Moreover, | 154 |
| abstract_inverted_index.adjusting | 92 |
| abstract_inverted_index.baselines | 138 |
| abstract_inverted_index.challenge | 18 |
| abstract_inverted_index.datasets, | 148 |
| abstract_inverted_index.detection | 94, 143 |
| abstract_inverted_index.different | 127 |
| abstract_inverted_index.eliminate | 157 |
| abstract_inverted_index.enhancing | 140 |
| abstract_inverted_index.knowledge | 117 |
| abstract_inverted_index.potential | 87 |
| abstract_inverted_index.practical | 64 |
| abstract_inverted_index.precisely | 21 |
| abstract_inverted_index.primarily | 36 |
| abstract_inverted_index.real-time | 100, 141 |
| abstract_inverted_index.reasoning | 46, 88, 109 |
| abstract_inverted_index.resulting | 50 |
| abstract_inverted_index.additional | 161 |
| abstract_inverted_index.addressing | 151 |
| abstract_inverted_index.amplifying | 86 |
| abstract_inverted_index.calculates | 71 |
| abstract_inverted_index.especially | 149 |
| abstract_inverted_index.evaluation | 66 |
| abstract_inverted_index.generation | 80 |
| abstract_inverted_index.invocation | 26, 65, 101 |
| abstract_inverted_index.strategies | 156 |
| abstract_inverted_index.threshold, | 95 |
| abstract_inverted_index.AttenHScore | 135 |
| abstract_inverted_index.considering | 106 |
| abstract_inverted_index.dynamically | 91 |
| abstract_inverted_index.effectively | 10 |
| abstract_inverted_index.experiments | 131 |
| abstract_inverted_index.flexibility | 166 |
| abstract_inverted_index.information | 125 |
| abstract_inverted_index.outperforms | 136 |
| abstract_inverted_index.performance | 12 |
| abstract_inverted_index.pinpointing | 22 |
| abstract_inverted_index.propagation | 75 |
| abstract_inverted_index.techniques, | 40 |
| abstract_inverted_index.AttenHScore, | 69 |
| abstract_inverted_index.accumulation | 73 |
| abstract_inverted_index.capabilities | 144 |
| abstract_inverted_index.continuously | 85 |
| abstract_inverted_index.optimization | 34 |
| abstract_inverted_index.Additionally, | 105 |
| abstract_inverted_index.collaborative | 1 |
| abstract_inverted_index.computational | 53 |
| abstract_inverted_index.hallucination | 142 |
| abstract_inverted_index.effectiveness. | 57 |
| abstract_inverted_index.hallucinations | 28, 77 |
| abstract_inverted_index.reorganization | 118 |
| abstract_inverted_index.post-processing | 39 |
| abstract_inverted_index.transformer-based | 171 |
| abstract_inverted_index.uncertainty-aware | 116 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |
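
The `abstract_inverted_index` rows above map each word to the positions where it occurs in the abstract. As a minimal sketch, assuming the index has been loaded as a Python dict of word to position list (as in the OpenAlex JSON payload), the text can be rebuilt by placing each word at its positions and joining in order. The function name `rebuild_abstract` and the `sample` excerpt, which keeps only positions 0 through 9 from the table, are illustrative.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild plain text from a word -> positions mapping."""
    by_position: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Join the words in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the index above, restricted to positions 0-9.
sample = {
    "The": [0],
    "collaborative": [1],
    "paradigm": [2],
    "of": [3],
    "large": [4],
    "and": [5],
    "small": [6],
    "language": [7],
    "models": [8],
    "(LMs)": [9],
}

print(rebuild_abstract(sample))
# prints: The collaborative paradigm of large and small language models (LMs)
```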