arXiv (Cornell University)
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
May 2025 • J. Zhao, Chunlai Zhou, Biao Qin
The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, yet its pivotal challenge lies in precisely pinpointing the moment of invocation, that is, when hallucinations arise in small LMs. Previous optimization efforts have focused primarily on post-processing techniques that are separate from the reasoning process of LMs, resulting in high computational costs and limited effectiveness. In this paper, we propose a practical invocation evaluation metric called AttenHScore, w…