Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell
Taiming Lu, Muhan Gao, Kuai Yu, Adam Byerly, Daniel Khashabi
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2406.14673
Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
Concepts
- Psychology
- Computer science
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2406.14673
- PDF: https://arxiv.org/pdf/2406.14673
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4399986568