Jonathan Herzig
ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs
As large language models (LLMs) evolve from conversational assistants into autonomous agents, evaluating the safety of their actions becomes critical. Prior safety benchmarks have primarily focused on preventing generation of harmful conte…
DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs
Retrieval Augmented Generation (RAG) is a commonly used approach for enhancing large language models (LLMs) with relevant and up-to-date information. However, the retrieved sources can often contain conflicting information and it remains u…
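The abstract is cut off above, but the setting it describes is concrete enough to illustrate. Below is a minimal sketch of a retrieval-augmented answering step with an explicit conflict-detection pass; the `llm` callable and both prompts are placeholders for illustration, not the paper's actual pipeline.

```python
# Minimal sketch of surfacing conflicts among retrieved sources before
# answering. `llm` is a placeholder for any text-completion callable; the
# prompt wording is illustrative only, not the one used in the paper.
from typing import Callable, List

def answer_with_conflict_check(question: str, sources: List[str],
                               llm: Callable[[str], str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    verdict = llm(
        "Do the following sources give conflicting answers to the question?\n"
        f"Question: {question}\nSources:\n{numbered}\nAnswer yes or no."
    )
    if verdict.strip().lower().startswith("yes"):
        # Surface the disagreement instead of silently picking one source.
        return llm(
            "The sources below disagree. Summarize each distinct answer and "
            f"attribute it to its source.\nQuestion: {question}\nSources:\n{numbered}"
        )
    return llm(f"Answer the question using the sources.\n"
               f"Question: {question}\nSources:\n{numbered}")
```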
Inside-Out: Hidden Factual Knowledge in LLMs
This work presents a framework for assessing whether large language models (LLMs) encode more factual knowledge in their parameters than what they express in their outputs. While a few studies hint at this possibility, none has clearly def…
Advancing Sustainable Prototyping of Future Aircraft Cabin Designs Through Extended Reality Technologies
This paper explores the virtual development of cabin concepts for hydrogen-powered aircraft, emphasizing sustainable, safe, and comfortable transport solutions. It examines how digital technologies can accelerate product development by inv…
Applied Design Thinking in urban air mobility: creating the airtaxi cabin design of the future from a user perspective
Design Thinking is essential for user-centered cabin design concepts in future transportation vehicles, as it facilitates the identification of user needs, creative problem-solving and iterative development to ensure optimal user experienc…
Distinguishing Ignorance from Error in LLM Hallucinations
Large language models (LLMs) are susceptible to hallucinations -- factually incorrect outputs -- leading to a large body of work on detecting and mitigating such cases. We argue that it is important to distinguish between two types of hall…
DoubleDipper: Improving Long-Context LLMs via Context Recycling
Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In this work, we propose DoubleDipper, a novel In-Context-Learning method that automatically generates few…
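The abstract truncates before the method details; reading "context recycling" as building few-shot demonstrations from passages of the long input itself, a minimal sketch might look like the following. The sampling strategy and the `llm` callable are assumptions for illustration, not the paper's exact recipe.

```python
# Sketch of "context recycling": build few-shot QA demonstrations from passages
# of the long input itself, then answer the real question against the full
# context. The exact recipe (random passage sampling, one generated QA pair per
# passage) is an assumption, and `llm` is a placeholder completion callable.
import random
from typing import Callable, List

def recycled_fewshot_prompt(passages: List[str], question: str,
                            llm: Callable[[str], str], k: int = 3) -> str:
    demos = []
    for p in random.sample(passages, min(k, len(passages))):
        qa = llm("Write one question answerable from this passage, then its "
                 "answer, formatted as 'Q: ...' and 'A: ...' on separate lines.\n"
                 f"Passage: {p}")
        demos.append(f"Passage: {p}\n{qa}")
    context = "\n\n".join(passages)
    return "\n\n".join(demos) + f"\n\nPassage: {context}\nQ: {question}\nA:"
```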
TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools
Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts. To better evaluate this setting and facilitate modeling efforts, we introduce TACT - Text And Calculations through …
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating f…
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
Large language models (LLMs) are prone to hallucinations, which sparked a widespread effort to detect and prevent them. Recent work attempts to mitigate hallucinations by intervening in the model's generation, typically computing represent…
Create Your Own Cabin - Connecting Multidisciplinary User Perspectives and a Future Rescue Helicopter Concept Within the Applied XR Co-design Process
The German air rescue and healthcare system is facing a number of changes and challenges that require a rethink in the development of new rescue helicopter concepts. As part of its research activities, the German Aerospace Centre is, ther…
Representation Surgery: Theory and Practice of Affine Steering
Language models often exhibit undesirable behavior, e.g., generating toxic or gender-biased text. In the case of neural language models, an encoding of the undesirable behavior is often present in the model's representations. Thus, one nat…
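As a concrete instance of intervening on representations, a widely used affine steering function shifts a hidden state along the difference between mean activations of desired and undesired behavior. The sketch below is this generic mean-difference illustration, not necessarily the exact family of steering functions the paper analyzes.

```python
# One common affine steering function: shift a hidden state along the
# difference between mean representations of desired and undesired behavior.
# Generic illustration; the paper studies affine steering more broadly.
import numpy as np

def affine_steer(h: np.ndarray, mu_desired: np.ndarray,
                 mu_undesired: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Apply h -> h + alpha * (mu_desired - mu_undesired)."""
    return h + alpha * (mu_desired - mu_undesired)

# Toy usage: steer a random 8-d "hidden state" toward the desired mean.
rng = np.random.default_rng(0)
h = rng.normal(size=8)
steered = affine_steer(h, mu_desired=np.ones(8), mu_undesired=np.zeros(8),
                       alpha=0.5)
```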
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literatu…
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. In this work, we investigate how multilinguality during instruction tuning of …
Applying an interior VR co-design approach for the medical deployment vehicle of the future
A Comprehensive Evaluation of Tool-Assisted Generation Strategies
A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences). Various few-shot tool-us…
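To make the evaluated pattern concrete: a few-shot tool-use harness lets the model emit a tool call in its text, executes it, and feeds the result back before generation continues. The `Calculator(...)` syntax and the `llm` callable below are invented for this toy illustration.

```python
# Toy illustration of the tool-use loop evaluated in this line of work: the
# model emits a tool call in text, the harness executes it, and the result is
# appended before generation continues. `llm` is a placeholder callable and
# the Calculator(...) call syntax is invented for the example.
import re
from typing import Callable

def generate_with_calculator(prompt: str, llm: Callable[[str], str],
                             max_steps: int = 5) -> str:
    text = prompt
    for _ in range(max_steps):
        out = llm(text)
        call = re.search(r"Calculator\(([^)]+)\)", out)
        if call is None:
            return out  # no tool call: generation is final
        # Evaluate toy arithmetic only; builtins are stripped for safety.
        result = eval(call.group(1), {"__builtins__": {}})
        text += out[:call.end()] + f" = {result}\n"
    return text
```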
Applied design thinking in urban air mobility: creating the airtaxi cabin design of the future from a user perspective
In the course of developing digital and future aviation cabin concepts at the German Aerospace Center, the exploration of user-centered and acceptance-enhancing methods plays a central role. The challenge here is to identify the flexible r…
Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Trustworthy answer content is abundant in many high-resource languages and is instantly accessible through question answering systems, yet this content can be hard to access for those who do not speak these languages. The leap forward in …
Benjamin Muller, John Wieting, Jonathan Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Factual consistency evaluation is often conducted using Natural Language Inference (NLI) models, yet these models exhibit limited success in evaluating summaries. Previous work improved such models with synthetic training data. However, th…
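A rough sketch of the distillation recipe the abstract gestures at: a large teacher model labels (document, model-generated summary) pairs for factual consistency, yielding synthetic NLI-style data for training a smaller student. The prompt and the `teacher_llm` callable below are placeholders, not the paper's exact setup.

```python
# Sketch of teacher-labeled synthetic data for factual consistency: a large
# teacher model judges whether each model-generated summary is supported by
# its source document, producing NLI-style training examples for a student.
# `teacher_llm` is a placeholder callable; the prompt is illustrative.
from typing import Callable, List, Tuple

def build_synthetic_nli_data(pairs: List[Tuple[str, str]],
                             teacher_llm: Callable[[str], str]):
    data = []
    for doc, summary in pairs:
        verdict = teacher_llm(
            "Is every fact in the summary supported by the document? "
            f"Answer yes or no.\nDocument: {doc}\nSummary: {summary}"
        )
        label = 1 if verdict.strip().lower().startswith("yes") else 0
        data.append({"premise": doc, "hypothesis": summary, "label": label})
    return data  # fine-tune an NLI student on these examples
```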
What You See is What You Read? Improving Text-Image Alignment Evaluation
Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks. In this work, we stud…
mFACE: Multilingual Summarization with Factual Consistency Evaluation
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsiste…
Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models
Large language models (LLMs) have shown impressive results while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM…
QAMPARI: An Open-domain Question Answering Benchmark for Questions with Many Answers from Multiple Paragraphs
Existing benchmarks for open-domain question answering (ODQA) typically focus on questions whose answers can be extracted from a single paragraph. By contrast, many natural questions, such as "What players were drafted by the Brooklyn Nets…
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Despite their strong performance on many tasks, pre-trained language models have been shown to struggle on out-of-distribution compositional generalization. Meanwhile, recent work has shown considerable improvements on many NLP tasks from …
TRUE: Re-evaluating Factual Consistency Evaluation
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cy…
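For context, a minimal NLI-based consistency check of the kind such meta-evaluations compare: treat the grounding text as premise and the generated text as hypothesis, and score entailment. The model choice below is an off-the-shelf example, not one of the specific systems evaluated in the paper.

```python
# Minimal NLI-based factual consistency scorer: premise = grounding document,
# hypothesis = generated text, score = entailment probability. Uses an
# off-the-shelf MNLI model as an example, not the paper's specific systems.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

def entailment_prob(premise: str, hypothesis: str) -> float:
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(-1)[0]
    # Look up the entailment index from the config rather than hard-coding it.
    idx = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]
    return probs[idx].item()
```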
Learning To Retrieve Prompts for In-Context Learning
In-context learning is a recent paradigm in natural language understanding, where a large pre-trained language model (LM) observes a test instance and a few training examples as its input, and directly decodes the output without any update…
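The abstract is truncated, but the core scoring idea behind learned prompt retrieval can be sketched: rank each candidate training example by how likely a scoring LM finds the test target when that example is prepended as a demonstration. `lm_loglikelihood` below is a placeholder for log p(target | prompt); in the full method, top- and bottom-scored candidates would supervise a dense retriever (not shown).

```python
# Sketch of candidate scoring for learned prompt retrieval: each training
# example (x, y) is ranked by how much it helps a scoring LM produce the
# target when prepended as a demonstration. `lm_loglikelihood` is a
# placeholder for log p_LM(target | prompt).
from typing import Callable, List, Tuple

def score_candidates(test_input: str, target: str,
                     candidates: List[Tuple[str, str]],
                     lm_loglikelihood: Callable[[str, str], float]):
    scored = []
    for cand_x, cand_y in candidates:
        prompt = f"{cand_x}\n{cand_y}\n{test_input}\n"
        scored.append((lm_loglikelihood(prompt, target), (cand_x, cand_y)))
    # Highest-scoring demonstrations come first.
    return sorted(scored, reverse=True)
```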