Explanipedia

One and a Half Year of ChatGPT: An Umbrella Review of Large Language Model (LLM) Perception Across Time and Research Fields Open

Leonardo Bergmann, Robin Beckenbach, Lisa Zach, Benjamin Roth, Ulrich S. Tran · 2025

Large language models (LLMs) like ChatGPT have emerged as transformative tools across various research fields. To illustrate the development of perceived benefits and concerns of LLMs across research fields, this umbrella review synthesize…

Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations Open

Hong Peng, Beiduo Chen, Siyao Peng, Marie-Catherine de Marneffe, Benjamin Roth, et al. · 2025

Natural Language Inference datasets often exhibit human label variation. To better understand these variations, explanation-based approaches analyze the underlying reasoning behind annotators' decisions. One such approach is the LiTEx taxo…

Exploring prompts to elicit memorization in masked language model-based named entity recognition Open

Yuxi Xia, Anastasiia Sedova, Pedro Henrique Luz de Araujo, Vasiliki Kougia, Lisa Nußbaumer, et al. · 2025

The possibility of identifying specific information about the training data a language model memorized poses a privacy risk. In this study, we analyze the ability of prompts to detect training data memorization in six masked language model…

The Impact of Graph Structure, Cluster Centroid and Text Review Embeddings on Recommendation Methods Open

Peter Dolog, Sergio David Rico Torres, Yllka Velaj, Ylli Sadikaj, Andreas Stephan, et al. · 2025

It is generally accepted that collaborative information is important for the performance of recommender systems. It is also generally accepted that if this information is sparser, it impacts recommendation systems negatively. Various appro…

Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior Open

Pedro Henrique Luz de Araujo, Benjamin Roth · 2025

One way to steer generations from large language models (LLM) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how personas affect…

Streamlining and Accelerating the Molecular Tumor Board Process at the University Medical Center Hamburg-Eppendorf Open

Layla Tabea Riemann, Maximilian Ataian, Felicia P. S. Hähner, Benjamin Roth, Alexander Knurr, et al. · 2025

As the first open-source and extendable solution for standardized MTB documentation, MONOCLE enables wider adoption by other medical centers.

Knowledge Connector: Decision support system for multiomics-based precision oncology Open

Daniel Hübschmann, Simon Kreutzfeldt, Benjamin Roth, Katrin Glocker, Janine Schoop, et al. · 2025

Precision cancer medicine aims to improve patient outcomes by providing individually tailored recommendations for clinical management based on the evaluation of biological disease profiles in multidisciplinary molecular tumor boards (MTBs)…

Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles Open

Ye Xia, Pedro Araujo, Klim Zaporojets, Benjamin Roth · 2025

Calibration, the alignment between model confidence and prediction accuracy, is critical for the reliable deployment of large language models (LLMs). Existing works neglect to measure the generalization of their methods to other prompt sty…

Influence-driven Curriculum Learning for Pre-training on Limited Data Open

Loris Schoenegger, Lukas Thoma, Terra Blevins, Benjamin Roth · 2025

Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance Open

Pedro Henrique Luz de Araujo, Paul Röttger, Dirk Hovy, Benjamin Roth · 2025

RecombiText: Compositional Data Augmentation for Enhancing LLM Pre-Training Datasets in Low-Resource Scenarios Open

Alexander Tampier, Lukas Thoma, Loris Schoenegger, Benjamin Roth · 2025

Specification overfitting in artificial intelligence Open

Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab · 2024

Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this t…

From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks Open

Andreas Stephan, Dawei Zhu, Matthias Aßenmacher, Xiaoyu Shen, Benjamin Roth · 2024

To reduce the need for human annotations, large language models (LLMs) have been proposed as judges of the quality of other candidate models. The performance of LLM judges is typically evaluated by measuring the correlation with human judg…

An Evaluation of Explanation Methods for Black-Box Detectors of Machine-Generated Text Open

Loris Schoenegger, Yuxi Xia, Benjamin Roth · 2024

The increasing difficulty to distinguish language-model-generated from human-written text has led to the development of detectors of machine-generated text (MGT). However, in many contexts, a black-box prediction is not sufficient, it is e…

To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity Open

Anastasiia Sedova, Robert Litschko, Diego Frassinelli, Benjamin Roth, Barbara Plank · 2024

One of the major aspects contributing to the striking performance of large language models (LLMs) is the vast amount of factual knowledge accumulated during pre-training. Yet, many LLMs suffer from self-inconsistency, which raises doubts a…

Black-box Model Ensembling for Textual and Visual Question Answering via Information Fusion Open

Yuxi Xia, Klim Zaporojets, Benjamin Roth · 2024

A diverse range of large language models (LLMs), e.g., ChatGPT, and visual question answering (VQA) models, e.g., BLIP, have been developed for solving textual and visual question answering tasks. However, fine-tuning these models is eithe…

Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior Open

Pedro Henrique Luz de Araujo, Benjamin Roth · 2024

One way to personalize and steer generations from large language models (LLM) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how…

Analysing zero-shot temporal relation extraction on clinical notes using temporal consistency Open

Vasiliki Kougia, Anastasiia Sedova, Andreas Stephan, Klim Zaporojets, Benjamin Roth · 2024

This paper presents the first study for temporal relation extraction in a zero-shot setting focusing on biomedical text. We employ two types of prompts and five LLMs (GPT-3.5, Mixtral, Llama 2, Gemma, and PMC-LLaMA) to obtain responses abo…

Text-Guided Alternative Image Clustering Open

Andreas Stephan, Lukas Miklautz, Collin Leiber, Pedro Henrique Luz de Araujo, Dominik Répás, et al. · 2024

Traditional image clustering techniques only find a single grouping within visual data. In particular, they do not provide a possibility to explicitly define multiple types of clustering. This work explores the potential of large vision-la…

The Impact of Cluster Centroid and Text Review Embeddings on Recommendation Methods Open

Peter Dolog, Ylli Sadikaj, Yllka Velaj, Andreas Stephan, Benjamin Roth, et al. · 2024

Recommendation systems often neglect global patterns that can be provided by clusters of similar items or even additional information such as text. Therefore, we study the impact of integrating clustering embeddings, review embeddings, and…

Exploring prompts to elicit memorization in masked language model-based named entity recognition Open

Yuxi Xia, Anastasiia Sedova, Pedro Henrique Luz de Araujo, Vasiliki Kougia, Lisa Nußbaumer, et al. · 2024

Training data memorization in language models impacts model capability (generalization) and safety (privacy risk). This paper focuses on analyzing prompts' impact on detecting the memorization of 6 masked language model-based named entity …

Specification Overfitting in Artificial Intelligence Open

Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab · 2024

Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this t…

Counterfactual Reasoning with Knowledge Graph Embeddings Open

Lena Zellinger, Andreas Stephan, Benjamin Roth · 2024

Knowledge graph embeddings (KGEs) were originally developed to infer true but missing facts in incomplete knowledge repositories. In this paper, we link knowledge graph completion and counterfactual reasoning via our new task CFKGR. We mod…

Text-Guided Image Clustering Open

Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Béla Gipp, et al. · 2024

Image clustering divides a collection of images into meaningful groups, typically interpreted post-hoc via human-given annotations. Those are usually in the form of text, begging the question of using text as an abstraction for image clust…

Linking Danish Parser Output to a Central Word Repository: From Morphosemantic Disambiguation to Unique Identifiers Open

Eckhard Bick, Munir Georges, Aaricia Herygers, Annemarie Friedrich, Benjamin Roth · 2024

Functionality learning through specification instructions Open

Pedro Henrique Luz de Araujo, Benjamin Roth · 2023

Test suites assess natural language processing models' performance on specific functionalities: cases of interest involving model robustness, fairness, or particular linguistic capabilities. This paper introduces specification instructions…
