Explanipedia

JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation Open

Huixuan Zhang, Xiaojun Wan · 2025

Current large language models (LLMs) often suffer from hallucination issues, i.e., generating content that appears factual but is actually unreliable. A typical hallucination detection pipeline involves response decomposition (i.e., claim e…

HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy Open

Fan Xu, Xinyu Hu, Zhengtao Yu, Lin Li, Yang Zhang, et al. · 2025

The increasing reliance on natural language generation (NLG) models, particularly large language models, has raised concerns about the reliability and accuracy of their outputs. A key challenge is hallucination, where models produce plausi…

LoaQ: Layer-wise Output Approximation Quantization Open

Lin Li, Xiaojun Wan · 2025

A natural and intuitive idea in model quantization is to approximate each component's quantized output to match its original. Layer-wise post-training quantization (PTQ), though based on this idea, adopts a strictly local view and can achi…

Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models Open

Zhenliang Zhang, Xinyu Hu, Huixuan Zhang, Xiaojun Wan · 2025

Large language models (LLMs) have achieved remarkable success in various tasks, yet they remain vulnerable to faithfulness hallucinations, where the output does not align with the input. In this study, we investigate whether social bias co…

Circadian Rhythm Genes-based Prognostic Signature for Bladder Cancer: Association of EZH2 Expression with Anesthetic-related Changes in Circulating Tumor Cells Open

Xiaojun Wan, Kunxiang Wang, Peng Ren, Xuezhou Zhang, Fa Sun · 2025

Introduction: Circadian rhythm genes (CRGs) play a significant role in the pathogenesis of various cancers, yet their impact on bladder cancer (BC) remains to be fully elucidated. EZH2, as a potential oncological biomarker, lacks clear del…

ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs Open

Zhenliang Zhang, Xinyu Hu, Huixuan Zhang, Junzhe Zhang, Xiaojun Wan · 2025

Large language models (LLMs) excel at various natural language processing tasks, but their tendency to generate hallucinations undermines their reliability. Existing hallucination detection methods leveraging hidden states predominantly fo…

Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks Open

Huanming Shen, Baizhou Huang, Xiaojun Wan · 2025

Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by watermark window size: sma…

Minos: A Multimodal Evaluation Model for Bidirectional Generation Between Image and Text Open

Junzhe Zhang, Huixuan Zhang, Xinyu Hu, Lin Li, Mingqi Gao, et al. · 2025

Evaluation is important for multimodal generation tasks. With the rapid progress of MLLMs, there is growing interest in applying MLLMs to build general evaluation systems. However, existing work overlooks two aspects: (1) the development o…

NeUQI: Near-Optimal Uniform Quantization Parameter Initialization Open

Lin Li, Xinyu Hu, Xiaojun Wan · 2025

Large language models (LLMs) achieve impressive performance across domains but face significant challenges when deployed on consumer-grade GPUs or personal devices such as laptops, due to high memory consumption and inference costs. Post-t…

AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection Open

Jiatao Li, M. H. Ye, Peng Cheng, Xunjian Yin, Xiaojun Wan · 2025

Existing AI-generated text detection methods heavily depend on large annotated datasets and external threshold tuning, restricting interpretability, adaptability, and zero-shot effectiveness. To address these limitations, we propose AGENT-…

ICIs-Associated Adverse Events in Patients with Advanced or Metastatic Renal cell Carcinoma: A Systematic Review and Meta-Analysis Open

Qiting Zuo, Jianqing Zhang, Qingqing Luo, Zhengyuan Wang, Ya Yuan, et al. · 2025

Analyzing Cognitive Differences Among Large Language Models through the Lens of Social Worldview Open

Jiatao Li, Yanheng Li, Xiaojun Wan · 2025

Large Language Models (LLMs) have become integral to daily life, widely adopted in communication, decision-making, and information retrieval, raising critical questions about how these systems implicitly form and express socio-cognitive at…

C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation Open

Xu Zhang, Zhifei Liu, Jiahao Wang, Huixuan Zhang, Fan Xu, et al. · 2025

Despite the rapid advancement of large language models, they remain highly susceptible to generating hallucinations, which significantly hinders their widespread application. Hallucination research requires dynamic and fine-grained evaluat…

DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models Open

Jichao Xie, Yilin Li, Xunjian Yin, Xiaojun Wan · 2025

Evaluating the performance of Grammatical Error Correction (GEC) models has become increasingly challenging, as large language model (LLM)-based GEC systems often produce corrections that diverge from provided gold references. This discrep…

Exploring the Multilingual NLG Evaluation Abilities of LLM-Based Evaluators Open

Jiayi Chang, Mingqi Gao, Xiaojun Wan · 2025

Previous research has shown that LLMs have potential in multilingual NLG evaluation tasks. However, existing research has not fully explored the differences in the evaluation capabilities of LLMs across different languages. To this end, th…

A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability Open

Xinyu Hu, Mingqi Gao, Lin Li, Zhengtao Yu, Xiaojun Wan · 2025

In NLG meta-evaluation, evaluation metrics are typically assessed based on their consistency with humans. However, we identify some limitations in traditional NLG meta-evaluation approaches, such as issues in handling human ratings and amb…

Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models Open

Jiatao Li, Xinyu Hu, Xunjian Yin, Xiaojun Wan · 2025

B4: A Black-Box Scrubbing Attack on LLM Watermarks Open

Baizhou Huang, Pu Xiao, Xiaojun Wan · 2025

DAMON: A Dialogue-Aware MCTS Framework for Jailbreaking Large Language Models Open

Xu Zhang, Xunjian Yin, Da Jing, Huixuan Zhang, Xinyu Hu, et al. · 2025

MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency Open

Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Baizhou Huang, Xu Zhang, et al. · 2025

WaterPool: A Language Model Watermark Mitigating Trade-Offs among Imperceptibility, Efficacy and Robustness Open

Baizhou Huang, Xiaojun Wan · 2025

Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language Models Open

Boyu Jia, Junzhe Zhang, Huixuan Zhang, Xiaojun Wan · 2025

Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation Open

Mingqi Gao, Xinyu Hu, Lin Li, Xiaojun Wan · 2025

Towards A “Novel” Benchmark: Evaluating Literary Fiction with Large Language Models Open

Wenqing Wang, Mingqi Gao, Xinyu Hu, Xiaojun Wan · 2025

Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models Open

Xiaofan Zheng, Huixuan Zhang, Xiaojun Wan · 2025

Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection Open

Jiatao Li, Xiaojun Wan · 2025

Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference Open

Mingqi Gao, Yixin Liu, Xinyu Hu, Xiaojun Wan, Jonathan Bragg, et al. · 2025

Gödel Agent: A Self-Referential Agent Framework for Recursively Self-Improvement Open

Yin Xiong, Xinyi Wang, Liangming Pan, Lin Li, Xiaojun Wan, et al. · 2025

R-Bind: Unified Enhancement of Attribute and Relation Binding in Text-to-Image Diffusion Models Open

Huixuan Zhang, Xiaojun Wan · 2025

TriEmbed: Bridge the Gap between Text and Token Indices with Embedding Reparameterization Open

Baizhou Huang, Xiaojun Wan · 2025
