Explanipedia

Reimagining Safety Alignment with An Image

Yifan Xia, Guorui Chen, Wenqian Yu, Zhijiang Li, Philip Torr, et al. · 2025

Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and over-refusal of benign queries due to rigid safety mechanisms. These issues are further complicated…

SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding

Tanveer Hannan, Sen Wu, Mark Weber, Suprosanna Shit, Jindong Gu, et al. · 2025

Understanding fine-grained actions and accurately localizing their corresponding actors in space and time are fundamental capabilities for advancing next-generation AI systems, including embodied agents, autonomous platforms, and human-AI …

TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models

S. Chang, Junchi Yu, Weixing Wang, Yongqiang Chen, Jialin Yu, et al. · 2025

Diffusion large language models (D-LLMs) have recently emerged as a promising alternative to auto-regressive LLMs (AR-LLMs). However, the hallucination problem in D-LLMs remains underexplored, limiting their reliability in real-world appli…

Can an Individual Manipulate the Collective Decisions of Multi-Agents?

Fengyuan Liu, Rui Zhao, Shuo Chen, Guohao Li, Philip H. S. Torr, et al. · 2025

Individual Large Language Models (LLMs) have demonstrated significant capabilities across various domains, such as healthcare and law. Recent studies also show that coordinated multi-agent systems exhibit enhanced decision-making and reaso…

Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention

Jeong-Hoon Park, Juyoung Lee, Chaeyeon Chung, Jae‐Seong Lee, Jaegul Choo, et al. · 2025

Recent advancements in diffusion-based text-to-image (T2I) models have enabled the generation of high-quality and photorealistic images from text. However, they often exhibit societal biases related to gender, race, and socioeconomic statu…

Image Tokens Matter: Mitigating Hallucination in Discrete Tokenizer-based Large Vision-Language Models via Latent Editing

Weixing Wang, Zifeng Ding, Jindong Gu, Rui Cao, Christoph Meinel, et al. · 2025

Large Vision-Language Models (LVLMs) with discrete image tokenizers unify multimodal representations by encoding visual inputs into a finite set of tokens. Despite their effectiveness, we find that these models still hallucinate non-existe…

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, et al. · 2025

The remarkable success of Large Language Models (LLMs) has illuminated a promising pathway toward achieving Artificial General Intelligence for both academic and industrial communities, owing to their unprecedented performance across vario…

REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites

Divyansh Garg, Shaun VanWeelden, Diego Caples, Nikil Ravi, Pranav Putta, et al. · 2025

We introduce REAL, a benchmark and framework for multi-turn agent evaluations on deterministic simulations of real-world websites. REAL comprises high-fidelity, deterministic replicas of 11 widely-used websites across domains such as e-com…

FedPop: Federated Population-based Hyperparameter Tuning

Haokun Chen, Denis Krompaß, Jindong Gu, Volker Tresp · 2025

Federated Learning (FL) is a distributed machine learning (ML) paradigm, in which multiple clients collaboratively train ML models without centralizing their local data. Similar to conventional ML pipelines, the client local optimization a…

On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows

Subenoy Chakraborty, Mohammadreza Pourreza, Rongce Sun, Yong Sang Song, Nino Scherrer, et al. · 2025

Agentic AI workflows (systems that autonomously plan and act) are becoming widespread, yet their task success rate on complex tasks remains low. A promising solution is inference-time alignment, which uses extra compute at test time to imp…

Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models

Hao Cheng, Erjia Xiao, Yichi Wang, Lingfeng Zhang, Qiang Zhang, et al. · 2025

Current Cross-Modality Generation Models (GMs) demonstrate remarkable capabilities in various generative tasks. Given the ubiquity and information richness of vision modality inputs in real-world scenarios, Cross-Vision tasks, encompassing…

Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation

Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, et al. · 2025

Large language models (LLMs) have exhibited the ability to effectively utilize external tools to address user queries. However, their performance may be limited in complex, multi-turn interactions involving users and multiple tools. To add…

Improving Adversarial Transferability in MLLMs via Dynamic Vision-Language Alignment Attack

Chen Gu, Jindong Gu, Andong Hua, Yao Qin · 2025

Multimodal Large Language Models (MLLMs), built upon LLMs, have recently gained attention for their capabilities in image recognition and understanding. However, while MLLMs are vulnerable to adversarial attacks, the transferability of the…

PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

Mihir Parmar, Xin Liu, Palash Goyal, Yanfei Chen, Long Bao Le, et al. · 2025

Recent agent frameworks and inference-time algorithms often struggle with complex planning problems due to limitations in verifying generated plans or reasoning and varying complexity of instances within a single task. Many existing method…

Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety

Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, et al. · 2025

The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to …

Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Erjia Xiao, Hao Cheng, Jing Shao, Jinhao Duan, Kaidi Xu, et al. · 2025

Large Language Models (LLMs) demonstrate impressive zero-shot performance across a wide range of natural language processing tasks. Integrating various modality encoders further expands their capabilities, giving rise to Multimodal Large L…

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Tong Liu, Xiaopeng Yu, Wenxuan Zhou, Jindong Gu, Volker Tresp · 2025

Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models

Kuofeng Gao, Shu‐Tao Xia, Ke Xu, Philip H. S. Torr, Jindong Gu · 2025

Text-Guided Camouflaged Object Detection

Z. Y. Chen, Y. Y. Xue, Zhijiang Li, Philip Torr, Jindong Gu · 2025

Multimodal Pragmatic Jailbreak on Text-to-image Models

Tong Liu, Zhixin Lai, Jiawen Wang, Gengyuan Zhang, Shuo Chen, et al. · 2025

Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models

Kuofeng Gao, Shu‐Tao Xia, Ke Xu, Philip H. S. Torr, Jindong Gu · 2024

Large Audio-Language Models (LALMs) have unlocked audio dialogue capabilities, where audio dialogues are a direct exchange of spoken language between LALMs and humans. Recent advances, such as GPT-4o, have enabled LALMs in back-and-forth …

AlignGuard: Scalable Safety Alignment for Text-to-Image Generation

Runtao Liu, Chen I Chieh, Jindong Gu, Jipeng Zhang, Renjie Pi, et al. · 2024

Text-to-image (T2I) models are widespread, but their limited safety guardrails expose end users to harmful content and potentially allow for model misuse. Current safety measures are typically limited to text-based filtering or concept rem…

Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models

Hao Cheng, Erjia Xiao, Jiayan Yang, Jiahang Cao, Qiang Zhang, et al. · 2024

Current image generation models can effortlessly produce high-quality, highly realistic images, but this also increases the risk of misuse. In various Text-to-Image or Image-to-Image tasks, attackers can generate a series of images contain…

Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models

Kuofeng Gao, Shu‐Tao Xia, Ke Xu, Philip Torr, Jindong Gu · 2024

Large Audio-Language Models (LALMs), such as GPT-4o, have recently unlocked audio dialogue capabilities, enabling direct spoken exchanges with humans. The potential of LALMs broadens their applicability across a wide range of practical sce…

ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos

Tanveer Hannan, Md. Mohaiminul Islam, Jindong Gu, Thomas Seidl, Gedas Bertasius · 2024

Large language models (LLMs) excel at retrieving information from lengthy text, but their vision-language counterparts (VLMs) face difficulties with hour-long videos, especially for temporal grounding. Specifically, these VLMs are constrai…

Jindong Gu