Shiqing Ma
Uncertainty Quantification for Multiple-Choice Questions is Just One-Token Deep
Photosensitive small intestinal submucosal hydrogels loaded with the KR-12-a5 peptide promote periodontal osteogenesis and antimicrobial activity
Rethinking Technology Stack Selection with AI Coding Proficiency
Large language models (LLMs) are now an integral part of software development workflows and are reshaping the whole process. Traditional technology stack selection has not caught up. Most of the existing selection methods focus solely on t…
DeCoMa: Detecting and Purifying Code Dataset Watermarks through Dual Channel Code Abstraction
Watermarking is a technique for identifying the source of data points, which can help prevent the misuse of protected datasets. Existing code watermarking methods, leveraging ideas from backdoor research, embed stealt…
The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries
Large Language Model (LLM) libraries have emerged as the foundational infrastructure powering today's AI revolution, serving as the backbone for LLM deployment, inference optimization, fine-tuning, and production serving across diverse app…
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
Text-to-image generation models (e.g., Stable Diffusion) have achieved significant advancements, enabling the creation of high-quality and realistic images based on textual descriptions. Prompt inversion, the task of identifying the textua…
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
Video diffusion models can generate realistic and temporally consistent videos. This raises concerns about provenance, ownership, and integrity. Watermarking can help address these issues by embedding metadata directly into the content. To…
Exposing Product Bias in LLM Investment Recommendation
Large language models (LLMs), as a new generation of recommendation engines, possess powerful summarization and data analysis capabilities, surpassing traditional recommendation systems in both scope and performance. One promising applicat…
Holistic Audit Dataset Generation for LLM Unlearning via Knowledge Graph Traversal and Redundancy Removal
In recent years, Large Language Models (LLMs) have faced increasing demands to selectively remove sensitive information, protect privacy, and comply with copyright regulations through Machine Unlearning. While evaluating unl…
The Invisible Hand: Unveiling Provider Bias in Large Language Models for Code Generation
Large Language Models (LLMs) have emerged as the new recommendation engines, surpassing traditional methods in both capability and scope, particularly in code generation. In this paper, we reveal a novel provider bias in LLMs: without expl…
MLLM-as-a-Judge for Image Safety without Human Labeling
Image content safety has become a significant challenge with the rise of visual media on online platforms. Meanwhile, in the age of AI-generated content (AIGC), many image generation models are capable of producing harmful content, such as…
Continuous Concepts Removal in Text-to-image Diffusion Models
Text-to-image diffusion models have shown an impressive ability to generate high-quality images from input textual descriptions. However, concerns have been raised about the potential for these models to create content that infringes on co…
DREAM: Debugging and Repairing AutoML Pipelines
Deep Learning models have become an integral component of modern software systems. In response to the challenge of model design, researchers proposed Automated Machine Learning (AutoML) systems, which automatically search for model archi…
Speculative Coreset Selection for Task-Specific Fine-tuning
Task-specific fine-tuning is essential for the deployment of large language models (LLMs), but it requires significant computational resources and time. Existing solutions have proposed coreset selection methods to improve data efficiency …
An Optimizable Suffix Is Worth A Thousand Templates: Efficient Black-box Jailbreaking without Affirmative Phrases via LLM as Optimizer
Despite prior safety alignment efforts, mainstream LLMs can still generate harmful and unethical content when subjected to jailbreaking attacks. Existing jailbreaking methods fall into two main categories: template-based and optimization-b…
UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Deep neural networks (DNNs) have demonstrated effectiveness in various fields. However, DNNs are vulnerable to backdoor attacks, which inject a unique pattern, called a trigger, into the input to cause misclassification to an attack-chosen t…
COSTELLO: Contrastive Testing for Embedding-Based Large Language Model as a Service Embeddings
Large language models have gained significant popularity and are often provided as a service (i.e., LLMaaS). Companies like OpenAI and Google provide online APIs of LLMs to allow downstream users to create innovative applications. Despite …
Efficient DNN-Powered Software with Fair Sparse Models
With the emergence of the Software 3.0 era, there is a growing trend of compressing and integrating large models into software systems, with significant societal implications. Regrettably, in numerous instances, model compression technique…
CITADEL: Context Similarity Based Deep Learning Framework Bug Finding
With the widespread application of deep learning technology, DL framework testing tools are in high demand. Existing DL framework testing tools have limited coverage of bug types. For example, they lack the capability of effectively finding perfor…
MeanSparse: Post-Training Robustness Enhancement Through Mean-Centered Feature Sparsification
We present a simple yet effective method to improve the robustness of both convolutional and attention-based neural networks against adversarial examples by post-processing an adversarially trained model. Our technique, MeanSparse, cascade…
From Effectiveness to Efficiency: Uncovering Linguistic Bias in Large Language Model-based Code Generation
Large Language Models (LLMs) have demonstrated promising capabilities for code generation. While existing benchmarks evaluate the correctness and efficiency of LLM-generated code, the potential linguistic bias - where code quality varies b…
Invisible Backdoor Attack against Self-supervised Learning
Self-supervised learning (SSL) models are vulnerable to backdoor attacks. Existing backdoor attacks that are effective in SSL often involve noticeable triggers, like colored patches or visible noise, which are vulnerable to human inspectio…
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Latent generative models (e.g., Stable Diffusion) have become increasingly popular, but concerns have arisen regarding potential misuse of images generated by these models. It is, therefore, necessary to analyze the origin of imag…
Merlin: Multi-tier Optimization of eBPF Code for Performance and Compactness
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Backdoor attacks pose a significant security threat to Deep Learning applications. Existing attacks are often not evasive to established backdoor detection techniques. This susceptibility primarily stems from the fact that these attacks ty…
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Diffusion models (DMs) have become state-of-the-art generative models because of their capability of generating high-quality images from noise without adversarial training. However, they are vulnerable to backdoor attacks as reported by re…
Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia
Large Language Models (LLMs) have become prevalent across diverse sectors, transforming human life with their extraordinary reasoning and comprehension abilities. As they find increased use in sensitive tasks, safety concerns have gained w…
Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering
Most existing methods to detect backdoored machine learning (ML) models take one of two approaches: trigger inversion (a.k.a. reverse engineering) and weight analysis (a.k.a. model diagnosis). In particular, the gradient-based trigger inversion …