Cho-Jui Hsieh
Adaptive Diagnostic Reasoning Framework for Pathology with Multimodal Large Language Models
AI tools in pathology have improved screening throughput, standardized quantification, and revealed prognostic patterns that inform treatment. However, adoption remains limited because most systems still lack the human-readable reasoning n…
Compressing Many-Shots in In-Context Learning
Large Language Models (LLMs) have been shown to learn different tasks without explicit finetuning when given many input-output examples (demonstrations) through In-Context Learning (ICL). Increasing the number of examples, calle…
LLM-guided Hierarchical Retrieval
Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great promise, the prevailing retrieve-then-r…
POME: Post Optimization Model Edit via Muon-style Projection
We introduce Post-Optimization Model Edit (POME), a new algorithm that enhances the performance of fine-tuned large language models using only their pretrained and fine-tuned checkpoints, without requiring extra data or further optimizatio…
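The snippet cuts off before the method is described, so the following is only a speculative sketch: one reading of a "Muon-style projection" is to take the fine-tuning delta ΔW = W_ft − W_pre, orthogonalize it with the Newton–Schulz iteration used by the Muon optimizer, and add the projected delta back onto the pretrained weights. The function names and the rescaling step below are hypothetical and need not match POME's actual algorithm.

```python
import numpy as np

def newton_schulz_orthogonalize(m, steps=5):
    """Approximate the orthogonal factor of m (msign) with the quintic
    Newton-Schulz iteration popularized by the Muon optimizer."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = m / (np.linalg.norm(m) + 1e-7)   # Frobenius-normalize so the iteration converges
    if m.shape[0] > m.shape[1]:          # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        gram = x @ x.T
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if m.shape[0] > m.shape[1] else x

def edit_weight(w_pre, w_ft):
    """Hypothetical post-optimization edit: project the fine-tuning delta,
    rescale it to the original delta's norm, and re-apply it."""
    delta = w_ft - w_pre
    projected = newton_schulz_orthogonalize(delta)
    projected *= np.linalg.norm(delta) / (np.linalg.norm(projected) + 1e-7)
    return w_pre + projected
```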
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation
Diffusion models have revolutionized image and video generation, achieving unprecedented visual quality. However, their reliance on transformer architectures incurs prohibitively high computational costs, particularly when extending genera…
Matryoshka Model Learning for Improved Elastic Student Models
Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student m…
Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs
Despite Multimodal Large Language Models (MLLMs) showing promising results on general zero-shot image classification tasks, fine-grained image classification remains challenging. It demands precise attention to subtle visual details to dis…
Provably Robust Training of Quantum Circuit Classifiers Against Parameter Noise
Advancements in quantum computing have spurred significant interest in harnessing its potential for speedups over classical systems. However, noise remains a major obstacle to achieving reliable quantum algorithms. In this work, we present…
View article: <scp>UniDEC</scp> : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification
<span>UniDEC</span> : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification Open
An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
Differentiable architecture search (DARTS) is a prevailing NAS solution to identify architectures. Based on the continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the sear…
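For context, the continuous relaxation mentioned here is the standard DARTS mixed operation: the categorical choice of operation on each edge (i, j) is replaced by a softmax-weighted mixture over the candidate set $\mathcal{O}$, parameterized by learnable architecture weights $\alpha$,

$$\bar{o}^{(i,j)}(x) = \sum_{o \in \mathcal{O}} \frac{\exp\big(\alpha^{(i,j)}_o\big)}{\sum_{o' \in \mathcal{O}} \exp\big(\alpha^{(i,j)}_{o'}\big)}\, o(x),$$

so search reduces to jointly optimizing $\alpha$ and the network weights by gradient descent; the perturbation-based regularization studied here targets the instability of that bilevel optimization.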
Neural Network Verification with Branch-and-Bound for General Nonlinearities
Branch-and-bound (BaB) is among the most effective techniques for neural network (NN) verification. However, existing works on BaB for NN verification have mostly focused on NNs with piecewise linear activations, especially ReLU networks. …
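To see why ReLU is the easy case for BaB: a ReLU neuron $z = \max(\hat z, 0)$ splits exactly into two linear branches, $\{\hat z \ge 0 : z = \hat z\}$ and $\{\hat z \le 0 : z = 0\}$, so finitely many splits make every subproblem piecewise linear. General nonlinearities such as sigmoid or GeLU admit no such finite exact split, which is the setting this work addresses.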
SoundnessBench: A Soundness Benchmark for Neural Network Verifiers
Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and b…
Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control
We study the problem of learning verifiably Lyapunov-stable neural controllers that provably satisfy the Lyapunov asymptotic stability condition within a region-of-attraction (ROA). Unlike previous works that adopted counterexample-guided …
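For reference, in its standard discrete-time form (the paper's exact statement may differ in detail), certifying Lyapunov asymptotic stability on a region-of-attraction $\mathcal{S}$ around an equilibrium $x^*$ means finding $V$ such that

$$V(x^*) = 0, \qquad V(x) > 0, \qquad V\big(f(x, \pi(x))\big) - V(x) < 0 \quad \text{for all } x \in \mathcal{S} \setminus \{x^*\},$$

where $f$ is the closed-loop dynamics under the learned controller $\pi$; the verifier must prove these inequalities over all of $\mathcal{S}$, not merely at sampled states.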
On the Loss of Context-awareness in General Instruction Fine-tuning
Pre-trained Large Language Models (LLMs) require post-training methods such as supervised fine-tuning (SFT) on instruction-response pairs to enable instruction following. However, this process can potentially harm existing capabilities lea…
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Low-rank adaptation (LoRA) is a widely used parameter-efficient finetuning method for LLMs that reduces memory requirements. However, current LoRA optimizers lack transformation invariance, meaning the actual updates to the weights depend on…
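To make the invariance issue concrete: LoRA parameterizes the weight update as $\Delta W = BA$ with $B \in \mathbb{R}^{m \times r}$ and $A \in \mathbb{R}^{r \times n}$. For any invertible $R \in \mathbb{R}^{r \times r}$, the factorization $(BR)(R^{-1}A) = BA$ represents exactly the same $\Delta W$, yet a standard optimizer step applied to $(BR, R^{-1}A)$ generally changes the product differently than the same step applied to $(B, A)$, so the effective weight update depends on an arbitrary choice of factorization. A transformation-invariant optimizer makes the update to the product $BA$ independent of $R$.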
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
Traditional Large Language Model (LLM) pretraining relies on autoregressive language modeling with randomly sampled data from web-scale datasets. Inspired by human learning techniques like spaced repetition, we hypothesize that random samp…
CLUE: Concept-Level Uncertainty Estimation for Large Language Models
Large Language Models (LLMs) have demonstrated remarkable proficiency in various natural language generation (NLG) tasks. Previous studies suggest that LLMs' generation process involves uncertainty. However, existing approaches to uncertai…
Gandalf: Learning Label-label Correlations in Extreme Multi-label Classification via Label Features
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign an input with a subset of most relevant labels from millions of label choices. Recent works in this domain have increasingly focused on a symmetri…
A refined reweighing technique for nondiscriminatory classification
Discrimination-aware classification methods remedy socioeconomic disparities exacerbated by machine learning systems. In this paper, we propose a novel data pre-processing technique that assigns weights to training instances in order to re…
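For reference, the classical reweighing baseline of Kamiran and Calders, which refined reweighing schemes typically build on, weights each instance by $P(s)\,P(y)/P(s, y)$ so that the sensitive attribute $s$ and label $y$ appear statistically independent in the weighted data. A minimal sketch of that baseline (the paper's refined assignment will differ):

```python
from collections import Counter

def reweigh(sensitive, labels):
    """Classical reweighing (Kamiran & Calders): weight each (s, y) group by
    P(s) * P(y) / P(s, y) so the weighted data shows no s-y correlation."""
    n = len(labels)
    count_s = Counter(sensitive)
    count_y = Counter(labels)
    count_sy = Counter(zip(sensitive, labels))
    return [
        (count_s[s] / n) * (count_y[y] / n) / (count_sy[(s, y)] / n)
        for s, y in zip(sensitive, labels)
    ]

# Example: group "a" is mostly labelled 0, group "b" mostly 1;
# the under-represented (group, label) pairs receive weights > 1.
print(reweigh(["a", "a", "a", "b", "b", "b"], [0, 0, 1, 1, 1, 0]))
```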
Embedding Space Selection for Detecting Memorization and Fingerprinting in Generative Models
In the rapidly evolving landscape of artificial intelligence, generative models such as Generative Adversarial Networks (GANs) and Diffusion Models have become cornerstone technologies, driving innovation in diverse fields from art creatio…
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
Large Language Models (LLMs) have demonstrated remarkable performance in solving math problems, a hallmark of human intelligence. However, despite high success rates on current benchmarks, these often feature simple problems with only one …
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have be…
On Discrete Prompt Optimization for Diffusion Models
This paper introduces the first gradient-based framework for prompt optimization in text-to-image diffusion models. We formulate prompt engineering as a discrete optimization problem over the language space. Two major challenges arise in e…
Large Language Models are Interpretable Learners
The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiv…
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?
Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli, albeit in very different contexts. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs)…
JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits
In this study, we investigate the vulnerability of image watermarks to diffusion-model-based image editing, a challenge exacerbated by the computational cost of accessing gradient information and the closed-source nature of many diffusion …