Cho-Jui Hsieh
Adaptive Diagnostic Reasoning Framework for Pathology with Multimodal Large Language Models
AI tools in pathology have improved screening throughput, standardized quantification, and revealed prognostic patterns that inform treatment. However, adoption remains limited because most systems still lack the human-readable reasoning n…
Compressing Many-Shots in In-Context Learning
Large Language Models (LLMs) have been shown to learn different tasks without explicit finetuning when given many input-output examples (demonstrations) through In-Context Learning (ICL). Increasing the number of examples, calle…
LLM-guided Hierarchical Retrieval
Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great promise, the prevailing retrieve-then-r…
POME: Post Optimization Model Edit via Muon-style Projection
We introduce Post-Optimization Model Edit (POME), a new algorithm that enhances the performance of fine-tuned large language models using only their pretrained and fine-tuned checkpoints, without requiring extra data or further optimizatio…
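The snippet cuts off before the method is described, so the following is only a speculative sketch: one reading of a "Muon-style projection" is to take the fine-tuning delta ΔW = W_ft − W_pre, orthogonalize it with the Newton–Schulz iteration used by the Muon optimizer, and add the projected delta back onto the pretrained weights. The function names and the rescaling step below are hypothetical and need not match POME's actual algorithm.

```python
import numpy as np

def newton_schulz_orthogonalize(m, steps=5):
    """Approximate the orthogonal factor of m (msign) with the quintic
    Newton-Schulz iteration popularized by the Muon optimizer."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = m / (np.linalg.norm(m) + 1e-7)   # Frobenius-normalize so the iteration converges
    if m.shape[0] > m.shape[1]:          # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        gram = x @ x.T
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if m.shape[0] > m.shape[1] else x

def edit_weight(w_pre, w_ft):
    """Hypothetical post-optimization edit: project the fine-tuning delta,
    rescale it to the original delta's norm, and re-apply it."""
    delta = w_ft - w_pre
    projected = newton_schulz_orthogonalize(delta)
    projected *= np.linalg.norm(delta) / (np.linalg.norm(projected) + 1e-7)
    return w_pre + projected
```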
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation
Diffusion models have revolutionized image and video generation, achieving unprecedented visual quality. However, their reliance on transformer architectures incurs prohibitively high computational costs, particularly when extending genera…
Matryoshka Model Learning for Improved Elastic Student Models
Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student m…
Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs
Despite Multimodal Large Language Models (MLLMs) showing promising results on general zero-shot image classification tasks, fine-grained image classification remains challenging. It demands precise attention to subtle visual details to dis…
Provably Robust Training of Quantum Circuit Classifiers Against Parameter Noise
Advancements in quantum computing have spurred significant interest in harnessing its potential for speedups over classical systems. However, noise remains a major obstacle to achieving reliable quantum algorithms. In this work, we present…
View article: <scp>UniDEC</scp> : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification
<span>UniDEC</span> : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification Open
An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
Differentiable architecture search (DARTS) is a prevailing NAS solution to identify architectures. Based on the continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the sear…
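For context, the continuous relaxation mentioned here is the standard DARTS mixed operation: the categorical choice of operation on each edge (i, j) is replaced by a softmax-weighted mixture over the candidate set $\mathcal{O}$, parameterized by learnable architecture weights $\alpha$,

$$\bar{o}^{(i,j)}(x) = \sum_{o \in \mathcal{O}} \frac{\exp\big(\alpha^{(i,j)}_o\big)}{\sum_{o' \in \mathcal{O}} \exp\big(\alpha^{(i,j)}_{o'}\big)}\, o(x),$$

so search reduces to jointly optimizing $\alpha$ and the network weights by gradient descent; the perturbation-based regularization studied here targets the instability of that bilevel optimization.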
Neural Network Verification with Branch-and-Bound for General Nonlinearities
Branch-and-bound (BaB) is among the most effective techniques for neural network (NN) verification. However, existing works on BaB for NN verification have mostly focused on NNs with piecewise linear activations, especially ReLU networks. …
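To see why ReLU is the easy case for BaB: a ReLU neuron $z = \max(\hat z, 0)$ splits exactly into two linear branches, $\{\hat z \ge 0 : z = \hat z\}$ and $\{\hat z \le 0 : z = 0\}$, so finitely many splits make every subproblem piecewise linear. General nonlinearities such as sigmoid or GeLU admit no such finite exact split, which is the setting this work addresses.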
SoundnessBench: A Soundness Benchmark for Neural Network Verifiers
Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and b…
Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control
We study the problem of learning verifiably Lyapunov-stable neural controllers that provably satisfy the Lyapunov asymptotic stability condition within a region-of-attraction (ROA). Unlike previous works that adopted counterexample-guided …
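For reference, in its standard discrete-time form (the paper's exact statement may differ in detail), certifying Lyapunov asymptotic stability on a region-of-attraction $\mathcal{S}$ around an equilibrium $x^*$ means finding $V$ such that

$$V(x^*) = 0, \qquad V(x) > 0, \qquad V\big(f(x, \pi(x))\big) - V(x) < 0 \quad \text{for all } x \in \mathcal{S} \setminus \{x^*\},$$

where $f$ is the closed-loop dynamics under the learned controller $\pi$; the verifier must prove these inequalities over all of $\mathcal{S}$, not merely at sampled states.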
On the Loss of Context-awareness in General Instruction Fine-tuning
Pre-trained Large Language Models (LLMs) require post-training methods such as supervised fine-tuning (SFT) on instruction-response pairs to enable instruction following. However, this process can potentially harm existing capabilities lea…
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Low-rank adaptation (LoRA) is a widely used parameter-efficient finetuning method for LLMs that reduces memory requirements. However, current LoRA optimizers lack transformation invariance, meaning the actual updates to the weights depend on…
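To make the invariance issue concrete: LoRA parameterizes the weight update as $\Delta W = BA$ with $B \in \mathbb{R}^{m \times r}$ and $A \in \mathbb{R}^{r \times n}$. For any invertible $R \in \mathbb{R}^{r \times r}$, the factorization $(BR)(R^{-1}A) = BA$ represents exactly the same $\Delta W$, yet a standard optimizer step applied to $(BR, R^{-1}A)$ generally changes the product differently than the same step applied to $(B, A)$, so the effective weight update depends on an arbitrary choice of factorization. A transformation-invariant optimizer makes the update to the product $BA$ independent of $R$.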
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
Traditional Large Language Model (LLM) pretraining relies on autoregressive language modeling with randomly sampled data from web-scale datasets. Inspired by human learning techniques like spaced repetition, we hypothesize that random samp…
CLUE: Concept-Level Uncertainty Estimation for Large Language Models
Large Language Models (LLMs) have demonstrated remarkable proficiency in various natural language generation (NLG) tasks. Previous studies suggest that LLMs' generation process involves uncertainty. However, existing approaches to uncertai…
Gandalf: Learning Label-label Correlations in Extreme Multi-label Classification via Label Features
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign an input with a subset of most relevant labels from millions of label choices. Recent works in this domain have increasingly focused on a symmetri…
A refined reweighing technique for nondiscriminatory classification
Discrimination-aware classification methods remedy socioeconomic disparities exacerbated by machine learning systems. In this paper, we propose a novel data pre-processing technique that assigns weights to training instances in order to re…
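For reference, the classical reweighing baseline of Kamiran and Calders, which refined reweighing schemes typically build on, weights each instance by $P(s)\,P(y)/P(s, y)$ so that the sensitive attribute $s$ and label $y$ appear statistically independent in the weighted data. A minimal sketch of that baseline (the paper's refined assignment will differ):

```python
from collections import Counter

def reweigh(sensitive, labels):
    """Classical reweighing (Kamiran & Calders): weight each (s, y) group by
    P(s) * P(y) / P(s, y) so the weighted data shows no s-y correlation."""
    n = len(labels)
    count_s = Counter(sensitive)
    count_y = Counter(labels)
    count_sy = Counter(zip(sensitive, labels))
    return [
        (count_s[s] / n) * (count_y[y] / n) / (count_sy[(s, y)] / n)
        for s, y in zip(sensitive, labels)
    ]

# Example: group "a" is mostly labelled 0, group "b" mostly 1;
# the under-represented (group, label) pairs receive weights > 1.
print(reweigh(["a", "a", "a", "b", "b", "b"], [0, 0, 1, 1, 1, 0]))
```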
Embedding Space Selection for Detecting Memorization and Fingerprinting in Generative Models
In the rapidly evolving landscape of artificial intelligence, generative models such as Generative Adversarial Networks (GANs) and Diffusion Models have become cornerstone technologies, driving innovation in diverse fields from art creatio…
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
Large Language Models (LLMs) have demonstrated remarkable performance in solving math problems, a hallmark of human intelligence. However, despite high success rates on current benchmarks, these often feature simple problems with only one …
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have be…
On Discrete Prompt Optimization for Diffusion Models
This paper introduces the first gradient-based framework for prompt optimization in text-to-image diffusion models. We formulate prompt engineering as a discrete optimization problem over the language space. Two major challenges arise in e…
Large Language Models are Interpretable Learners
The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiv…
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?
Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli, albeit in very different contexts. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs)…
JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits
In this study, we investigate the vulnerability of image watermarks to diffusion-model-based image editing, a challenge exacerbated by the computational cost of accessing gradient information and the closed-source nature of many diffusion …