Vincent Y. F. Tan
Parameter-free Algorithms for the Stochastically Extended Adversarial Model Open
We develop the first parameter-free algorithms for the Stochastically Extended Adversarial (SEA) model, a framework that bridges adversarial and stochastic online convex optimization. Existing approaches for the SEA model require prior kno…
Muon Outperforms Adam in Tail-End Associative Memory Learning Open
The Muon optimizer is consistently faster than Adam in training Large Language Models (LLMs), yet the mechanism underlying its success remains unclear. This paper demystifies this mechanism through the lens of associative memory. By ablati…
Memory Limitations of Prompt Tuning in Transformers Open
Despite the empirical success of prompt tuning in adapting pretrained language models to new tasks, theoretical analyses of its capabilities remain limited. Existing theoretical work primarily addresses universal approximation properties, …
Algorithm unrolling for solving inverse problems in signal and image processing Open
Automatic Rank Determination for Low-Rank Adaptation via Submodular Function Maximization Open
In this paper, we propose SubLoRA, a rank determination method for Low-Rank Adaptation (LoRA) based on submodular function maximization. In contrast to prior approaches, such as AdaLoRA, that rely on first-order (linearized) approximations…
Immune Checkpoint Inhibitors for Metastatic Colorectal Cancer: A Systematic Review Open
Background: Colorectal cancer is a significant health concern. Immunotherapy has become a promising approach in colorectal cancer, offering a wider array of therapeutic strategies. This study aims to summarize the current evidence regardin…
Finite-Time Minimax Bounds and an Optimal Lyapunov Policy in Queueing Control Open
We introduce an original minimax framework for finite-time performance analysis in queueing control and propose a surprisingly simple Lyapunov-based scheduling policy with superior finite-time performance. The framework quantitatively char…
Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning Open
Off-policy learning and evaluation leverage logged bandit feedback datasets, which contain context, action, propensity score, and feedback for each data point. These scenarios face significant challenges due to high variance and poor perfo…
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget Open
The challenge of identifying the best feasible arm within a fixed budget has attracted considerable interest in recent years. However, a notable gap remains in the literature: the exact exponential rate at which the error probability appro…
Best Arm Identification with Possibly Biased Offline Data Open
We study the best arm identification (BAI) problem with potentially biased offline data in the fixed confidence setting, which commonly arises in real-world scenarios such as clinical trials. We prove an impossibility result for adaptive a…
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms Open
Speculative decoding has emerged as a popular method to accelerate the inference of Large Language Models (LLMs) while retaining their superior text generation performance. Previous methods either adopt a fixed speculative decoding configu…
p-Mean Regret for Stochastic Bandits Open
In this work, we extend the concept of the p-mean welfare objective from social choice theory to study p-mean regret in stochastic multi-armed bandit problems. The p-mean regret, defined as the difference between the optimal mean among the…
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework Open
Auto-Regressive Video Diffusion Models (AR-VDMs) have shown strong capabilities in generating long, photorealistic videos, but suffer from two key limitations: (i) history forgetting, where the model loses track of previously generated con…
Low Tensor-Rank Adaptation of Kolmogorov–Arnold Networks Open
Kolmogorov–Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains, especially for science-related tasks. However, transfer learning of KANs remains a relatively unex…
Ensemble-Tight Second-Order Asymptotics and Exponents for Guessing-Based Decoding with Abandonment Open
This paper considers guessing-based decoders with abandonment for discrete memoryless channels in which all codewords have the same composition. This class of decoders rank-orders all input sequences in the codebook's composition class fro…
Optimal Multi-Objective Best Arm Identification with Fixed Confidence Open
We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. objective) is generated independently of …
A General Framework for Clustering and Distribution Matching With Bandit Feedback Open
We develop a general framework for clustering and distribution matching problems with bandit feedback. We consider a $K$-armed bandit model where some subset of $K$ arms is partitioned into $M$ groups. Within each group, the random variabl…
Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory Open
Despite the considerable progress achieved in the long video generation problem, there is still significant room to improve the consistency of the videos, particularly in terms of smoothness and transitions between scenes. We address these…
p-Mean Regret for Stochastic Bandits Open
In this work, we extend the concept of the $p$-mean welfare objective from social choice theory (Moulin 2004) to study $p$-mean regret in stochastic multi-armed bandit problems. The $p$-mean regret, defined as the difference between the op…
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning Open
Semi-supervised learning (SSL), exemplified by FixMatch (Sohn et al., 2020), has shown significant generalization advantages over supervised learning (SL), particularly in the context of deep neural networks (DNNs). However, it is still un…
On the Convergence of (Stochastic) Gradient Descent for Kolmogorov–Arnold Networks Open
Kolmogorov–Arnold Networks (KANs), a recently proposed neural network architecture, have gained significant attention in the deep learning community, due to their potential as a viable alternative to multi-layer perceptrons (MLPs) and the…
Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits Open
We propose a novel piecewise stationary linear bandit (PSLB) model, where the environment randomly samples a context from an unknown probability distribution at each changepoint, and the quality of an arm is measured by its return av…
Stochastic Bandits for Egalitarian Assignment Open
We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each us…
Best Arm Identification with Minimal Regret Open
Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This innovative variant of the multi-armed bandit problem elegantly amalgamat…
A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval Open
In this work, we study the robust phase retrieval problem where the task is to recover an unknown signal $θ^* \in \mathbb{R}^d$ in the presence of potentially arbitrarily corrupted magnitude-only linear measurements. We propose an alternat…
LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization Open
We study a robust online convex optimization framework, where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of rounds k, unknown to the learner. Our focus is on a novel setting allowing unbounded d…
A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent Open
Distributed gradient descent algorithms have come to the fore in modern machine learning, especially in parallelizing the handling of large datasets that are distributed across several workers. However, scant attention has been paid to ana…
Influence Maximization via Graph Neural Bandits Open
We consider a ubiquitous scenario in the study of Influence Maximization (IM), in which there is limited knowledge about the topology of the diffusion network. We set the IM problem in a multi-round diffusion campaign, aiming to maximize t…
Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback Open
We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for eac…
MIMO Capacity Analysis and Channel Estimation for Electromagnetic Information Theory Open
Electromagnetic information theory (EIT) is an interdisciplinary subject that serves to integrate deterministic electromagnetic theory with stochastic Shannon's information theory. Existing EIT analysis operates in the continuous space dom…