Ehsan Amid
Restructuring Vector Quantization with the Rotation Trick
Vector Quantized Variational AutoEncoders (VQ-VAEs) are designed to compress a continuous input to a discrete latent space and reconstruct it with minimal distortion. They operate by maintaining a set of vectors -- often referred to as the…
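As quick context for the setup (not the rotation trick itself, which is the paper's contribution), the sketch below shows the standard VQ-VAE codebook lookup with a straight-through gradient; tensor shapes and names are illustrative assumptions.

```python
# Minimal sketch of standard VQ-VAE codebook quantization with a
# straight-through estimator. Illustrative only; not the rotation trick
# proposed in the paper.
import torch

def quantize(z, codebook):
    """z: (batch, d) encoder outputs; codebook: (K, d) code vectors."""
    d2 = torch.cdist(z, codebook) ** 2      # squared distances, (batch, K)
    idx = d2.argmin(dim=1)                  # nearest code per latent
    z_q = codebook[idx]                     # quantized latents, (batch, d)
    # Straight-through: forward pass uses z_q, backward pass copies
    # gradients from z_q to z unchanged.
    return z + (z_q - z).detach(), idx

z = torch.randn(8, 4, requires_grad=True)   # 8 latents of dimension 4
codebook = torch.randn(16, 4)               # codebook with 16 entries
z_q, idx = quantize(z, codebook)
```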
Rank-Smoothed Pairwise Learning in Perceptual Quality Assessment
Conducting pairwise comparisons is a widely used approach in curating human perceptual preference data. Typically, raters are instructed to make their choices according to a specific set of rules that address certain dimensions of image qua…
Optimal Transport with Tempered Exponential Measures
In the field of optimal transport, two prominent subfields face each other: (i) unregularized optimal transport, "à-la-Kantorovich", which leads to extremely sparse plans but with algorithms that scale poorly, and (ii) entropic-regularized…
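For orientation only: the entropic-regularized side referred to here is commonly solved with Sinkhorn's matrix-scaling iterations. The sketch below is that standard algorithm on illustrative inputs, not the tempered-exponential-measure formulation developed in the paper.

```python
# Standard Sinkhorn iterations for entropic-regularized optimal transport.
# Context only; not the tempered exponential measure approach of the paper.
import numpy as np

def sinkhorn(a, b, C, eps=0.1, iters=500):
    """a, b: marginals summing to 1; C: cost matrix; eps: entropic regularizer."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                # match the target marginal
        u = a / (K @ v)                  # match the source marginal
    return u[:, None] * K * v[None, :]   # dense (non-sparse) transport plan

a = np.full(4, 0.25)
b = np.full(5, 0.20)
C = np.abs(np.linspace(0, 1, 4)[:, None] - np.linspace(0, 1, 5)[None, :])
P = sinkhorn(a, b, C)
print(P.sum(axis=1), P.sum(axis=0))      # approximately recovers a and b
```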
Learning from straggler clients in federated learning
How well do existing federated learning algorithms learn from client devices that return model updates with a significant time delay? Is it even possible to learn effectively from clients that report back minutes, hours, or days after bein…
Noise misleads rotation invariant algorithms on sparse targets
It is well known that the class of rotation invariant algorithms is suboptimal even for learning sparse linear problems when the number of examples is below the "dimension" of the problem. This class includes any gradient descent trained …
Tempered Calculus for ML: Application to Hyperbolic Model Embedding
Most mathematical distortions used in ML are fundamentally integral in nature: $f$-divergences, Bregman divergences, (regularized) optimal transport distances, integral probability metrics, geodesic distances, etc. In this paper, we unveil…
The Tempered Hilbert Simplex Distance and Its Application To Non-linear Embeddings of TEMs
Tempered Exponential Measures (TEMs) are a parametric generalization of the exponential family of distributions maximizing the tempered entropy function among positive measures subject to a probability normalization of their power densitie…
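For reference, the tempered logarithm and exponential that underlie TEMs (standard definitions in this line of work, not specific to this abstract) are
$$\log_t(x) = \frac{x^{1-t} - 1}{1-t}, \qquad \exp_t(x) = \big[1 + (1-t)\,x\big]_+^{1/(1-t)},$$
both of which recover the usual $\log$ and $\exp$ as $t \to 1$.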
Context-Aware Meta-Learning
Large Language Models like ChatGPT demonstrate a remarkable capacity to learn new concepts during inference without any fine-tuning. However, visual models trained to detect new objects during inference have been unable to replicate this a…
Heterogeneous Federated Learning Using Knowledge Codistillation
Federated Averaging, and many federated learning algorithm variants which build upon it, have a limitation: all clients must share the same model architecture. This results in unused modeling capacity on many clients, which limits model pe…
Distributionally Robust Post-hoc Classifiers under Prior Shifts
The generalization ability of machine learning models degrades significantly when the test distribution shifts away from the training distribution. We investigate the problem of training models that are robust to shifts caused by changes i…
To Aggregate or Not? Learning with Separate Noisy Labels
Raw training data often comes with separate noisy labels collected from multiple imperfect annotators (e.g., via crowdsourcing). A typical way of using these separate labels is to first aggregate them into one and apply sta…
Benchmarking Neural Network Training Algorithms
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning…
Boosting with Tempered Exponential Measures
One of the most popular ML algorithms, AdaBoost, can be derived from the dual of a relative entropy minimization problem subject to the constraint that the positive weights on the examples sum to one. Essentially, harder examples receive higher …
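For comparison, the classical AdaBoost weight update that this derivation generalizes multiplies each example weight by an exponential factor and renormalizes over the probability simplex:
$$w_i^{(t+1)} = \frac{w_i^{(t)} \exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)}{Z_t}, \qquad \alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t},$$
where $h_t$ is the weak learner at round $t$, $\epsilon_t$ its weighted error, and $Z_t$ the normalizer enforcing $\sum_i w_i^{(t+1)} = 1$.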
Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction
Few-shot learning is a promising approach to molecular property prediction as supervised data is often very limited. However, many important molecular properties depend on complex molecular characteristics -- such as the various 3D geometr…
Clustering above Exponential Families with Tempered Exponential Measures
The link with exponential families has allowed $k$-means clustering to be generalized to a wide variety of data generating distributions in exponential families and clustering distortions among Bregman divergences. Getting the framework to…
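As a concrete instance of the framework described here (standard Bregman clustering, not the tempered extension the paper develops), the sketch below runs $k$-means with the generalized KL divergence on positive data; a key property is that the optimal cluster representative remains the plain mean for any Bregman divergence.

```python
# Bregman k-means sketch using the generalized KL (I-)divergence on positive
# data. Illustrative only; not the tempered-measure clustering of the paper.
import numpy as np

def i_divergence(x, c):
    """Generalized KL divergence between positive vectors x and center c."""
    return np.sum(x * np.log(x / c) - x + c, axis=-1)

def bregman_kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to the center with the smallest divergence.
        D = np.stack([i_divergence(X, c) for c in centers], axis=1)
        labels = D.argmin(axis=1)
        # For any Bregman divergence the optimal center is the plain mean;
        # keep the old center if a cluster happens to be empty.
        centers = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers

X = np.random.default_rng(1).uniform(0.5, 5.0, size=(200, 3))
labels, centers = bregman_kmeans(X, k=3)
```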
Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
In this work, we propose a novel approach for layerwise representation learning of a trained neural network. In particular, we form a Bregman divergence based on the layer's transfer function and construct an extension of the original Breg…
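For reference, the Bregman divergence generated by a strictly convex function $F$ (the general definition, independent of the specific layerwise construction in the paper) is
$$D_F(x, y) = F(x) - F(y) - \langle \nabla F(y),\, x - y \rangle,$$
which reduces to the squared Euclidean distance for $F(x) = \tfrac{1}{2}\|x\|^2$ and to the KL divergence when $F$ is the negative Shannon entropy.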
Extracting Targeted Training Data from ASR Models, and How to Mitigate It
Recent work has designed methods to demonstrate that model updates in ASR training can leak potentially sensitive attributes of the utterances used in computing the updates. In this work, we design the first method to demonstrate informati…
Learning from Randomly Initialized Neural Network Features
We present the surprising result that randomly initialized neural networks are good feature extractors in expectation. These random features correspond to finite-sample realizations of what we call Neural Network Prior Kernel (NNPK), which…
Step-size Adaptation Using Exponentiated Gradient Updates
Optimizers like Adam and AdaGrad have been very successful in training large-scale neural networks. Yet, the performance of these methods is heavily dependent on a carefully tuned learning rate schedule. We show that in many large-scale ap…
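One generic way to realize a multiplicative step-size update of this kind is to scale the learning rate by an exponentiated-gradient step driven by the alignment of consecutive gradients (a hypergradient-style signal). The sketch below shows that generic scheme on a toy objective; it is a hedged illustration, not necessarily the exact update proposed in the paper.

```python
# Generic multiplicative (exponentiated-gradient-style) step-size adaptation
# on top of SGD. Hedged illustration, not necessarily the paper's rule.
import numpy as np

def sgd_with_eg_stepsize(grad_fn, w, lr=0.1, beta=0.02, steps=100):
    prev_g = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        # Grow the step size when successive gradients agree, shrink it
        # when they point in opposite directions.
        align = g @ prev_g / (np.linalg.norm(g) * np.linalg.norm(prev_g) + 1e-12)
        lr *= np.exp(beta * align)
        w = w - lr * g
        prev_g = g
    return w, lr

# Toy example: f(w) = 0.5 * ||w||^2, so grad f(w) = w.
w, lr = sgd_with_eg_stepsize(lambda w: w, w=np.ones(5))
```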
Public Data-Assisted Mirror Descent for Private Model Training
In this paper, we revisit the problem of using in-distribution public data to improve the privacy/utility trade-offs for differentially private (DP) model training. (Here, public data refers to auxiliary data sets that have no privacy conc…
Constrained Instance and Class Reweighting for Robust Learning under Label Noise
Deep neural networks have shown impressive performance in supervised learning, enabled by their ability to fit well to the provided training data. However, their performance is largely dependent on the quality of the training data and ofte…
Privacy-Preserving Wireless Federated Learning Exploiting Inherent Hardware Impairments
We consider a wireless federated learning system where multiple data holder edge devices collaborate to train a global model via sharing their parameter updates with an honest-but-curious parameter server. We demonstrate that the inherent …
Efficiently Identifying Task Groupings for Multi-Task Learning
Multi-task learning can leverage information learned by one task to benefit the training of other tasks. Despite this capacity, naively training all tasks together in one model often degrades performance, and exhaustively searching through…
LocoProp: Enhancing BackProp via Local Loss Optimization
Second-order methods have shown state-of-the-art performance for optimizing deep neural networks. Nonetheless, their large memory requirement and high computational complexity, compared to first-order methods, hinder their versatility in a…
Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond
Many learning tasks in machine learning can be viewed as taking a gradient step towards minimizing the average loss of a batch of examples in each training iteration. When noise is prevalent in the data, this uniform treatment of examples …
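The sketch below illustrates the basic mechanism suggested by the title: keep a probability vector of example weights and update it multiplicatively from the per-example losses, so persistently high-loss (likely mislabeled) examples are down-weighted. Names and the choice of update signal are illustrative, not the paper's exact algorithm.

```python
# Multiplicative (exponentiated-gradient-style) reweighting of examples in
# a batch. Illustrative sketch only.
import numpy as np

def eg_reweight(weights, losses, eta=0.5):
    """One EG step: scale by exp(-eta * loss), then renormalize to the simplex."""
    w = weights * np.exp(-eta * losses)
    return w / w.sum()

weights = np.full(5, 0.2)                         # uniform initial weights
losses = np.array([0.3, 0.2, 0.4, 0.25, 3.0])     # last example looks noisy
for _ in range(3):
    weights = eg_reweight(weights, losses)
print(weights)            # the high-loss example's weight shrinks toward zero
# A robust batch loss would then be the weighted average: weights @ losses.
```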
Measuring and Harnessing Transference in Multi-Task Learning
Multi-task learning can leverage information learned by one task to benefit the training of other tasks. Despite this capacity, naive formulations often degrade performance; in particular, identifying the tasks that would benefit from c…