Andreas Veit
LatentCRF: Continuous CRF for Efficient Latent Diffusion
Latent Diffusion Models (LDMs) produce high-quality, photo-realistic images; however, the latency incurred by multiple costly inference iterations can restrict their applicability. We introduce LatentCRF, a continuous Conditional Random Fi…
Efficient Document Ranking with Learnable Late Interactions
Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized …
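As a rough illustration of the two scoring patterns contrasted above (not the paper's learnable late-interaction model), the following sketch assumes hypothetical `encode` and `joint_encode` helpers standing in for real transformer encoders:

```python
import numpy as np

def dual_encoder_score(query_text, doc_text, encode):
    """DE: query and document are embedded independently; relevance is a single
    dot product, so document embeddings can be precomputed and indexed."""
    q = encode(query_text)   # vector of shape (d,)
    d = encode(doc_text)     # vector of shape (d,)
    return float(np.dot(q, d))

def cross_encoder_score(query_text, doc_text, joint_encode):
    """CE: the query-document pair is encoded jointly (more accurate, but it must
    be recomputed for every candidate document)."""
    return float(joint_encode(query_text, doc_text))
```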
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
As with many machine learning problems, the progress of image generation methods hinges on good evaluation metrics. One of the most popular is the Frechet Inception Distance (FID). FID estimates the distance between a distribution of Incep…
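For reference, the standard FID formula the abstract alludes to: with (μ_r, Σ_r) and (μ_g, Σ_g) the mean and covariance of Inception embeddings of real and generated images, the Fréchet distance between the two fitted Gaussians is

```latex
\mathrm{FID} \;=\; \lVert \mu_r - \mu_g \rVert_2^2
  \;+\; \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right)
```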
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Modern text-to-image generation models produce high-quality images that are both photorealistic and faithful to the text prompts. However, this quality comes at significant computational cost: nearly all of these models are iterative and r…
Large Language Models with Controllable Working Memory
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), partly owing to the massive amounts of world knowledge they memorize during pretraining. While many downstream applications provide the…
Large Language Models with Controllable Working Memory
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), owing to their excellent understanding and generation abilities. Remarkably, what further sets these models apart is the massive amoun…
When does mixup promote local linearity in learned representations?
Mixup is a regularization technique that artificially produces new samples using convex combinations of original training points. This simple technique has shown strong empirical performance, and has been heavily used as part of semi-super…
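A minimal sketch of the standard mixup step described above (the convex-combination construction only, not the paper's analysis of local linearity):

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Form new samples as convex combinations of random pairs of training points.

    x: inputs with shape (batch, ...); y: one-hot labels with shape (batch, classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing coefficient from Beta(alpha, alpha)
    perm = rng.permutation(len(x))         # random partner for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix
```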
Teacher Guided Training: An Efficient Framework for Knowledge Transfer
The remarkable performance gains realized by large pretrained models, e.g., GPT-3, hinge on the massive amounts of data they are exposed to during training. Analogously, distilling such large models to compact models for efficient deployme…
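The abstract is truncated before the method, so the sketch below shows only generic knowledge distillation with temperature-scaled soft targets as background for the knowledge-transfer setting; the loss weighting and helper names are illustrative assumptions, not the paper's framework:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a soft-target term that matches
    the teacher's temperature-scaled predictions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -(p_teacher * np.log(p_student + 1e-12)).sum(axis=-1).mean() * T * T
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * soft + (1.0 - alpha) * hard
```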
Leveraging redundancy in attention with Reuse Transformers
Pairwise dot product-based attention allows Transformers to exchange information between tokens in an input-dependent way, and is key to their success across diverse applications in language and vision. However, a typical Transformer model…
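A minimal sketch of the reuse idea the title suggests, under the assumption that "reuse" means applying attention weights cached from an earlier layer to a later layer's value projections; the function names and shapes are illustrative, not the paper's exact formulation:

```python
import numpy as np

def attention_weights(q, k):
    """Standard scaled dot-product attention weights (n_tokens x n_tokens)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=-1, keepdims=True)

def reuse_layer(x, w_v, cached_attn):
    """Skip the quadratic score computation by applying attention weights
    cached from an earlier layer (e.g. the output of attention_weights)
    to this layer's value projections."""
    return cached_attn @ (x @ w_v)
```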
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
State-of-the-art transformer models use pairwise dot-product based self-attention, which comes at a computational cost quadratic in the input sequence length. In this paper, we investigate the global structure of attention scores computed …
Understanding Robustness of Transformers for Image Classification
Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classif…
On the Reproducibility of Neural Network Predictions
Standard training techniques for neural networks involve multiple sources of randomness, e.g., initialization, mini-batch ordering and in some cases data augmentation. Given that neural networks are heavily over-parameterized in practice, …
Improving Calibration in Deep Metric Learning With Cross-Example Softmax
Modern image retrieval systems increasingly rely on the use of deep neural networks to learn embedding spaces in which distance encodes the relevance between a given query and image. In this setting, existing approaches tend to emphasize o…
Coping with Label Shift via Distributionally Robust Optimisation
The label shift problem refers to the supervised learning setting where the train and test label distributions do not match. Existing work addressing label shift usually assumes access to an unlabelled test sample. This sample may b…
Long-tail learning via logit adjustment
Real-world classification problems typically exhibit an imbalanced or long-tailed label distribution, wherein many labels are associated with only a few samples. This poses a challenge for generalisation on such labels, and also makes naïv…
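As context for the title, a minimal sketch of post-hoc logit adjustment, assuming adjustment by the log of empirical class priors with a scaling parameter tau; treat this as an illustration rather than the paper's full method:

```python
import numpy as np

def logit_adjusted_predict(logits, class_priors, tau=1.0):
    """Post-hoc adjustment: subtract scaled log class priors from the logits
    before the argmax, which counteracts the bias toward frequent classes."""
    return np.argmax(logits - tau * np.log(class_priors), axis=-1)
```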
Doubly-stochastic mining for heterogeneous retrieval
Modern retrieval problems are characterised by training sets with potentially billions of labels, and heterogeneous data distributions across subpopulations (e.g., users of a retrieval system may be from different countries), each of which…
Why are Adaptive Methods Good for Attention Models?
While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Clipped SGD/Adam have been observed to outperform SGD across important tasks, such as attention models. The settings und…
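For concreteness, a sketch of one step of SGD with global-norm gradient clipping (the "Clipped SGD" the abstract mentions); the learning rate and clipping threshold are arbitrary illustrative values:

```python
import numpy as np

def clipped_sgd_step(params, grads, lr=0.1, clip=1.0):
    """One SGD step with global-norm gradient clipping: rescale all gradients
    so their combined norm never exceeds `clip`, then take a plain SGD step."""
    norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    scale = min(1.0, clip / (norm + 1e-12))
    return [p - lr * scale * g for p, g in zip(params, grads)]
```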
How To Backdoor Federated Learning
Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards wi…
Semantic Segmentation with Scarce Data
Semantic segmentation is a challenging vision problem that usually necessitates the collection of large amounts of finely annotated data, which is often quite expensive to obtain. Coarsely annotated data provides an interesting alternative…
Learning to Evaluate Image Captioning
Evaluation metrics for image captioning face two challenges. Firstly, commonly used metrics such as CIDEr, METEOR, ROUGE and BLEU often do not correlate well with human judgments. Secondly, each metric has well known blind spots to patholo…
Separating Self-Expression and Visual Content in Hashtag Supervision
The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models. For instance, hashtags have the potential to significantly reduce the problem of manual supervision and annotation w…
Convolutional Networks with Adaptive Computation Graphs.
Do convolutional networks really need a fixed feed-forward structure? Often, a neural network is already confident after a few layers about the high-level concept shown in the image. However, due to the fixed network structure, all remaini…
Conditional Similarity Networks
What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings the …
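A minimal sketch of measuring similarity under a specific condition, assuming the masked-embedding formulation suggested by the title (a learned non-negative mask selects the dimensions relevant to one notion of similarity); the names and shapes are illustrative:

```python
import numpy as np

def conditional_distance(x, y, mask):
    """Distance under one notion of similarity: a learned non-negative mask
    selects the embedding dimensions relevant to that condition."""
    diff = (x - y) * mask
    return float(np.sqrt((diff ** 2).sum()))
```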
Deep Learning is Robust to Massive Label Noise
Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest t…
Learning From Noisy Large-Scale Datasets With Minimal Supervision
We present an approach to effectively use millions of images with noisy annotations in conjunction with a small subset of cleanly-annotated images to learn powerful image representations. One common approach to combine clean and noisy data…
Residual Networks Behave Like Ensembles of Relatively Shallow Networks
In this work we propose a novel interpretation of residual networks showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short…
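To make the "collection of paths" reading concrete, unrolling two residual blocks gives

```latex
y_1 = x + f_1(x), \qquad
y_2 = y_1 + f_2(y_1) = x + f_1(x) + f_2\bigl(x + f_1(x)\bigr)
```

so along any input-output path each of the n blocks is either skipped or entered, exposing 2^n implicit paths of varying length.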
Residual Networks are Exponential Ensembles of Relatively Shallow Networks.
In this work, we introduce a novel interpretation of residual networks showing they are exponential ensembles. This observation is supported by a large-scale lesion study that demonstrates they behave just like ensembles at test time. Subs…
Disentangling Nonlinear Perceptual Embeddings With Multi-Query Triplet Networks.
In typical perceptual tasks, higher-order concepts are inferred from visual features to assist with perceptual decision making. However, there is a multitude of visual concepts which can be inferred from a single stimulus. When learning no…
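As background for the triplet networks in the title, a sketch of the standard triplet margin loss (the paper's multi-query extension itself is not shown):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: the positive should sit closer to the
    anchor than the negative by at least `margin` in embedding space."""
    d_pos = ((anchor - positive) ** 2).sum(axis=-1)
    d_neg = ((anchor - negative) ** 2).sum(axis=-1)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())
```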
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
This paper describes the COCO-Text dataset. In recent years large-scale datasets like SUN and Imagenet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance state-of-the-art in text det…