Maor Ivgi
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under epistemic uncertainty, and investigate the connect…
DataComp-LM: In search of the next generation of training sets for language models
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effect…
In-Context Learning with Long-Context Models: An In-Depth Exploration
As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets. We study the behavior of in-context learning (ICL) at this extreme scale on multip…
Accelerated Parameter-Free Stochastic Optimization
We propose a method that achieves near-optimal rates for smooth stochastic convex optimization and requires essentially no prior knowledge of problem parameters. This improves on prior work which requires knowing at least the initial dista…
ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding
We introduce ZeroSCROLLS, a zero-shot benchmark for natural language understanding over long texts, which contains only test and small validation sets, without training data. We adapt six tasks from the SCROLLS benchmark, and add four new …
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
We propose a tuning-free dynamic SGD step size formula, which we call Distance over Gradients (DoG). The DoG step sizes depend on simple empirical quantities (distance from the initial point and norms of gradients) and have no ``learning r…
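The abstract names the two empirical quantities that drive the DoG step size: the distance from the initial point and the norms of the gradients seen so far. A minimal one-dimensional sketch of such a rule follows; the function name, the small `r_eps` initialization that makes the first step nonzero, and the deterministic-gradient loop are illustrative assumptions, not the paper's implementation:

```python
import math

def dog_sgd(grad_fn, x0, steps=100, r_eps=1e-4):
    """Sketch of a Distance-over-Gradients (DoG) style step size:
    eta_t = (max distance from the start so far) / sqrt(sum of squared gradient norms).
    One-dimensional for clarity; r_eps is an assumed small initial distance."""
    x = float(x0)
    x_init = x
    # Small positive initial "distance" so the very first step is nonzero.
    max_dist = r_eps * (1.0 + abs(x_init))
    grad_sq_sum = 0.0
    for _ in range(steps):
        g = grad_fn(x)
        grad_sq_sum += g * g
        max_dist = max(max_dist, abs(x - x_init))
        eta = max_dist / math.sqrt(grad_sq_sum)  # distance over gradients
        x -= eta * g
    return x
```

On a toy quadratic such as f(x) = x^2 (gradient 2x), the iterate drifts slowly at first while the distance estimate grows, then contracts toward the minimum, all without a hand-tuned learning rate.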
Efficient Long-Text Understanding with Short-Text Models
Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles, and long documents due to their quadratic complexity. Wh…
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Neural scaling laws define a predictable relationship between a model's parameter count and its performance after training in the form of a power law. However, most research to date has not explicitly investigated whether scaling laws can …
SCROLLS: Standardized CompaRison Over Long Language Sequences
NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a suite of tasks that require reasoning over…
Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022.
Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics
Interpretability is becoming an active research topic as machine learning (ML) models are more widely used to make critical decisions. Tabular data are one of the most commonly used modes of data in diverse applications such as healthcare …
Scene Graph tO Image Generation with Contextualized Object Layout Refinement
Generating images from scene graphs is a challenging task that attracted substantial interest recently. Prior works have approached this task by generating an intermediate layout description of the target image. However, the representation…
Achieving Model Robustness through Discrete Adversarial Training
Discrete adversarial attacks are symbolic perturbations to a language input that preserve the output label but lead to a prediction error. While such attacks have been extensively explored for the purpose of evaluating model robustness, th…