Benjamin Spector
Trivalence and Transparency: a non-dynamic approach to anaphora
ThunderKittens: Simple, Fast, and Adorable AI Kernels
The challenge of mapping AI architectures to GPU hardware is creating a critical bottleneck in AI progress. Despite substantial efforts, hand-written custom kernels fail to meet their theoretical performance thresholds, even on well-establ…
LoLCATs: On Low-Rank Linearizing of Large Language Models
Recent works show we can linearize large language models (LLMs) -- swapping the quadratic attentions of popular Transformer-based LLMs with subquadratic analogs, such as linear attention -- avoiding the expensive pretraining costs. However…
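The core substitution described in the snippet can be illustrated with a minimal sketch. This is not the paper's method (LoLCATs additionally trains feature maps to mimic softmax attention and adapts the model with low-rank updates); here phi is just a placeholder positive feature map used to show why the swap removes the quadratic cost.

import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the n x n score matrix makes this O(n^2) in length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Replace exp(q . k) with phi(q) . phi(k): the d x d key-value summary is
    # built once, so the cost is O(n * d^2) rather than O(n^2 * d).
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                 # (d, d) summary of keys and values
    z = Kf.sum(axis=0)            # (d,) normalizer
    return (Qf @ kv) / (Qf @ z)[:, None]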
Just read twice: closing the recall gap for recurrent language models
Recurrent large language models that compete with Transformers in language modeling perplexity are emerging at a rapid rate (e.g., Mamba, RWKV). Excitingly, these architectures use a constant amount of memory during inference. However, due…
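Read literally, the title points to a prompting-level fix for fixed-memory models: show the context a second time so the recurrent state gets another chance to store what turns out to matter. A toy sketch of that idea follows; the function name and prompt layout are illustrative assumptions, not the paper's exact recipe.

def read_twice_prompt(context: str, question: str) -> str:
    # Repeat the context before the question so a constant-memory recurrent
    # model revisits the relevant facts instead of having to guess, on a
    # single pass, which of them to keep in its state.
    return f"{context}\n\n{context}\n\n{question}"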
Explaining vague language
Why is language vague? Vagueness may be explained and rationalized if it can be shown that vague language is more useful to speaker and hearer than precise language. In a well-known paper, Lipman proposes a game-theoretic account of vaguen…
Experimentally assessing the symmetry of presupposition filtering across disjunction
Existential and universal readings of pronouns across binary connectives: an experimental investigation
It’s not about 'about' – comparatives, negation and intervals
Solt (2014, 2018) discovered an intriguing pattern regarding the distribution of the approximator 'about'. While 'about n' is typically infelicitous under negation, this pattern is reversed with 'more than about n', which is fine under neg…
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts and better performance. However, existing architectures such as Transformers scale quadratically along both these ax…
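The structured, GEMM-friendly building block the architecture's name refers to replaces a dense n x n matrix with block-diagonal factors interleaved with a permutation, dropping the cost of a matrix-vector product below quadratic. Below is a simplified NumPy sketch of that pattern, assuming n = b * b and using a reshape-transpose as the permutation; it illustrates the flavour of the factorization, not the exact operator used in the paper.

import numpy as np

def monarch_like_matvec(x, L, R):
    # x: (b*b,); L, R: (b, b, b) stacks of b blocks, each b x b.
    # Two block-diagonal multiplies plus transposes: O(n^1.5) work instead of O(n^2).
    b = L.shape[0]
    y = x.reshape(b, b)
    y = np.einsum('bij,bj->bi', R, y)   # block-diagonal factor R
    y = y.T                             # permutation (reshape-transpose)
    y = np.einsum('bij,bj->bi', L, y)   # block-diagonal factor L
    y = y.T                             # undo the permutation
    return y.reshape(-1)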
Accelerating LLM Inference with Staged Speculative Decoding
Recent advances with large language models (LLM) illustrate their diverse capabilities. We propose a novel algorithm, staged speculative decoding, to accelerate LLM inference in small-batch, on-device scenarios. We address the low arithmet…
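The snippet is cut off before the algorithm, but speculative decoding in its basic, unstaged form is standard: a cheap draft model proposes several tokens, and the large target model checks them, keeping the longest agreeing prefix. A greedy-verification sketch of that standard scheme follows; the staged variant's specifics, and the draft/target callables, are assumptions here.

def speculative_decode(target, draft, prompt, k=4, max_new=64):
    # Greedy speculative decoding sketch.
    # draft(tokens) and target(tokens) each return the next-token id their
    # model predicts after `tokens` (assumed interfaces).
    tokens = list(prompt)
    while len(tokens) < len(prompt) + max_new:
        # 1) Draft model cheaply proposes k tokens.
        proposal = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2) Target model verifies; in a real system this is one batched
        #    forward pass over all k positions rather than a Python loop.
        accepted = []
        for i in range(k):
            t = target(tokens + accepted)
            accepted.append(t)
            if proposal[i] != t:
                break               # first mismatch: keep target's token, stop
        tokens.extend(accepted)
    return tokens[:len(prompt) + max_new]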
Exhaustivity and anti-exhaustivity in the RSA framework: Testing the effect of prior beliefs
During communication, the interpretation of utterances is sensitive to a listener's probabilistic prior beliefs, something which is captured by one currently influential model of pragmatics, the Rational Speech Act (RSA) framework. In this…
Bounding the Last Mile: Efficient Learned String Indexing
We introduce the RadixStringSpline (RSS) learned index structure for efficiently indexing strings. RSS is a tree of radix splines each indexing a fixed number of bytes. RSS approaches or exceeds the performance of traditional string indexe…
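A heavily simplified sketch of the general idea stated in the snippet: route a query by a fixed-length byte prefix (the radix step), then within each prefix group use a tiny linear estimate over the next byte to guess the key's position and only search a bounded neighbourhood around that guess. This illustrates the radix-plus-spline flavour only; it is not the RSS data structure itself, and the class name, one-byte prefix, and window handling are assumptions. Keys are assumed to be sorted bytes objects.

import bisect

class ToyPrefixIndex:
    # Toy learned string index: group sorted byte-string keys by a 1-byte
    # prefix and, inside each group, interpolate a position from the second
    # byte (a degenerate one-segment "spline"), then binary-search a small
    # window around the estimate.

    def __init__(self, sorted_keys, window=8):
        self.keys = sorted_keys
        self.window = window
        self.groups = {}                      # prefix byte -> (start, end)
        for i, k in enumerate(sorted_keys):
            p = k[:1]
            s, _ = self.groups.get(p, (i, i))
            self.groups[p] = (s, i + 1)

    def lookup(self, key):
        grp = self.groups.get(key[:1])
        if grp is None:
            return None
        start, end = grp
        # Linear position estimate from the second byte of the key.
        frac = (key[1] if len(key) > 1 else 0) / 256
        guess = start + int(frac * (end - start))
        lo = max(start, guess - self.window)
        hi = min(end, guess + self.window)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < hi and self.keys[i] == key:
            return i
        # Estimate missed the bounded window: fall back to the whole group.
        i = bisect.bisect_left(self.keys, key, start, end)
        return i if i < end and self.keys[i] == key else None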
Explaining gaps in the logical lexicon of natural languages: A decision-theoretic perspective on the square of Aristotle
Modified Numerals
Modified numerals are expressions such as more than three, fewer than three, at least three, at most three, up to ten, between three and ten, approximately ten, about ten, exactly ten, and so forth. At first sight, their semantic …
Interpreting plural predication: homogeneity and non-maximality
On the Optimality of Vagueness: "Around", "Between", and the Gricean Maxims
Why is ordinary language vague? We argue that in contexts in which a cooperative speaker is not perfectly informed about the world, the use of vague expressions can offer an optimal tradeoff between truthfulness (Gricean Quality) and infor…
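A toy numerical illustration of the claimed tradeoff; this is not the paper's model, and the candidate utterances, the speaker's uniform belief, and the scoring are assumptions made up for the example. An imperfectly informed speaker who thinks the true value is 19, 20, or 21 risks falsity with a precise report, conveys almost nothing with a very wide interval, and does best with "around 20".

from math import log2

belief = {19: 1/3, 20: 1/3, 21: 1/3}          # speaker's credence over the true value
utterances = {
    "exactly 20":        {20},
    "around 20":         {18, 19, 20, 21, 22},
    "between 10 and 30": set(range(10, 31)),
}

for u, extension in utterances.items():
    # Gricean Quality: probability the utterance is true given the speaker's belief.
    p_true = sum(p for v, p in belief.items() if v in extension)
    # Informativeness: bits gained over an assumed uniform hearer prior on 10..30.
    informativity = log2(21) - log2(len(extension))
    print(f"{u:20s}  P(true) = {p_true:.2f}   informativity = {informativity:.2f} bits")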
An argument for the trivalent approach to presupposition projection
Preventing Adversarial Use of Datasets through Fair Core-Set Construction
We propose improving the privacy properties of a dataset by publishing only a strategically chosen "core-set" of the data containing a subset of the instances. The core-set allows strong performance on primary tasks, but forces poor perfor…
Distinctions between primary and secondary scalar implicatures
The Role of Prior Beliefs in The Rational Speech Act Model of Pragmatics: Exhaustivity as a Case Study
This paper examines the interaction between prior beliefs and pragmatic inferences, focusing on exhaustivity effects. We present three experiments that test how prior beliefs influence both interpretation and production of language, and comp…
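Since the abstract is cut off, here is a minimal sketch of the standard RSA recursion the paper builds on (vanilla RSA, not the paper's experiments or extensions): a literal listener conditions the prior on an utterance's literal truth, a pragmatic speaker soft-maximizes informativity, and a pragmatic listener inverts the speaker. With a flat prior this derives the exhaustive "some but not all" reading of "some"; skewing the prior toward the "all" world weakens it in the model.

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

# Worlds: how many of the relevant objects have the property.
worlds = ["none", "some-not-all", "all"]
# Utterances and their literal truth conditions.
literal = {
    "some": {"none": 0, "some-not-all": 1, "all": 1},
    "all":  {"none": 0, "some-not-all": 0, "all": 1},
    "none": {"none": 1, "some-not-all": 0, "all": 0},
}

def rsa_listener(prior, alpha=4.0):
    # Literal listener: L0(w | u) proportional to prior(w) * [[u]](w)
    L0 = {u: normalize({w: prior[w] * literal[u][w] for w in worlds}) for u in literal}
    # Pragmatic speaker: S1(u | w) proportional to L0(w | u) ** alpha
    S1 = {w: normalize({u: L0[u][w] ** alpha for u in literal}) for w in worlds}
    # Pragmatic listener: L1(w | u) proportional to prior(w) * S1(u | w)
    return {u: normalize({w: prior[w] * S1[w][u] for w in worlds}) for u in literal}

flat   = {"none": 1/3, "some-not-all": 1/3, "all": 1/3}
skewed = {"none": 0.05, "some-not-all": 0.05, "all": 0.9}    # prior favouring "all"

print(rsa_listener(flat)["some"])     # mass shifts to "some-not-all": exhaustive reading
print(rsa_listener(skewed)["some"])   # a strong prior pulls interpretation back toward "all"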
Revealing abstract semantic mechanisms through priming: The distributive/collective contrast
Economy and embedded exhaustification
Sample-Efficient Reinforcement Learning through Transfer and Architectural Priors
Recent work in deep reinforcement learning has allowed algorithms to learn complex tasks such as Atari 2600 games just from the reward provided by the game, but these algorithms presently require millions of training steps in order to lear…
Unexpected Wide‐Scope Phenomena
It has long been known that quantificational expressions in natural language do not all have the same scope properties. While the scope of some expressions is closely related to their observable, “surface” position in syntactic structure, …
The Design and Implementation of Modern Online Programming Competitions
This paper presents a framework for the implementation of online programming competitions, including a set of principles for the design of the multiplayer game and a practical framework for the construction of the competition environment. …
Asymmetric inference towards the antonym: Experiments into the polarity and morphology of negated adjectives
In this paper, we investigate the interpretation of negated antonyms. A sentence such as Peter is not tall can be understood as meaning either that Peter is not tall tout court or that Peter is rather short (inference towards the antonym; …