Michael Carbin
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
We introduce FreshStack, a holistic framework for automatically building information retrieval (IR) evaluation benchmarks by incorporating challenging questions and answers. FreshStack conducts the following steps: (1) automatic corpus col…
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
Decoding with autoregressive large language models (LLMs) traditionally occurs sequentially, generating one token after another. An emerging line of work explored parallel decoding by identifying and simultaneously generating semantically …
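As a toy illustration of the contrast the abstract draws (the `next_token` function below is a hypothetical deterministic stand-in for an LLM forward pass, not the paper's method), sequential decoding conditions every token on all previous ones, while semantically independent branches can be decoded without waiting on each other:

```python
# Toy contrast between sequential and parallel decoding. `next_token`
# is a hypothetical stand-in for an LLM forward pass: it "predicts"
# the length of the prefix it is given.
def next_token(prefix):
    return len(prefix)

def decode_sequential(prompt, n):
    """Autoregressive decoding: each new token conditions on all prior ones."""
    out = list(prompt)
    for _ in range(n):
        out.append(next_token(out))
    return out[len(prompt):]

def decode_two_branches(prompt, branch_a, branch_b):
    """Parallel decoding sketch: two semantically independent continuations
    are decoded without conditioning on each other, then concatenated."""
    return (decode_sequential(prompt + branch_a, 2)
            + decode_sequential(prompt + branch_b, 2))

print(decode_sequential([1, 2, 3], 3))  # → [3, 4, 5]
```

In the sequential case each call to `next_token` must wait for the previous token; the two branches in `decode_two_branches` have no such dependence and could run concurrently.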
Inference Plans for Hybrid Particle Filtering
Advanced probabilistic programming languages (PPLs) using hybrid particle filtering combine symbolic exact inference and Monte Carlo methods to improve inference performance. These systems use heuristics to partition random variables withi…
Drowning in Documents: Consequences of Scaling Reranker Inference
Rerankers, typically cross-encoders, are computationally intensive but are frequently used because they are widely assumed to outperform cheaper initial IR systems. We challenge this assumption by measuring reranker performance for full re…
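The two-stage pipeline the abstract refers to can be sketched with toy scoring functions (real systems use BM25 or embeddings for the first stage and a learned cross-encoder for reranking; everything below is illustrative only):

```python
# Two-stage retrieval with toy scorers: a cheap first stage selects
# candidates, an expensive "reranker" reorders them.
def first_stage(query, corpus, k):
    """Cheap scorer: rank by raw word overlap with the query."""
    q = set(query.split())
    return sorted(corpus, key=lambda d: len(q & set(d.split())),
                  reverse=True)[:k]

def rerank(query, candidates):
    """Toy 'cross-encoder': scores query and document jointly,
    here by Jaccard similarity rather than a learned model."""
    def score(d):
        q, w = set(query.split()), set(d.split())
        return len(q & w) / len(q | w)
    return sorted(candidates, key=score, reverse=True)

corpus = ["deep learning for retrieval",
          "retrieval of deep sea fish",
          "cooking with fish"]
top = rerank("deep learning retrieval",
             first_stage("deep learning retrieval", corpus, 2))
print(top[0])  # → deep learning for retrieval
```

The paper's question is what happens when the reranking stage is scaled to ever-larger candidate sets from the first stage.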
Long Context RAG Performance of Large Language Models
Retrieval Augmented Generation (RAG) has emerged as a crucial technique for enhancing the accuracy of Large Language Models (LLMs) by incorporating external information. With the advent of LLMs that support increasingly longer context leng…
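A minimal sketch of the RAG pattern, assuming a toy word-overlap retriever in place of a real embedding model, with the prompt assembly left explicit and the generator itself omitted:

```python
# Minimal RAG sketch: retrieve supporting documents, then prepend
# them to the prompt before generation.
def retrieve(query, corpus, k=2):
    """Rank documents by word overlap with the query; return the top k."""
    q = set(query.lower().split())
    return sorted(corpus,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, corpus, k=2):
    """Longer context windows allow larger k, the regime the paper studies."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = ["Paris is the capital of France.",
        "The Nile is a river in Africa.",
        "France borders Spain."]
print(build_prompt("what is the capital of france", docs, k=1))
```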
Learning to Compile Programs to Neural Networks
A neural surrogate of a program is a neural network that mimics the behavior of a program. Researchers have used these neural surrogates to automatically tune program inputs, adapt programs to new settings, and accelerate comput…
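As a toy illustration of the idea (a one-weight linear "surrogate" of a trivial program; real surrogates are neural networks trained on measurements of large programs), one can fit a model to a program's input/output behavior:

```python
# Toy surrogate: fit a linear model w*x + b to measurements of a
# (here trivial) program via gradient descent. Real neural surrogates
# replace the linear model with a network and the toy program with,
# e.g., a large-scale simulator.
def program(x):
    return 2.0 * x + 1.0   # hypothetical stand-in for an expensive program

data = [(x, program(x)) for x in range(-5, 6)]  # input/output measurements
w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    for x, y in data:
        err = (w * x + b) - y   # surrogate's error on one measurement
        w -= lr * err * x       # gradient step for the weight
        b -= lr * err           # gradient step for the bias
print(round(w, 2), round(b, 2))  # converges near the program's 2 and 1
```

Once trained, the surrogate can answer queries about the program cheaply, and because it is differentiable it can also be used to tune program inputs by gradient descent.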
The T-Complexity Costs of Error Correction for Control Flow in Quantum Computation
Numerous quantum algorithms require the use of quantum error correction to overcome the intrinsic unreliability of physical qubits. However, quantum error correction imposes a unique performance bottleneck, known as T-complexity, that can make an…
Distributions for Compositionally Differentiating Parametric Discontinuities
Computations in physical simulation, computer graphics, and probabilistic inference often require the differentiation of discontinuous processes due to contact, occlusion, and changes at a point in time. Popular differentiable programming …
Quantum Control Machine: The Limits of Control Flow in Quantum Programming
Quantum algorithms for tasks such as factorization, search, and simulation rely on control flow such as branching and iteration that depends on the value of data in superposition. High-level programming abstractions for control flow, such …
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks. However, these models have hundreds of billions of parameters, are computationally expensive to run, require users to s…
Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs
Programmers and researchers are increasingly developing surrogates of programs, models of a subset of the observable behavior of a given program, to solve a variety of software development challenges. Programmers train surrogates from meas…
The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning
How does scaling the number of parameters in large language models (LLMs) affect their core capabilities? We study two natural scaling techniques -- weight pruning and simply training a smaller or larger model, which we refer to as dense s…
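The first of the two techniques, weight pruning, can be sketched as global magnitude pruning (a standard formulation, not necessarily the exact variant the paper studies):

```python
# Global magnitude pruning sketch: zero the fraction `sparsity` of
# weights with smallest absolute value (ties may drop slightly more).
def prune(weights, sparsity):
    n_drop = int(len(weights) * sparsity)
    if n_drop == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[n_drop - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7]
print(prune(w, 0.4))  # → [0.9, 0.0, 0.4, 0.0, -0.7]
```

Dense scaling, by contrast, simply trains a model with fewer (or more) parameters from the start; the paper compares how the two affect fact recall versus in-context learning.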
Verifying Performance Properties of Probabilistic Inference
In this extended abstract, we discuss the opportunity to formally verify that inference systems for probabilistic programming guarantee good performance. In particular, we focus on hybrid inference systems that combine exact and approximat…
Computably Continuous Reinforcement-Learning Objectives Are PAC-Learnable
In reinforcement learning, the classic objectives of maximizing discounted and finite-horizon cumulative rewards are PAC-learnable: There are algorithms that learn a near-optimal policy with high probability using a finite amount of sample…
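The two classic objectives the abstract names have standard definitions (written here from textbook convention, not taken from the paper), for a policy \pi, per-step rewards r_t, discount factor \gamma, and horizon H:

```latex
J_{\gamma}(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right], \quad 0 \le \gamma < 1,
\qquad
J_{H}(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{H-1} r_{t}\right].
```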
Acela: Predictable Datacenter-level Maintenance Job Scheduling
Datacenter operators ensure fair and regular server maintenance by using automated processes to schedule maintenance jobs to complete within a strict time budget. Automating this scheduling problem is challenging because maintenance job du…
Tower: data structures in quantum superposition
Emerging quantum algorithms for problems such as element distinctness, subset sum, and closest pair demonstrate computational advantages by relying on abstract data structures. Practically realizing such an algorithm as a program for a qua…
Semi-symbolic inference for efficient streaming probabilistic programming
A streaming probabilistic program receives a stream of observations and produces a stream of distributions that are conditioned on these observations. Efficient inference is often possible in a streaming context using Rao-Blackwellized particle filters (RBPFs), which exactly solve inference problems when possible and fall back on sampling approximations when necessary. While RBPFs can be…
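The core RBPF idea, solving exactly in closed form when possible and falling back to sampling otherwise, can be sketched with a single Gaussian state (a toy, not the paper's semi-symbolic system):

```python
import random

# Toy of the RBPF / semi-symbolic idea: keep a distribution in closed
# form while updates stay conjugate (a Gaussian mean under Gaussian
# observation noise), and collapse it to a sampled particle only when
# an operation forces an approximation.
class Gaussian:
    def __init__(self, mean, var):
        self.mean, self.var = mean, var

    def observe(self, y, noise_var):
        """Exact conjugate update for an observation y ~ Normal(x, noise_var)."""
        k = self.var / (self.var + noise_var)   # Kalman gain
        return Gaussian(self.mean + k * (y - self.mean), (1 - k) * self.var)

    def sample(self, rng):
        """Fallback: approximate the symbolic form with a concrete particle."""
        return rng.gauss(self.mean, self.var ** 0.5)

prior = Gaussian(0.0, 1.0)
post = prior.observe(2.0, 1.0)   # stays exact: no sampling needed
print(post.mean, post.var)       # → 1.0 0.5
particle = post.sample(random.Random(0))  # used only when exactness fails
```

Keeping the state symbolic for as long as possible reduces sampling variance, which is why inference systems prefer the exact path whenever it applies.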
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Practitioners frequently observe that pruning improves model generalization. A long-standing hypothesis based on bias-variance trade-off attributes this generalization improvement to model size reduction. However, recent studies on over-pa…
On the (In)Tractability of Reinforcement Learning for LTL Objectives
In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives. Despite these advancements, there are fundamental limi…
SCOPE: Safe Exploration for Dynamic Computer Systems Optimization
Modern computer systems need to execute under strict safety constraints (e.g., a power limit), but doing so often conflicts with their ability to deliver high performance (i.e. minimal latency). Prior work uses machine learning to automati…
Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression
Sample-efficient machine learning (SEML) has been widely applied to find optimal latency and power tradeoffs for configurable computer systems. Instead of randomly sampling from the configuration space, SEML reduces the search cost by dram…
Twist: sound reasoning for purity and entanglement in quantum programs
Quantum programming languages enable developers to implement algorithms for quantum computers that promise computational breakthroughs in classically intractable tasks. Programming quantum computers requires awareness of entanglement, the…
Checking Bounded-Memory Execution for Delayed Sampling on Probabilistic Streams
Programming with neural surrogates of programs
Surrogates, models that mimic the behavior of programs, form the basis of a variety of development workflows. We study three surrogate-based design patterns, evaluating each in case studies on a large-scale CPU simulator. With surrogat…
Generalizable and interpretable learning for configuration extrapolation
Modern software applications are increasingly configurable, which puts a burden on users to tune these configurations for their target hardware and workloads. To help users, machine learning techniques can model the complex relationships b…