Keyon Vafa
Potemkin Understanding in Large Language Models
Large language models (LLMs) are regularly evaluated using benchmark datasets. But what justifies making inferences about an LLM's capabilities based on its answers to a curated set of questions? This paper first introduces a formal framew…
Estimating Wage Disparities Using Foundation Models
The rise of foundation models marks a paradigm shift in machine learning: instead of training specialized models from scratch, foundation models are first trained on massive datasets before being adapted or fine-tuned to make predictions o…
LABOR-LLM: Language-Based Occupational Representations with Large Language Models
Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker's next job as a function of career history (an "occupation model"). CAREER was initially estimated ("pre-trained") using a large, unreprese…
Evaluating the World Model Implicit in a Generative Model
Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is governed by a deterministic finite automaton.…
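As background for the kind of ground truth this setup assumes, the sketch below (illustrative Python, not the paper's code or its evaluation metrics) shows a toy deterministic finite automaton; a generative model whose sequence predictions are consistent with such an automaton could be said to have recovered its world model.

class DFA:
    """Minimal deterministic finite automaton."""

    def __init__(self, transitions, start, accepting):
        self.transitions = transitions  # (state, token) -> next state
        self.start = start
        self.accepting = accepting

    def run(self, tokens):
        # Return the final state, or None if a transition is undefined,
        # i.e., the sequence is impossible in this world.
        state = self.start
        for token in tokens:
            state = self.transitions.get((state, token))
            if state is None:
                return None
        return state

    def accepts(self, tokens):
        return self.run(tokens) in self.accepting


# Toy world: a binary string is valid iff it contains an even number of 1s.
parity = DFA(
    transitions={("even", "0"): "even", ("even", "1"): "odd",
                 ("odd", "0"): "odd", ("odd", "1"): "even"},
    start="even",
    accepting={"even"},
)
print(parity.accepts(list("101")))  # True  (two 1s)
print(parity.accepts(list("100")))  # False (one 1)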
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
What makes large language models (LLMs) impressive is also what makes them hard to evaluate: their diversity of uses. To evaluate these models, we must understand the purposes they will be used for. We consider a setting where these deploy…
Revisiting Topic-Guided Language Models
A recent line of work in natural language processing has aimed to combine language models and topic models. These topic-guided language models augment neural language models with topic models, unsupervised learning methods that can discove…
An Invariant Learning Characterization of Controlled Text Generation
Controlled generation refers to the problem of creating text that contains stylistic or semantic attributes of interest. Many approaches reduce this problem to training a predictor of the desired attribute. For example, researchers hoping …
CAREER: A Foundation Model for Labor Sequence Data
Labor economists regularly analyze employment data by fitting predictive models to small, carefully constructed longitudinal survey datasets. Although machine learning methods offer promise for such problems, these survey datasets are too …
Assessing the Effects of Friend-to-Friend Texting on Turnout in the 2018 US Midterm Elections
Recent mobile app technology lets people systematize the process of messaging their friends to urge them to vote. Prior to the most recent US midterm elections in 2018, the mobile app Outvote randomized an aspect of their system, hoping to…
Rationales for Sequential Predictions
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations through rationales, subsets of context that can explain individual model predictions. We find seq…
Text-Based Ideal Points
Ideal point models analyze lawmakers' votes to quantify their political positions, or ideal points. But votes are not the only way to express a political position. Lawmakers also give speeches, release press statements, and post tweets. In…
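For context, one standard formulation of the vote-based ideal point model that this line of work starts from (the notation below is illustrative, not necessarily the paper's): lawmaker i has ideal point x_i, and bill j has a popularity term \alpha_j and a polarity term \eta_j.

\[
    v_{ij} \sim \mathrm{Bernoulli}\big(\sigma(\alpha_j + x_i \eta_j)\big),
    \qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
\]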
Discrete Flows: Invertible Generative Models of Discrete Data
While normalizing flows have led to significant advances in modeling high-dimensional continuous distributions, their applicability to discrete distributions remains unknown. In this paper, we show that flows can in fact be extended to dis…
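For intuition, the change-of-variables identity behind flows on discrete spaces (stated generically here, not as this paper's exact development): a bijection f simply relabels outcomes, so probability mass transfers with no Jacobian term, unlike the continuous case.

\[
    p_Y(y) = p_Z\big(f^{-1}(y)\big) \quad \text{(discrete)},
    \qquad
    p_Y(y) = p_Z\big(f^{-1}(y)\big)\left|\det \frac{\partial f^{-1}(y)}{\partial y}\right| \quad \text{(continuous)}.
\]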
Training and Inference for Deep Gaussian Processes
An ideal model for regression is not only accurate, but also computationally efficient, easy to tune without overfitting, and able to provide certainty estimates. In this thesis, we explore deep Gaussian processes (deep GPs), a class of mo…
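As a point of reference, a deep GP is commonly defined as a composition of layers, each drawn from a Gaussian process (a generic definition, not specific to this thesis's constructions):

\[
    f(x) = f_L\big(f_{L-1}(\cdots f_1(x)\cdots)\big),
    \qquad f_\ell \sim \mathcal{GP}(0, k_\ell), \quad \ell = 1, \dots, L.
\]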
Replication Data for: Price Discrimination in The Princeton Review’s Online SAT Tutoring Service
This dataset accompanies a paper published on September 1, 2015, in Technology Science: http://techscience.org/a/2015090102/