Keyon Vafa — Scifaro

LABOR-LLM: Language-Based Occupational Representations with Large Language Models

This paper builds an empirical model that predicts a worker's next occupation as a function of the worker's occupational history. Because histories are sequences of occupations, the covariate space is high-dimensional, and further, the…

Machine Learning · Computer Science · 2026-01-06 · Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa

What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models

Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much like how Kepler's predictions of planetary motion later led to the discovery of Newtonian mechanics. However, evaluating…

Machine Learning · Computer Science · 2025-12-30 · Keyon Vafa, Peter G. Chang, Ashesh Rambachan, Sendhil Mullainathan

What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models

How should we evaluate the quality of generative models? Many existing metrics focus on a model's producibility, i.e. the quality and breadth of outputs it can generate. However, the actual value from using a generative model stems not just…

Machine Learning · Computer Science · 2025-11-13 · Keyon Vafa, Sarah Bentley, Jon Kleinberg, Sendhil Mullainathan

Potemkin Understanding in Large Language Models

Large language models (LLMs) are regularly evaluated using benchmark datasets. But what justifies making inferences about an LLM's capabilities based on its answers to a curated set of questions? This paper first introduces a formal…

Computation and Language · Computer Science · 2025-07-01 · Marina Mancoridis, Bec Weeks, Keyon Vafa, Sendhil Mullainathan

Estimating Wage Disparities Using Foundation Models

The rise of foundation models marks a paradigm shift in machine learning: instead of training specialized models from scratch, foundation models are first trained on massive datasets before being adapted or fine-tuned to make predictions on…

Machine Learning · Computer Science · 2025-05-01 · Keyon Vafa, Susan Athey, David M. Blei

Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length?

Large language models (LLMs) often benefit from verbalized reasoning at inference time, but it remains unclear which aspects of task difficulty these extra reasoning tokens address. To investigate this question, we formalize a framework…

Artificial Intelligence · Computer Science · 2025-04-03 · Celine Lee, Alexander M. Rush, Keyon Vafa

Using large language models to promote health equity

Advances in large language models (LLMs) have driven an explosion of interest about their societal impacts. Much of the discourse around how they will impact social equity has been cautionary or negative, focusing on questions like "how…

Computers and Society · Computer Science · 2025-01-08 · Emma Pierson, Divya Shanmugam, Rajiv Movva, Jon Kleinberg, Monica Agrawal, Mark Dredze, Kadija Ferryman, Judy Wawira Gichoya, Dan Jurafsky, Pang Wei Koh, Karen Levy, Sendhil Mullainathan, Ziad Obermeyer, Harini Suresh, Keyon Vafa

Evaluating the World Model Implicit in a Generative Model

Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is governed by a deterministic finite automaton.…

Computation and Language · Computer Science · 2024-11-12 · Keyon Vafa, Justin Y. Chen, Ashesh Rambachan, Jon Kleinberg, Sendhil Mullainathan

Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function

What makes large language models (LLMs) impressive is also what makes them hard to evaluate: their diversity of uses. To evaluate these models, we must understand the purposes they will be used for. We consider a setting where these…

Computation and Language · Computer Science · 2024-06-04 · Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan

CAREER: A Foundation Model for Labor Sequence Data

Labor economists regularly analyze employment data by fitting predictive models to small, carefully constructed longitudinal survey datasets. Although machine learning methods offer promise for such problems, these survey datasets are too…

Machine Learning · Computer Science · 2024-03-01 · Keyon Vafa, Emil Palikot, Tianyu Du, Ayush Kanodia, Susan Athey, David M. Blei

Revisiting Topic-Guided Language Models

A recent line of work in natural language processing has aimed to combine language models and topic models. These topic-guided language models augment neural language models with topic models, unsupervised learning methods that can discover…

Computation and Language · Computer Science · 2023-12-06 · Carolina Zheng, Keyon Vafa, David M. Blei

An Invariant Learning Characterization of Controlled Text Generation

Controlled generation refers to the problem of creating text that contains stylistic or semantic attributes of interest. Many approaches reduce this problem to training a predictor of the desired attribute. For example, researchers hoping…

Computation and Language · Computer Science · 2023-06-02 · Carolina Zheng, Claudia Shi, Keyon Vafa, Amir Feder, David M. Blei

Rationales for Sequential Predictions

Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations through rationales, subsets of context that can explain individual model predictions. We find…

Computation and Language · Computer Science · 2021-11-19 · Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush

Text-Based Ideal Points

Ideal point models analyze lawmakers' votes to quantify their political positions, or ideal points. But votes are not the only way to express a political position. Lawmakers also give speeches, release press statements, and post tweets. In…

Computation and Language · Computer Science · 2020-07-23 · Keyon Vafa, Suresh Naidu, David M. Blei

Discrete Flows: Invertible Generative Models of Discrete Data

While normalizing flows have led to significant advances in modeling high-dimensional continuous distributions, their applicability to discrete distributions remains unknown. In this paper, we show that flows can in fact be extended to…

Machine Learning · Computer Science · 2019-05-27 · Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole