John Langford — Scifaro

Next-Latent Prediction Transformers Learn Compact World Models

Transformers replace recurrence with a memory that grows with sequence length and self-attention that enables ad-hoc lookups over past tokens. Consequently, they lack an inherent incentive to compress history into compact latent states with…

Machine Learning · Computer Science 2026-05-26 Jayden Teoh, Manan Tomar, Kwangjun Ahn, Edward S. Hu, Tim Pearce, Pratyusha Sharma, Akshay Krishnamurthy, Riashat Islam, Alex Lamb, John Langford

MEMENTO: Teaching LLMs to Manage Their Own Context

Reasoning models think in long, unstructured streams with no mechanism for compressing or organizing their own intermediate state. We introduce MEMENTO: a method that teaches models to segment reasoning into blocks, compress each block into…

Artificial Intelligence · Computer Science 2026-04-14 Vasilis Kontonis, Yuchen Zeng, Shivam Garg, Lingjiao Chen, Hao Tang, Ziyan Wang, Ahmed Awadallah, Eric Horvitz, John Langford, Dimitris Papailiopoulos

Phi-4-reasoning-vision-15B Technical Report

We present Phi-4-reasoning-vision-15B, a compact open-weight multimodal reasoning model, and share the motivations, design choices, experiments, and learnings that informed its development. Our goal is to contribute practical insight to the…

Artificial Intelligence · Computer Science 2026-03-05 Jyoti Aneja, Michael Harrison, Neel Joshi, Tyler LaBonte, John Langford, Eduardo Salinas

When does predictive inverse dynamics outperform behavior cloning?

Behavior cloning (BC) is a practical offline imitation learning method, but it often fails when expert demonstrations are limited. Recent works have introduced a class of architectures named predictive inverse dynamics models (PIDM) that…

Machine Learning · Computer Science 2026-01-30 Lukas Schäfer, Pallavi Choudhury, Abdelhak Lemkhenter, Chris Lovett, Somjit Nath, Luis França, Matheus Ribeiro Furtado de Mendonça, Alex Lamb, Riashat Islam, Siddhartha Sen, John Langford, Katja Hofmann, Sergio Valcarcel Macua

Dion2: A Simple Method to Shrink Matrix in Muon

The Muon optimizer enjoys strong empirical performance and theoretical grounding. However, the super-linear cost of its orthonormalization step introduces increasing overhead with scale. To alleviate this cost, several works have attempted…

Machine Learning · Computer Science 2025-12-22 Kwangjun Ahn, Noah Amsel, John Langford

The Belief State Transformer

We introduce the "Belief State Transformer", a next-token predictor that takes both a prefix and suffix as inputs, with a novel objective of predicting both the next token for the prefix and the previous token for the suffix. The Belief…

Machine Learning · Computer Science 2025-09-17 Edward S. Hu, Kwangjun Ahn, Qinghua Liu, Haoran Xu, Manan Tomar, Ada Langford, Jayden Teoh, Bryon Xu, David Yan, Dinesh Jayaraman, Alex Lamb, John Langford

Dion: Distributed Orthonormalized Updates

Orthonormalized updates accelerate training, improve stability, and enable robust hyperparameter transfer, but existing methods like Muon rely on dense matrix operations that clash with sharded weights in large-scale LLM training, causing…

Machine Learning · Computer Science 2025-09-16 Kwangjun Ahn, Byron Xu, Natalie Abreu, Ying Fan, Gagik Magakyan, Pratyusha Sharma, Zheng Zhan, John Langford

EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

With Large Language Models (LLMs) rapidly approaching and potentially surpassing human-level performance, it has become imperative to develop approaches capable of effectively supervising and enhancing these powerful models using smaller,…

Machine Learning · Computer Science 2025-07-24 Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, Bang An, Bayan Bruss, John Langford, Furong Huang

Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

While generalization over tasks from easy to hard is crucial to profile language models (LLMs), the datasets with fine-grained difficulty annotations for each problem across a broad range of complexity are still blank. Aiming to address…

Machine Learning · Computer Science 2025-06-10 Mucong Ding, Chenghao Deng, Jocelyn Choo, Zichu Wu, Aakriti Agrawal, Avi Schwarzschild, Tianyi Zhou, Tom Goldstein, John Langford, Anima Anandkumar, Furong Huang

EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

With Large Language Models (LLMs) rapidly approaching and potentially surpassing human-level performance, it has become imperative to develop approaches capable of effectively supervising and enhancing these powerful models using smaller,…

Machine Learning · Computer Science 2025-06-06 Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, Bang An, Bayan Bruss, John Langford, Furong Huang

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead

Inference-time scaling can enhance the reasoning capabilities of large language models (LLMs) on complex problems that benefit from step-by-step problem solving. Although lengthening generated scratchpads has proven effective for…

Machine Learning · Computer Science 2025-04-02 Vidhisha Balachandran, Jingya Chen, Lingjiao Chen, Shivam Garg, Neel Joshi, Yash Lara, John Langford, Besmira Nushi, Vibhav Vineet, Yue Wu, Safoora Yousefi

Efficient Joint Prediction of Multiple Future Tokens

In this short report, we introduce joint multi-token prediction (JTP), a lightweight modification of standard next-token prediction designed to enrich hidden state representations by jointly predicting multiple future tokens. Unlike…

Machine Learning · Computer Science 2025-03-31 Kwangjun Ahn, Alex Lamb, John Langford

Video Occupancy Models

We introduce a new family of video prediction models designed to support downstream control tasks. We call these models Video Occupancy models (VOCs). VOCs operate in a compact latent space, thus avoiding the need to make predictions about…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Manan Tomar, Philippe Hansen-Estruch, Philip Bachman, Alex Lamb, John Langford, Matthew E. Taylor, Sergey Levine

PcLast: Discovering Plannable Continuous Latent States

Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision…

Machine Learning · Computer Science 2024-06-12 Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb

Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss

We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a subset of multitask offline datasets for…

Machine Learning · Computer Science 2024-05-27 Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Shuang Ma, Hal Daumé, Huazhe Xu, John Langford, Praveen Palanisamy, Kalyan Shankar Basu, Furong Huang

Towards Principled Representation Learning from Videos for Reinforcement Learning

We study pre-training representations for decision-making using video data, which is abundantly available for tasks such as game agents and software testing. Even though significant empirical advances have been made on this problem, a…

Machine Learning · Computer Science 2024-03-21 Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford

Position Paper: Agent AI Towards a Holistic Intelligence

Recent advancements in large foundation models have remarkably enhanced our understanding of sensory information in open-world environments. In leveraging the power of foundation models, it is crucial for AI research to pivot away from…

Artificial Intelligence · Computer Science 2024-03-05 Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, Jianfeng Gao

EyeO: Autocalibrating Gaze Output with Gaze Input

Gaze tracking devices have the potential to greatly expand interactivity, yet miscalibration remains a significant barrier to use. As devices miscalibrate, people tend to compensate by intentionally offsetting their gaze, which makes…

Human-Computer Interaction · Computer Science 2023-11-01 Akanksha Saran, Jacob Alber, Cyril Zhang, Ann Paradiso, Danielle Bragg, John Langford

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

Learning to control an agent from data collected offline in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input…

Machine Learning · Computer Science 2023-08-15 Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

Streaming Active Learning with Deep Neural Networks

Active learning is perhaps most naturally posed as an online learning problem. However, prior active learning approaches with deep neural networks assume offline access to the entire dataset ahead of time. This paper proposes VeSSAL, a new…

Machine Learning · Computer Science 2023-06-08 Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash