Ruslan Salakhutdinov

Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models

Large vision-language models (VLMs) have advanced multimodal tasks such as video question answering (QA). However, VLMs face the challenge of selecting frames effectively and efficiently, as standard uniform sampling is expensive and…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-05 · Martin Q. Ma, Willis Guo, Aditya Agrawal, Ankit Gupta, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Act2See: Emergent Active Visual Perception for Video Reasoning

Vision-Language Models (VLMs) typically rely on static initial frames for video reasoning, restricting their ability to incorporate essential dynamic information as the reasoning process evolves. Existing methods that augment…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-05 · Martin Q. Ma, Yuxiao Qu, Aditya Agrawal, Willis Guo, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks

Existing web agent benchmarks have largely converged on short, single-site tasks that frontier models are approaching saturation on. However, real-world web use consists of long-horizon, multi-site workflows. Common web navigation tasks,…

Machine Learning · Computer Science · 2026-04-29 · Lawrence Keunho Jang, Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov

Scaling Test-Time Compute for Agentic Coding

Test-time scaling has become a powerful way to improve large language models. However, existing methods are best suited to short, bounded outputs that can be directly compared, ranked or refined. Long-horizon coding agents violate this…

Software Engineering · Computer Science · 2026-04-22 · Joongwon Kim, Wannan Yang, Kelvin Niu, Hongming Zhang, Yun Zhu, Eryk Helenowski, Ruan Silva, Zhengxing Chen, Srinivasan Iyer, Manzil Zaheer, Daniel Fried, Hannaneh Hajishirzi, Sanjeev Arora, Gabriel Synnaeve, Ruslan Salakhutdinov, Anirudh Goyal

Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models

Log-likelihood evaluation enables important capabilities in generative models, including model comparison, certain fine-tuning objectives, and many downstream applications. Yet paradoxically, some of today's best generative models --…

Machine Learning · Computer Science · 2026-04-21 · Xinyue Ai, Yutong He, Albert Gu, Ruslan Salakhutdinov, J. Zico Kolter, Nicholas Matthew Boffi, Max Simchowitz

IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL

While scaling laws guide compute allocation for LLM pre-training, analogous prescriptions for reinforcement learning (RL) post-training of large language models (LLMs) remain poorly understood. We study the compute-optimal allocation of…

Machine Learning · Computer Science · 2026-03-13 · Zhoujun Cheng, Yutao Xie, Yuxiao Qu, Amrith Setlur, Shibo Hao, Varad Pimpalkhute, Tongtong Liang, Feng Yao, Zhengzhong Liu, Eric Xing, Virginia Smith, Ruslan Salakhutdinov, Zhiting Hu, Taylor Killian, Aviral Kumar

Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

Learning from self-sampled data and sparse environmental feedback remains a fundamental challenge in training self-evolving agents. Temporal credit assignment mitigates this issue by transforming sparse feedback into dense supervision…

Machine Learning · Computer Science · 2026-02-20 · Wen-Tse Chen, Jiayu Chen, Fahim Tajwar, Hao Zhu, Xintong Duan, Ruslan Salakhutdinov, Jeff Schneider

Tuning-free Visual Effect Transfer across Videos

We present RefVFX, a new framework that transfers complex temporal effects from a reference video onto a target video or image in a feed-forward manner. While existing methods excel at prompt-based or keyframe-conditioned editing, they…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-20 · Maxwell Jones, Rameen Abdal, Or Patashnik, Ruslan Salakhutdinov, Sergey Tulyakov, Jun-Yan Zhu, Kuan-Chieh Jackson Wang

Tree Search for Language Model Agents

Autonomous agents powered by language models (LMs) have demonstrated promise in their ability to perform decision-making tasks such as web automation. However, a key limitation remains: LMs, primarily optimized for natural language…

Artificial Intelligence · Computer Science · 2026-02-10 · Jing Yu Koh, Stephen McAleer, Daniel Fried, Ruslan Salakhutdinov

Accelerating Diffusion Planners in Offline RL via Reward-Aware Consistency Trajectory Distillation

Although diffusion models have achieved strong results in decision-making tasks, their slow inference speed remains a key limitation. While consistency models offer a potential solution, existing applications to decision-making either…

Machine Learning · Computer Science · 2026-02-09 · Xintong Duan, Yutong He, Fahim Tajwar, Ruslan Salakhutdinov, J. Zico Kolter, Jeff Schneider

Maximum Likelihood Reinforcement Learning

Reinforcement learning is the method of choice to train models in sampling-based setups with binary outcome feedback, such as navigation, code generation, and mathematical problem solving. In such settings, models implicitly induce a…

Machine Learning · Computer Science · 2026-02-04 · Fahim Tajwar, Guanning Zeng, Yueer Zhou, Yuda Song, Daman Arora, Yiding Jiang, Jeff Schneider, Ruslan Salakhutdinov, Haiwen Feng, Andrea Zanette

POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration

Reinforcement learning (RL) has improved the reasoning abilities of large language models (LLMs), yet state-of-the-art methods still fail to learn on many training problems. On hard problems, on-policy RL rarely explores even a single…

Machine Learning · Computer Science · 2026-01-27 · Yuxiao Qu, Amrith Setlur, Virginia Smith, Ruslan Salakhutdinov, Aviral Kumar

State Combinatorial Generalization In Decision Making With Conditional Diffusion Models

Many real-world decision-making problems are combinatorial in nature, where states (e.g., surrounding traffic of a self-driving car) can be seen as a combination of basic elements (e.g., pedestrians, trees, and other cars). Due to…

Machine Learning · Computer Science · 2025-12-16 · Xintong Duan, Yutong He, Fahim Tajwar, Wen-Tse Chen, Ruslan Salakhutdinov, Jeff Schneider

Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

This paper considers blind inverse image restoration, the task of predicting a target image from a degraded source when the degradation (i.e. the forward operator) is unknown. Existing solutions typically rely on restrictive assumptions…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-02 · Michail Dontas, Yutong He, Naoki Murata, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov

Training a Generally Curious Agent

Efficient exploration is essential for intelligent systems interacting with their environment, but existing language models often fall short in scenarios that require strategic information gathering. In this paper, we present Paprika, a…

Machine Learning · Computer Science · 2025-11-03 · Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Sadia Rahman, J. Zico Kolter, Jeff Schneider, Ruslan Salakhutdinov

Can Large Reasoning Models Self-Train?

Recent successes of reinforcement learning (RL) in training large reasoning models motivate the question of whether self-training - the process where a model learns from its own judgments - can be sustained within RL. In this work, we study…

Machine Learning · Computer Science · 2025-10-10 · Sheikh Shafayat, Fahim Tajwar, Ruslan Salakhutdinov, Jeff Schneider, Andrea Zanette

Contrastive Difference Predictive Coding

Predicting and reasoning about the future lie at the heart of many time-series questions. For example, goal-conditioned reinforcement learning can be viewed as learning representations to predict which states are likely to be visited in the…

Machine Learning · Computer Science · 2025-10-10 · Chongyi Zheng, Ruslan Salakhutdinov, Benjamin Eysenbach

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Reasoning requires going beyond pattern matching or memorization of solutions to identify and implement "algorithmic procedures" that can be used to deduce answers to hard problems. Doing so requires realizing the most relevant primitives,…

Artificial Intelligence · Computer Science · 2025-10-03 · Yuxiao Qu, Anikait Singh, Yoonho Lee, Amrith Setlur, Ruslan Salakhutdinov, Chelsea Finn, Aviral Kumar

AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents

Autonomous AI agents that can follow instructions and perform complex multi-step tasks have tremendous potential to boost human productivity. However, to perform many of these tasks, the agents need access to personal information from their…

Artificial Intelligence · Computer Science · 2025-10-03 · Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Maya Pavlova, Ruslan Salakhutdinov, Kamalika Chaudhuri

Rethinking Thinking Tokens: LLMs as Improvement Operators

Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which, among other things, allows them to explore solution strategies with self-checking. This results in higher accuracy, but inflates context length,…

Machine Learning · Computer Science · 2025-10-02 · Lovish Madaan, Aniket Didolkar, Suchin Gururangan, John Quan, Ruan Silva, Ruslan Salakhutdinov, Manzil Zaheer, Sanjeev Arora, Anirudh Goyal