Jiaming Song — Scifaro

Engagement Process: Rethinking the Temporal Interface of Action and Observation

Task completion in digital and physical environments increasingly involves complex temporal interaction, where actions and observations unfold over different time scales rather than align with fixed observation–action steps. To model such…

Artificial Intelligence · Computer Science 2026-05-13 Jialian Li , Yuchen Cao , Junhong Liu , Weiran Guo , Xutao Wang , Jiaming Song , Jiahao Zhang , Jie Chen

Seer: Language Instructed Video Prediction with Latent Diffusion Models

Imagining the future trajectory is the key for robots to make sound planning and successfully reach their goals. Therefore, text-conditioned video prediction (TVP) is an essential task to facilitate general robot policy learning. To tackle…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Xianfan Gu , Chuan Wen , Weirui Ye , Jiaming Song , Yang Gao

Terminal Velocity Matching

We propose Terminal Velocity Matching (TVM), a generalization of flow matching that enables high-fidelity one- and few-step generative modeling. TVM models the transition between any two diffusion timesteps and regularizes its behavior at…

Machine Learning · Computer Science 2026-02-18 Linqi Zhou , Mathias Parger , Ayaan Haque , Jiaming Song

Self-NPO: Data-Free Diffusion Model Enhancement via Truncated Diffusion Fine-Tuning

Diffusion models have demonstrated remarkable success in various visual generation tasks, including image, video, and 3D content generation. Preference optimization (PO) is a prominent and growing area of research that aims to align these…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Fu-Yun Wang , Keqiang Sun , Yao Teng , Xihui Liu , Jiale Yuan , Jiaming Song , Hongsheng Li

SABER: Switchable and Balanced Training for Efficient LLM Reasoning

Large language models (LLMs) empowered by chain-of-thought reasoning have achieved impressive accuracy on complex tasks but suffer from excessive inference costs and latency when applied uniformly to all problems. We propose SABER…

Computation and Language · Computer Science 2025-08-15 Kai Zhao , Yanjun Zhao , Jiaming Song , Shien He , Lusheng Zhang , Qiang Zhang , Tianjiao Li

Inductive Moment Matching

Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment…

Machine Learning · Computer Science 2025-05-16 Linqi Zhou , Stefano Ermon , Jiaming Song

Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms

Recent years have seen significant advancements in foundation models through generative pre-training, yet algorithmic innovation in this space has largely stagnated around autoregressive models for discrete signals and diffusion models for…

Machine Learning · Computer Science 2025-03-12 Jiaming Song , Linqi Zhou

Score-based Diffusion Models in Function Space

Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate…

Machine Learning · Computer Science 2025-01-23 Jae Hyun Lim , Nikola B. Kovachki , Ricardo Baptista , Christopher Beckham , Kamyar Azizzadenesheli , Jean Kossaifi , Vikram Voleti , Jiaming Song , Karsten Kreis , Jan Kautz , Christopher Pal , Arash Vahdat , Anima Anandkumar

Personalized Preference Fine-tuning of Diffusion Models

RLHF techniques like DPO can significantly improve the generation quality of text-to-image diffusion models. However, these methods optimize for a single reward that aligns model generation with population-level preferences, neglecting the…

Machine Learning · Computer Science 2025-01-14 Meihua Dang , Anikait Singh , Linqi Zhou , Stefano Ermon , Jiaming Song

Decentralized Diffusion Models

Large-scale AI model training divides work across thousands of GPUs, then synchronizes gradients across them at each step. This incurs a significant network burden that only centralized, monolithic clusters can support, driving up…

Computer Vision and Pattern Recognition · Computer Science 2025-01-13 David McAllister , Matthew Tancik , Jiaming Song , Angjoo Kanazawa

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image utilizes cascaded pixel-space diffusion models trained using a novel Laplacian diffusion…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 NVIDIA: Yuval Atzmon , Maciej Bala , Yogesh Balaji , Tiffany Cai , Yin Cui , Jiaojiao Fan , Yunhao Ge , Siddharth Gururani , Jacob Huffman , Ronald Isaac , Pooya Jannaty , Tero Karras , Grace Lam , J. P. Lewis , Aaron Licata , Yen-Chen Lin , Ming-Yu Liu , Qianli Ma , Arun Mallya , Ashlee Martino-Tarr , Doug Mendez , Seungjun Nah , Chris Pruett , Fitsum Reda , Jiaming Song , Ting-Chun Wang , Fangyin Wei , Xiaohui Zeng , Yu Zeng , Qinsheng Zhang

DiffiT: Diffusion Vision Transformers for Image Generation

Diffusion models with their powerful expressivity and high sample quality have achieved State-Of-The-Art (SOTA) performance in the generative domain. The pioneering Vision Transformer (ViT) has also demonstrated strong modeling capabilities…

Computer Vision and Pattern Recognition · Computer Science 2024-08-30 Ali Hatamizadeh , Jiaming Song , Guilin Liu , Jan Kautz , Arash Vahdat

AGG: Amortized Generative 3D Gaussians for Single Image to 3D

Given the growing need for automatic 3D content creation pipelines, various 3D representations have been studied to generate 3D objects from a single image. Due to its superior rendering efficiency, 3D Gaussian splatting-based models have…

Computer Vision and Pattern Recognition · Computer Science 2024-01-09 Dejia Xu , Ye Yuan , Morteza Mardani , Sifei Liu , Jiaming Song , Zhangyang Wang , Arash Vahdat

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning

Learning Nash equilibrium (NE) in complex zero-sum games with multi-agent reinforcement learning (MARL) can be extremely computationally expensive. Curriculum learning is an effective way to accelerate learning, but an under-explored…

Machine Learning · Computer Science 2023-12-19 Jiayu Chen , Zelai Xu , Yunfei Li , Chao Yu , Jiaming Song , Huazhong Yang , Fei Fang , Yu Wang , Yi Wu

Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

We introduce a curriculum learning algorithm, Variational Automatic Curriculum Learning (VACL), for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems. We motivate our paradigm through a variational…

Machine Learning · Computer Science 2023-12-12 Jiayu Chen , Yuanxin Zhang , Yuanfan Xu , Huimin Ma , Huazhong Yang , Jiaming Song , Yu Wang , Yi Wu

SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models

Diffusion models have recently gained popularity for accelerated MRI reconstruction due to their high sample quality. They can effectively serve as rich data priors while incorporating the forward model flexibly at inference time, and they…

Image and Video Processing · Electrical Eng. & Systems 2023-10-20 Batu Ozturkler , Chao Liu , Benjamin Eckart , Morteza Mardani , Jiaming Song , Jan Kautz

SSIF: Learning Continuous Image Representation for Spatial-Spectral Super-Resolution

Existing digital sensors capture images at fixed spatial and spectral resolutions (e.g., RGB, multispectral, and hyperspectral images), and each combination requires bespoke machine learning models. Neural Implicit Functions partially…

Computer Vision and Pattern Recognition · Computer Science 2023-10-03 Gengchen Mai , Ni Lao , Weiwei Sun , Yuchi Ma , Jiaming Song , Chenlin Meng , Hongxu Ma , Jinmeng Rao , Ziyuan Li , Stefano Ermon

A Variational Perspective on Solving Inverse Problems with Diffusion Models

Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to universally solve different downstream inverse tasks via a single diffusion prior without re-training for each…

Machine Learning · Computer Science 2023-10-03 Morteza Mardani , Jiaming Song , Jan Kautz , Arash Vahdat

PhysDiff: Physics-Guided Human Motion Diffusion Model

Denoising diffusion models hold great promise for generating diverse and realistic human motions. However, existing motion diffusion models largely disregard the laws of physics in the diffusion process and often generate…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Ye Yuan , Jiaming Song , Umar Iqbal , Arash Vahdat , Jan Kautz

Improved Order Analysis and Design of Exponential Integrator for Diffusion Models Sampling

Efficient differential equation solvers have significantly reduced the sampling time of diffusion models (DMs) while retaining high sampling quality. Among these solvers, exponential integrators (EI) have gained prominence by demonstrating…

Machine Learning · Computer Science 2023-08-07 Qinsheng Zhang , Jiaming Song , Yongxin Chen