Tianshi Cao — Scifaro

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

World models for interactive video generation have largely focused on single-agent settings, where future observations are generated from a single control signal. However, many generated environments require multi-agent interaction:…

Computer Vision and Pattern Recognition · Computer Science 2026-05-28 Fangfu Liu , Kai He , Tianchang Shen , Tianshi Cao , Sanja Fidler , Yueqi Duan , Jun Gao , Igor Gilitschenski , Zian Wang , Xuanchi Ren

Asset Harvester: Extracting 3D Assets from Autonomous Driving Logs for Simulation

Closed-loop simulation is a core component of autonomous vehicle (AV) development, enabling scalable testing, training, and safety validation before real-world deployment. Neural scene reconstruction converts driving logs into interactive…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Tianshi Cao , Jiawei Ren , Yuxuan Zhang , Jaewoo Seo , Jiahui Huang , Shikhar Solanki , Haotian Zhang , Mingfei Guo , Haithem Turki , Muxingzi Li , Yue Zhu , Sipeng Zhang , Zan Gojcic , Sanja Fidler , Kangxue Yin

Lyra 2.0: Explorable Generative 3D Worlds

Recent advances in video generation enable a new paradigm for 3D scene creation: generating camera-controlled videos that simulate scene walkthroughs, then lifting them to 3D via feed-forward reconstruction techniques. This generative…

Computer Vision and Pattern Recognition · Computer Science 2026-04-15 Tianchang Shen , Sherwin Bahmani , Kai He , Sangeetha Grama Srinivasan , Tianshi Cao , Jiawei Ren , Ruilong Li , Zian Wang , Nicholas Sharp , Zan Gojcic , Sanja Fidler , Jiahui Huang , Huan Ling , Jun Gao , Xuanchi Ren

World Simulation with Video Foundation Models for Physical AI

We introduce Cosmos-Predict2.5, the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, Cosmos-Predict2.5 unifies Text2World, Image2World, and Video2World generation in a single…

Computer Vision and Pattern Recognition · Computer Science 2026-02-26 NVIDIA , Arslan Ali , Junjie Bai , Maciej Bala , Yogesh Balaji , Aaron Blakeman , Tiffany Cai , Jiaxin Cao , Tianshi Cao , Elizabeth Cha , Yu-Wei Chao , Prithvijit Chattopadhyay , Mike Chen , Yongxin Chen , Yu Chen , Shuai Cheng , Yin Cui , Jenna Diamond , Yifan Ding , Jiaojiao Fan , Linxi Fan , Liang Feng , Francesco Ferroni , Sanja Fidler , Xiao Fu , Ruiyuan Gao , Yunhao Ge , Jinwei Gu , Aryaman Gupta , Siddharth Gururani , Imad El Hanafi , Ali Hassani , Zekun Hao , Jacob Huffman , Joel Jang , Pooya Jannaty , Jan Kautz , Grace Lam , Xuan Li , Zhaoshuo Li , Maosheng Liao , Chen-Hsuan Lin , Tsung-Yi Lin , Yen-Chen Lin , Huan Ling , Ming-Yu Liu , Xian Liu , Yifan Lu , Alice Luo , Qianli Ma , Hanzi Mao , Kaichun Mo , Seungjun Nah , Yashraj Narang , Abhijeet Panaskar , Lindsey Pavao , Trung Pham , Morteza Ramezanali , Fitsum Reda , Scott Reed , Xuanchi Ren , Haonan Shao , Yue Shen , Stella Shi , Shuran Song , Bartosz Stefaniak , Shangkun Sun , Shitao Tang , Sameena Tasmeen , Lyne Tchapmi , Wei-Cheng Tseng , Jibin Varghese , Andrew Z. Wang , Hao Wang , Haoxiang Wang , Heng Wang , Ting-Chun Wang , Fangyin Wei , Jiashu Xu , Dinghao Yang , Xiaodong Yang , Haotian Ye , Seonghyeon Ye , Xiaohui Zeng , Jing Zhang , Qinsheng Zhang , Kaiwen Zheng , Andrew Zhu , Yuke Zhu

ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation

Recent advances in large generative models have greatly enhanced both image editing and in-context image generation, yet a critical gap remains in ensuring physical consistency, where edited objects must remain coherent. This capability is…

Computer Vision and Pattern Recognition · Computer Science 2025-10-20 Jay Zhangjie Wu , Xuanchi Ren , Tianchang Shen , Tianshi Cao , Kai He , Yifan Lu , Ruiyuan Gao , Enze Xie , Shiyi Lan , Jose M. Alvarez , Jun Gao , Sanja Fidler , Zian Wang , Huan Ling

Masks make discriminative models great again!

We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By…

Computer Vision and Pattern Recognition · Computer Science 2025-07-02 Tianshi Cao , Marie-Julie Rakotosaona , Ben Poole , Federico Tombari , Michael Niemeyer

Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models

Collecting and annotating real-world data for safety-critical physical AI systems, such as autonomous vehicles (AVs), is time-consuming and costly. It is especially challenging to capture rare edge cases, which play a critical role in…

Computer Vision and Pattern Recognition · Computer Science 2025-06-19 Xuanchi Ren , Yifan Lu , Tianshi Cao , Ruiyuan Gao , Shengyu Huang , Amirmojtaba Sabour , Tianchang Shen , Tobias Pfaff , Jay Zhangjie Wu , Runjian Chen , Seung Wook Kim , Jun Gao , Laura Leal-Taixe , Mike Chen , Sanja Fidler , Huan Ling

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

We introduce Cosmos-Transfer, a conditional world generation model that can generate world simulations based on multiple spatial control inputs of various modalities such as segmentation, depth, and edge. In the design, the spatial…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 NVIDIA , Hassan Abu Alhaija , Jose Alvarez , Maciej Bala , Tiffany Cai , Tianshi Cao , Liz Cha , Joshua Chen , Mike Chen , Francesco Ferroni , Sanja Fidler , Dieter Fox , Yunhao Ge , Jinwei Gu , Ali Hassani , Michael Isaev , Pooya Jannaty , Shiyi Lan , Tobias Lasser , Huan Ling , Ming-Yu Liu , Xian Liu , Yifan Lu , Alice Luo , Qianli Ma , Hanzi Mao , Fabio Ramos , Xuanchi Ren , Tianchang Shen , Xinglong Sun , Shitao Tang , Ting-Chun Wang , Jay Wu , Jiashu Xu , Stella Xu , Kevin Xie , Yuchong Ye , Xiaodong Yang , Xiaohui Zeng , Yu Zeng

Egocentric Video: A New Tool for Capturing Hand Use of Individuals with Spinal Cord Injury at Home

Current upper extremity outcome measures for persons with cervical spinal cord injury (cSCI) lack the ability to directly collect quantitative information in home and community environments. A wearable first-person (egocentric) camera…

Human-Computer Interaction · Computer Science 2024-05-07 Jirapat Likitlersuang , Elizabeth R. Sumitro , Tianshi Cao , Ryan J. Visee , Sukhvinder Kalsi-Ryan , Jose Zariffa

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis

Recent text-to-3D generation approaches produce impressive 3D results but require time-consuming optimization that can take up to an hour per prompt. Amortized methods like ATT3D optimize multiple prompts simultaneously to improve…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Kevin Xie , Jonathan Lorraine , Tianshi Cao , Jun Gao , James Lucas , Antonio Torralba , Sanja Fidler , Xiaohui Zeng

Differentially Private Diffusion Models

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge,…

Machine Learning · Statistics 2024-01-02 Tim Dockhorn , Tianshi Cao , Arash Vahdat , Karsten Kreis

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

We present TexFusion (Texture Diffusion), a new method to synthesize textures for given 3D geometries, using large-scale text-guided image diffusion models. In contrast to recent works that leverage 2D text-to-image diffusion models to…

Computer Vision and Pattern Recognition · Computer Science 2023-10-24 Tianshi Cao , Karsten Kreis , Sanja Fidler , Nicholas Sharp , Kangxue Yin

Zero-Shot Compositional Policy Learning via Language Grounding

Despite recent breakthroughs in reinforcement learning (RL) and imitation learning (IL), existing algorithms fail to generalize beyond the training environments. In reality, humans can adapt to new tasks quickly by leveraging prior…

Machine Learning · Computer Science 2023-04-18 Tianshi Cao , Jingkang Wang , Yining Zhang , Sivabalan Manivasagam

Scalable Neural Data Server: A Data Recommender for Transfer Learning

Absence of large-scale labeled data in the practitioner's target domain can be a bottleneck to applying machine learning algorithms in practice. Transfer learning is a popular strategy for leveraging additional data to improve the…

Machine Learning · Computer Science 2022-06-22 Tianshi Cao , Sasha Doubov , David Acuna , Sanja Fidler

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

Although machine learning models trained on massive data have led to breakthroughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy…

Machine Learning · Computer Science 2022-06-22 Tianshi Cao , Alex Bie , Arash Vahdat , Sanja Fidler , Karsten Kreis

A Theoretical Analysis of the Number of Shots in Few-Shot Learning

Few-shot classification is the task of predicting the category of an example from a set of few labeled examples. The number of labeled examples per category is called the number of shots (or shot number). Recent works tackle this task…

Machine Learning · Computer Science 2022-06-22 Tianshi Cao , Marc Law , Sanja Fidler

A Benchmark of Medical Out of Distribution Detection

Motivation: Deep learning models deployed for use on medical tasks can be equipped with Out-of-Distribution Detection (OoDD) methods in order to avoid erroneous predictions. However, it is unclear which OoDD method should be used in…

Machine Learning · Computer Science 2020-08-06 Tianshi Cao , Chin-Wei Huang , David Yu-Tung Hui , Joseph Paul Cohen