Related papers: Video Occupancy Models

OPUS: Occupancy Prediction Using a Sparse Set

Occupancy prediction, aiming at predicting the occupancy status within a voxelized 3D environment, is quickly gaining momentum within the autonomous driving community. Mainstream occupancy prediction works first discretize the 3D environment…

Computer Vision and Pattern Recognition · Computer Science 2024-11-01 Jiabao Wang, Zhaojiang Liu, Qiang Meng, Liujiang Yan, Ke Wang, Jie Yang, Wei Liu, Qibin Hou, Ming-Ming Cheng

Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models

Object-centric (OC) representations, which model visual scenes as compositions of discrete objects, have the potential to be used in various downstream tasks to achieve systematic compositional generalization and facilitate reasoning.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Amir Mohammad Karimi Mamaghan, Samuele Papa, Karl Henrik Johansson, Stefan Bauer, Andrea Dittadi

Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction

Accurate perception of the dynamic environment is a fundamental task for autonomous driving and robot systems. This paper introduces Let Occ Flow, the first self-supervised work for joint 3D occupancy and occupancy flow prediction using…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Yili Liu, Linzhan Mou, Xuan Yu, Chenrui Han, Sitong Mao, Rong Xiong, Yue Wang

Video Prediction Models as General Visual Encoders

This study explores the potential of open-source video conditional generation models as encoders for downstream tasks, focusing on instance segmentation using the BAIR Robot Pushing Dataset. The researchers propose using video prediction…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 James Maier, Nishanth Mohankumar

SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model

This paper introduces a novel architecture for trajectory-conditioned forecasting of future 3D scene occupancy. In contrast to methods that rely on variational autoencoders (VAEs) to generate discrete occupancy tokens, which inherently…

Computer Vision and Pattern Recognition · Computer Science 2026-04-15 Jiayuan Du, Yiming Zhao, Zhenglong Guo, Yong Pan, Wenbo Hou, Zhihui Hao, Kun Zhan, Qijun Chen

Scene Matters: Model-based Deep Video Compression

Video compression has always been a popular research area, where many traditional and deep video compression methods have been proposed. These methods typically rely on signal prediction theory to enhance compression performance by…

Computer Vision and Pattern Recognition · Computer Science 2023-08-31 Lv Tang, Xinfeng Zhang, Gai Zhang, Xiaoqi Ma

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

We describe an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic

GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow

Occupancy estimation has become a prominent task in 3D computer vision, particularly within the autonomous driving community. In this paper, we present a novel approach to occupancy estimation, termed GaussianFlowOcc, which is inspired by…

Computer Vision and Pattern Recognition · Computer Science 2025-08-26 Simon Boeder, Fabian Gigengack, Benjamin Risse

VEOcc: Voxel-Centric Online Semantic Occupancy Prediction For Embodied Scene Understanding

Crucial for autonomous exploration, online 3D occupancy prediction and mapping incrementally constructs dense spatial representations on the fly. However, recent Gaussian-centric methods struggle with structural boundary fidelity and rely…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Ruoyu Wang, Yong Liu, Sheng Tao, Yuhang Lin, Yukai Ma

Improved Conditional VRNNs for Video Prediction

Predicting future frames for a video sequence is a challenging generative modeling task. Promising approaches include probabilistic latent variable models such as the Variational Auto-Encoder. While VAEs can handle uncertainty and model…

Computer Vision and Pattern Recognition · Computer Science 2019-04-30 Lluis Castrejon, Nicolas Ballas, Aaron Courville

Using Statistical Models to Detect Occupancy in Buildings through Monitoring VOC, CO$_2$, and other Environmental Factors

Dynamic models of occupancy patterns have been shown to be effective in optimizing building-systems operations. Previous research has relied on CO$_2$ sensors and vision-based techniques to determine occupancy patterns. Vision-based techniques…

Machine Learning · Computer Science 2022-03-10 Mahsa Pahlavikhah Varnosfaderani, Arsalan Heydarian, Farrokh Jazizadeh

Test-Time 3D Occupancy Prediction

Self-supervised 3D occupancy prediction offers a promising solution for understanding complex driving scenes without requiring costly 3D annotations. However, training dense occupancy decoders to capture fine-grained geometry and semantics…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Fengyi Zhang, Xiangyu Sun, Huitong Yang, Zheng Zhang, Zi Huang, Yadan Luo

Model Predictive HVAC Control with Online Occupancy Model

This paper presents an occupancy-predicting control algorithm for heating, ventilation, and air conditioning (HVAC) systems in buildings. It incorporates the building's thermal properties, local weather predictions, and a self-tuning…

Systems and Control · Computer Science 2014-07-29 Justin R. Dobbs, Brandon M. Hencey

TextOCVP: Object-Centric Video Prediction with Language Guidance

Understanding and forecasting future scene states is critical for autonomous agents to plan and act effectively in complex environments. Object-centric models, with structured latent spaces, have shown promise in modeling object dynamics…

Computer Vision and Pattern Recognition · Computer Science 2026-02-06 Angel Villar-Corrales, Gjergj Plepi, Sven Behnke

OVO: Open-Vocabulary Occupancy

Semantic occupancy prediction aims to infer dense geometry and semantics of surroundings for an autonomous agent to operate safely in the 3D environment. Existing occupancy prediction methods are almost entirely trained on human-annotated…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Zhiyu Tan, Zichao Dong, Cheng Zhang, Weikun Zhang, Hang Ji, Hao Li

Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving

Robotic perception requires the modeling of both 3D geometry and semantics. Existing methods typically focus on estimating 3D bounding boxes, neglecting finer geometric details and struggling to handle general, out-of-vocabulary objects. 3D…

Computer Vision and Pattern Recognition · Computer Science 2023-12-14 Xiaoyu Tian, Tao Jiang, Longfei Yun, Yucheng Mao, Huitong Yang, Yue Wang, Yilun Wang, Hang Zhao

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

Understanding the evolution of 3D scenes is important for effective autonomous driving. While conventional methods model scene development with the motion of individual instances, world models emerge as a generative framework to describe the…

Computer Vision and Pattern Recognition · Computer Science 2024-05-31 Lening Wang, Wenzhao Zheng, Yilong Ren, Han Jiang, Zhiyong Cui, Haiyang Yu, Jiwen Lu

FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction

Existing learning-based occupancy prediction methods rely on large-scale 3D annotations and generalize poorly across environments. We present FreeOcc, a training-free framework for open-vocabulary occupancy prediction from monocular or…

Robotics · Computer Science 2026-05-01 Zeyu Jiang, Changqing Zhou, Xingxing Zuo, Changhao Chen

VectorFlow: Combining Images and Vectors for Traffic Occupancy and Flow Prediction

Predicting future behaviors of road agents is a key task in autonomous driving. While existing models have demonstrated great success in predicting marginal agent future behaviors, it remains a challenge to efficiently predict consistent…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Xin Huang, Xiaoyu Tian, Junru Gu, Qiao Sun, Hang Zhao

OccTENS: 3D Occupancy World Model via Temporal Next-Scale Prediction

In this paper, we propose OccTENS, a generative occupancy world model that enables controllable, high-fidelity long-term occupancy generation while maintaining computational efficiency. Different from visual generation, the occupancy world…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Bu Jin, Songen Gu, Xiaotao Hu, Yupeng Zheng, Xiaoyang Guo, Qian Zhang, Xiaoxiao Long, Wei Yin