Related papers: Language Driven Occupancy Prediction

OVO: Open-Vocabulary Occupancy

Semantic occupancy prediction aims to infer dense geometry and semantics of surroundings for an autonomous agent to operate safely in the 3D environment. Existing occupancy prediction methods are almost entirely trained on human-annotated…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Zhiyu Tan, Zichao Dong, Cheng Zhang, Weikun Zhang, Hang Ji, Hao Li

FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction

Existing learning-based occupancy prediction methods rely on large-scale 3D annotations and generalize poorly across environments. We present FreeOcc, a training-free framework for open-vocabulary occupancy prediction from monocular or…

Robotics · Computer Science 2026-05-01 Zeyu Jiang, Changqing Zhou, Xingxing Zuo, Changhao Chen

LangOcc: Self-Supervised Open Vocabulary Occupancy Estimation via Volume Rendering

The 3D occupancy estimation task has become an important challenge in the area of vision-based autonomous driving recently. However, most existing camera-based methods rely on costly 3D voxel labels or LiDAR scans for training, limiting…

Computer Vision and Pattern Recognition · Computer Science 2024-07-26 Simon Boeder, Fabian Gigengack, Benjamin Risse

Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving

Robotic perception requires the modeling of both 3D geometry and semantics. Existing methods typically focus on estimating 3D bounding boxes, neglecting finer geometric details and struggling to handle general, out-of-vocabulary objects. 3D…

Computer Vision and Pattern Recognition · Computer Science 2023-12-14 Xiaoyu Tian, Tao Jiang, Longfei Yun, Yucheng Mao, Huitong Yang, Yue Wang, Yilun Wang, Hang Zhao

LOC: A General Language-Guided Framework for Open-Set 3D Occupancy Prediction

Vision-Language Models (VLMs) have shown significant progress in open-set challenges. However, the limited availability of 3D datasets hinders their effective application in 3D scene understanding. We propose LOC, a general language-guided…

Computer Vision and Pattern Recognition · Computer Science 2025-10-28 Yuhang Gao, Xiang Xiang, Sheng Zhong, Guoyou Wang

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

We describe an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic

AGO: Adaptive Grounding for Open World 3D Occupancy Prediction

Open-world 3D semantic occupancy prediction aims to generate a voxelized 3D representation from sensor inputs while recognizing both known and unknown objects. Transferring open-vocabulary knowledge from vision-language models (VLMs) offers…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Peizheng Li, Shuxiao Ding, You Zhou, Qingwen Zhang, Onat Inak, Larissa Triess, Niklas Hanselmann, Marius Cordts, Andreas Zell

AutoOcc: Automatic Open-Ended Semantic Occupancy Annotation via Vision-Language Guided Gaussian Splatting

Obtaining high-quality 3D semantic occupancy from raw sensor data remains an essential yet challenging task, often requiring extensive manual labeling. In this work, we propose AutoOcc, a vision-centric automated pipeline for open-ended…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Xiaoyu Zhou, Jingqi Wang, Yongtao Wang, Yufei Wei, Nan Dong, Ming-Hsuan Yang

FreeOcc: Training-free Panoptic Occupancy Prediction via Foundation Models

Semantic and panoptic occupancy prediction for road scene analysis provides a dense 3D representation of the ego vehicle's surroundings. Current camera-only approaches typically rely on costly dense 3D supervision or require training models…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Andrew Caunes, Thierry Chateau, Vincent Fremont

Can we Trust Unreliable Voxels? Exploring 3D Semantic Occupancy Prediction under Label Noise

3D semantic occupancy prediction is a cornerstone of robotic perception, yet real-world voxel annotations are inherently corrupted by structural artifacts and dynamic trailing effects. This raises a critical but underexplored question: can…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Wenxin Li, Kunyu Peng, Di Wen, Junwei Zheng, Jiale Wei, Mengfei Duan, Yuheng Zhang, Rui Fan, Kailun Yang

OccLE: Label-Efficient 3D Semantic Occupancy Prediction

3D semantic occupancy prediction offers an intuitive and efficient scene understanding and has attracted significant interest in autonomous driving perception. Existing approaches either rely on full supervision, which demands costly…

Computer Vision and Pattern Recognition · Computer Science 2026-01-23 Naiyu Fang, Zheyuan Zhou, Fayao Liu, Xulei Yang, Jiacheng Wei, Lemiao Qiu, Hongsheng Li, Guosheng Lin

WildOcc: A Benchmark for Off-Road 3D Semantic Occupancy Prediction

3D semantic occupancy prediction is an essential part of autonomous driving, focusing on capturing the geometric details of scenes. Off-road environments are rich in geometric information and are therefore well suited to 3D semantic occupancy…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Heng Zhai, Jilin Mei, Chen Min, Liang Chen, Fangzhou Zhao, Yu Hu

ForecastOcc: Vision-based Semantic Occupancy Forecasting

Autonomous driving requires forecasting both geometry and semantics over time to effectively reason about future environment states. Existing vision-based occupancy forecasting methods focus on motion-related categories such as static and…

Computer Vision and Pattern Recognition · Computer Science 2026-02-10 Riya Mohan, Juana Valeria Hurtado, Rohit Mohan, Abhinav Valada

Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes

Open-vocabulary 3D occupancy is vital for embodied agents, which need to understand complex indoor environments where semantic categories are abundant and evolve beyond fixed taxonomies. While recent work has explored open-vocabulary…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Changqing Zhou, Yueru Luo, Han Zhang, Zeyu Jiang, Changhao Chen

A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding

Visual grounding aims to identify objects or regions in a scene based on natural language descriptions, essential for spatially aware perception in autonomous driving. However, existing visual grounding tasks typically depend on bounding…

Computer Vision and Pattern Recognition · Computer Science 2025-09-04 Zhan Shi, Song Wang, Junbo Chen, Jianke Zhu

MinkOcc: Towards real-time label-efficient semantic occupancy prediction

Developing 3D semantic occupancy prediction models often relies on dense 3D annotations for supervised learning, a process that is both labor and resource-intensive, underscoring the need for label-efficient or even label-free approaches.…

Computer Vision and Pattern Recognition · Computer Science 2025-04-04 Samuel Sze, Daniele De Martini, Lars Kunze

DSOcc: Leveraging Depth Awareness and Semantic Aid to Boost Camera-Based 3D Semantic Occupancy Prediction

Camera-based 3D semantic occupancy prediction offers an efficient and cost-effective solution for perceiving surrounding scenes in autonomous driving. However, existing works rely on explicit occupancy state inference, leading to numerous…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Naiyu Fang, Zheyuan Zhou, Kang Wang, Ruibo Li, Lemiao Qiu, Shuyou Zhang, Zhe Wang, Guosheng Lin

VEOcc: Voxel-Centric Online Semantic Occupancy Prediction For Embodied Scene Understanding

Crucial for autonomous exploration, online 3D occupancy prediction and mapping incrementally constructs dense spatial representations on the fly. However, recent Gaussian-centric methods struggle with structural boundary fidelity and rely…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Ruoyu Wang, Yong Liu, Sheng Tao, Yuhang Lin, Yukai Ma

Test-Time 3D Occupancy Prediction

Self-supervised 3D occupancy prediction offers a promising solution for understanding complex driving scenes without requiring costly 3D annotations. However, training dense occupancy decoders to capture fine-grained geometry and semantics…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Fengyi Zhang, Xiangyu Sun, Huitong Yang, Zheng Zhang, Zi Huang, Yadan Luo

SUG-Occ: Explicit Semantics and Uncertainty Guided Sparse Learning for Efficient 3D Occupancy Prediction

3D semantic occupancy prediction has emerged as a critical perception task for autonomous driving due to its ability to offer voxel-level semantic and geometric understanding of the environment. However, such a refined representation for…

Computer Vision and Pattern Recognition · Computer Science 2026-03-31 Hanlin Wu, Pengfei Lin, Ehsan Javanmardi, Naren Bao, Bo Qian, Hao Si, Manabu Tsukada