Related papers: SCOPE: Structured Decomposition and Conditional Sk…

GenCape: Structure-Inductive Generative Modeling for Category-Agnostic Pose Estimation

Category-agnostic pose estimation (CAPE) aims to localize keypoints on query images from arbitrary categories, using only a few annotated support examples for guidance. Recent approaches either treat keypoints as isolated entities or rely…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Jiyong Rao, Yu Wang, Shengjie Zhao

Semantic Palette: Guiding Scene Generation with Class Proportions

Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem. Previous works break down scene generation into two consecutive…

Computer Vision and Pattern Recognition · Computer Science 2021-06-04 Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord

SCOPE: Semantic Conditioning for Sim2Real Category-Level Object Pose Estimation in Robotics

Object manipulation requires accurate object pose estimation. In open environments, robots encounter unknown objects, which requires semantic understanding in order to generalize both to known categories and beyond. To resolve this…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Peter Hönig, Stefan Thalhammer, Jean-Baptiste Weibel, Matthias Hirschmanner, Markus Vincze

Semantic Bottleneck Scene Generation

Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex…

Machine Learning · Computer Science 2019-11-27 Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic

SceneComposer: Any-Level Semantic Image Synthesis

We propose a new framework for conditional image synthesis from semantic layouts of any precision level, ranging from pure text to a 2D semantic canvas with precise shapes. More specifically, the input layout consists of one or more…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

A comprehensive operational semantics of the SCOOP programming model

Operational semantics has established itself as a flexible but rigorous means to describe the meaning of programming languages. Oftentimes, it is felt necessary to keep a semantics small, for example to facilitate its use for model checking…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-17 Benjamin Morandi, Sebastian Nanz, Bertrand Meyer

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion. In this paper, a new paradigm for…

Computer Vision and Pattern Recognition · Computer Science 2020-08-19 Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Subject-driven image generation has advanced from single- to multi-subject composition, while neglecting distinction, the ability to distinguish and generate the correct subject when inputs contain multiple candidates. This limitation…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Yuran Wang, Bohan Zeng, Chengzhuo Tong, Wenxuan Liu, Yang Shi, Xiaochen Ma, Hao Liang, Yuanxing Zhang, Wentao Zhang

V-CAGE: Context-Aware Generation and Verification for Scalable Long-Horizon Embodied Tasks

Learning long-horizon embodied behaviors from synthetic data remains challenging because generated scenes are often physically implausible, language-driven programs frequently "succeed" without satisfying task semantics, and high-level…

Robotics · Computer Science 2026-01-22 Yaru Liu, Ao-bo Wang, Nanyang Ye

A Framework for Low-Effort Training Data Generation for Urban Semantic Segmentation

Synthetic datasets are widely used for training urban scene recognition models, but even highly realistic renderings show a noticeable gap to real imagery. This gap is particularly pronounced when adapting to a specific target domain, such…

Computer Vision and Pattern Recognition · Computer Science 2025-10-14 Denis Zavadski, Damjan Kalšan, Tim Küchler, Haebom Lee, Stefan Roth, Carsten Rother

CAPE: Capability Achievement via Policy Execution

Modern AI systems lack a way to express and enforce requirements. Pre-training produces intelligence, and post-training optimizes preferences, but neither guarantees that models reliably satisfy explicit, context-dependent constraints. This…

Software Engineering · Computer Science 2025-12-18 David Ball

Semantic Scene Completion via Integrating Instances and Scene in-the-Loop

Semantic Scene Completion aims at reconstructing a complete 3D scene with precise voxel-wise semantics from a single-view depth or RGBD image. It is a crucial but challenging problem for indoor scene understanding. In this work, we present…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Yingjie Cai, Xuesong Chen, Chao Zhang, Kwan-Yee Lin, Xiaogang Wang, Hongsheng Li

Perceptual Taxonomy: Evaluating and Guiding Hierarchical Scene Reasoning in Vision-Language Models

We propose Perceptual Taxonomy, a structured process of scene understanding that first recognizes objects and their spatial configurations, then infers task-relevant properties such as material, affordance, function, and physical attributes…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Jonathan Lee, Xingrui Wang, Jiawei Peng, Luoxin Ye, Zehan Zheng, Tiezheng Zhang, Tao Wang, Wufei Ma, Siyi Chen, Yu-Cheng Chou, Prakhar Kaushik, Alan Yuille

Expand Your SCOPE: Semantic Cognition over Potential-Based Exploration for Embodied Visual Navigation

Embodied visual navigation remains a challenging task, as agents must explore unknown environments with limited knowledge. Existing zero-shot studies have shown that incorporating memory mechanisms to support goal-directed behavior can…

Robotics · Computer Science 2026-03-24 Ningnan Wang, Weihuang Chen, Liming Chen, Haoxuan Ji, Zhongyu Guo, Xuchong Zhang, Hongbin Sun

Setting the Stage: Text-Driven Scene-Consistent Image Generation

We focus on the foundational task of Scene Staging: given a reference scene image and a text condition specifying an actor category to be generated in the scene and its spatial relation to the scene, the goal is to synthesize an output…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Cong Xie, Che Wang, Yan Zhang, Ruiqi Yu, Han Zou, Zheng Pan, Zhenpeng Zhan

SPHERE: Semantic-PHysical Engaged REpresentation for 3D Semantic Scene Completion

Camera-based 3D Semantic Scene Completion (SSC) is a critical task in autonomous driving systems, assessing voxel-level geometry and semantics for holistic scene perception. While existing voxel-based and plane-based SSC methods have…

Computer Vision and Pattern Recognition · Computer Science 2025-11-12 Zhiwen Yang, Yuxin Peng

Towards 3D Object-Centric Feature Learning for Semantic Scene Completion

Vision-based 3D Semantic Scene Completion (SSC) has received growing attention due to its potential in autonomous driving. While most existing approaches follow an ego-centric paradigm by aggregating and diffusing features over the entire…

Computer Vision and Pattern Recognition · Computer Science 2025-12-23 Weihua Wang, Yubo Cui, Xiangru Lin, Zhiheng Li, Zheng Fang

Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators

Structure-guided image completion aims to inpaint a local region of an image according to an input guidance map from users. While such a task enables many practical applications for interactive editing, existing methods often struggle to…

Computer Vision and Pattern Recognition · Computer Science 2024-04-25 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Qing Liu, Yuqian Zhou, Sohrab Amirghodsi, Jiebo Luo

Controlling Style and Semantics in Weakly-Supervised Image Generation

We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Dario Pavllo, Aurelien Lucchi, Thomas Hofmann

SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks

Successfully solving long-horizon manipulation tasks remains a fundamental challenge. These tasks involve extended action sequences and complex object interactions, presenting a critical gap between high-level symbolic planning and…

Robotics · Computer Science 2025-09-29 Jialiang Li, Wenzheng Wu, Gaojing Zhang, Yifan Han, Wenzhao Lian