Related papers: Interact3D: Compositional 3D Generation of Interac…

InteractMove: Text-Controlled Human-Object Interaction Generation in 3D Scenes with Movable Objects

We propose a novel task of text-controlled human-object interaction generation in 3D scenes with movable objects. Existing human-scene interaction datasets suffer from insufficient interaction categories and typically only consider…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Xinhao Cai, Minghang Zheng, Xin Jin, Yang Liu

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Recent breakthroughs in text-guided image generation have significantly advanced the field of 3D generation. While generating a single high-quality 3D object is now feasible, generating multiple objects with reasonable interactions within a…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Chongjian Ge, Chenfeng Xu, Yuanfeng Ji, Chensheng Peng, Masayoshi Tomizuka, Ping Luo, Mingyu Ding, Varun Jampani, Wei Zhan

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

Traditional image-to-3D models often struggle with scenes containing multiple objects due to biases and occlusion complexities. To address this challenge, we present REPARO, a novel approach for compositional 3D asset generation from single…

Computer Vision and Pattern Recognition · Computer Science 2025-08-28 Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li

Interactive3D: Create What You Want by Interactive 3D Generation

3D object generation has undergone significant advancements, yielding high-quality results. However, existing methods fall short of achieving precise user control, often yielding results that do not align with user expectations, thus limiting their…

Graphics · Computer Science 2024-04-26 Shaocong Dong, Lihe Ding, Zhanpeng Huang, Zibin Wang, Tianfan Xue, Dan Xu

Comp4D: LLM-Guided Compositional 4D Scene Generation

Recent advancements in diffusion models for 2D and 3D content creation have sparked a surge of interest in generating 4D content. However, the scarcity of 3D scene datasets constrains current methodologies to primarily object-centric…

Computer Vision and Pattern Recognition · Computer Science 2024-03-26 Dejia Xu, Hanwen Liang, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Plataniotis, Zhangyang Wang

Compositional Video Synthesis by Temporal Object-Centric Learning

We present a novel framework for compositional video synthesis that leverages temporally consistent object-centric representations, extending our previous work, SlotAdapt, from images to video. While existing object-centric approaches…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Adil Kaan Akan, Yucel Yemez

Scene-Conditional 3D Object Stylization and Composition

Recently, 3D generative models have made impressive progress, enabling the generation of almost arbitrary 3D assets from text or image inputs. However, these approaches generate objects in isolation without any consideration for the scene…

Computer Vision and Pattern Recognition · Computer Science 2025-05-02 Jinghao Zhou, Tomas Jakab, Philip Torr, Christian Rupprecht

DecompDreamer: A Composition-Aware Curriculum for Structured 3D Asset Generation

Current text-to-3D methods excel at generating single objects but falter on compositional prompts. We argue this failure is fundamental to their optimization schedules, as simultaneous or iterative heuristics predictably collapse under a…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Utkarsh Nath, Rajeev Goel, Rahul Khurana, Kyle Min, Mark Ollila, Pavan Turaga, Varun Jampani, Tejaswi Gowda

gCoRF: Generative Compositional Radiance Fields

3D generative models of objects enable photorealistic image synthesis with 3D control. Existing methods model the scene as a global scene representation, ignoring the compositional aspect of the scene. Compositional reasoning can enable a…

Graphics · Computer Science 2022-11-01 Mallikarjun BR, Ayush Tewari, Xingang Pan, Mohamed Elgharib, Christian Theobalt

CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting

With the onset of diffusion-based generative models and their ability to generate text-conditioned images, content generation has received a massive invigoration. Recently, these models have been shown to provide useful guidance for the…

Computer Vision and Pattern Recognition · Computer Science 2023-11-30 Alexander Vilesov, Pradyumna Chari, Achuta Kadambi

PAct: Part-Decomposed Single-View Articulated Object Generation

Articulated objects are central to interactive 3D applications, including embodied AI, robotics, and VR/AR, where functional part decomposition and kinematic motion are essential. Yet producing high-fidelity articulated assets remains…

Computer Vision and Pattern Recognition · Computer Science 2026-02-17 Qingming Liu, Xinyue Yao, Shuyuan Zhang, Yueci Deng, Guiliang Liu, Zhen Liu, Kui Jia

NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects

Deep generative models have been recently extended to synthesizing 3D digital humans. However, previous approaches treat clothed humans as a single chunk of geometry without considering the compositionality of clothing and accessories. As a…

Computer Vision and Pattern Recognition · Computer Science 2023-05-30 Taeksoo Kim, Shunsuke Saito, Hanbyul Joo

3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework

Recent advances in 3D scene generation produce visually appealing output, but current representations hinder artists' workflows that require modifiable 3D textured mesh scenes for visual effects and game development. Despite significant…

Computer Vision and Pattern Recognition · Computer Science 2025-12-22 Tobias Sautter, Jan-Niklas Dihlmann, Hendrik P. A. Lensch

ArtiLatent: Realistic Articulated 3D Object Generation via Structured Latents

We propose ArtiLatent, a generative framework that synthesizes human-made 3D objects with fine-grained geometry, accurate articulation, and realistic appearance. Our approach jointly models part geometry and articulation dynamics by…

Computer Vision and Pattern Recognition · Computer Science 2025-10-27 Honghua Chen, Yushi Lan, Yongwei Chen, Xingang Pan

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout

Text-to-3D form plays a crucial role in creating editable 3D scenes for AR/VR. Recent advances have shown promise in merging neural radiance fields (NeRFs) with pre-trained diffusion models for text-to-3D object generation. However, one…

Computer Vision and Pattern Recognition · Computer Science 2024-09-25 Haotian Bai, Yuanhuiyi Lyu, Lutao Jiang, Sijia Li, Haonan Lu, Xiaodong Lin, Lin Wang

Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on auto-encoding…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Whie Jung, Jaehoon Yoo, Sungjin Ahn, Seunghoon Hong

ObjectStitch: Generative Object Compositing

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results. Furthermore,…

Computer Vision and Pattern Recognition · Computer Science 2022-12-06 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance

Generating high-quality 3D assets from a given image is highly desirable in various applications such as AR/VR. Recent advances in single-image 3D generation explore feed-forward models that learn to infer the 3D model of an object without…

Computer Vision and Pattern Recognition · Computer Science 2024-03-20 Yongwei Chen, Tengfei Wang, Tong Wu, Xingang Pan, Kui Jia, Ziwei Liu

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Deep generative models allow for photorealistic image synthesis at high resolutions. But for many applications, this is not enough: content creation also needs to be controllable. While several recent works investigate how to disentangle…

Computer Vision and Pattern Recognition · Computer Science 2021-04-30 Michael Niemeyer, Andreas Geiger

Drag4D: Align Your Motion with Text-Driven 3D Scene Generation

We introduce Drag4D, an interactive framework that integrates object motion control within text-driven 3D scene generation. This framework enables users to define 3D trajectories for the 3D objects generated from a single image, seamlessly…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Minjun Kang, Inkyu Shin, Taeyeop Lee, In So Kweon, Kuk-Jin Yoon