Related papers: StackGen: Generating Stable Structures from Silhou…

Generating Stable Placements via Physics-guided Diffusion Models

Stably placing an object in a multi-object scene is a fundamental challenge in robotic manipulation, as placements must be penetration-free, establish precise surface contact, and result in a force equilibrium. To assess stability, existing…

Robotics · Computer Science 2025-09-29 Philippe Nadeau, Miguel Rogel, Ivan Bilić, Ivan Petrović, Jonathan Kelly

6-DoF Stability Field via Diffusion Models

A core capability for robot manipulation is reasoning over where and how to stably place objects in cluttered environments. Traditionally, robots have relied on object-specific, hand-crafted heuristics in order to perform such reasoning,…

Robotics · Computer Science 2023-10-27 Takuma Yoneda, Tianchong Jiang, Gregory Shakhnarovich, Matthew R. Walter

StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects

Robots operating in human environments must be able to rearrange objects into semantically-meaningful configurations, even if these objects are previously unseen. In this work, we focus on the problem of building physically-valid structures…

Robotics · Computer Science 2023-04-26 Weiyu Liu, Yilun Du, Tucker Hermans, Sonia Chernova, Chris Paxton

Robot Shape and Location Retention in Video Generation Using Diffusion Models

Diffusion models have marked a significant milestone in the enhancement of image and video generation technologies. However, generating videos that precisely retain the shape and location of moving objects such as robots remains a…

Robotics · Computer Science 2024-07-04 Peng Wang, Zhihao Guo, Abdul Latheef Sait, Minh Huy Pham

Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments

We present a novel study on enhancing the capability of preserving the content in world models, focusing on a property we term World Stability. Recent diffusion-based generative models have advanced the synthesis of immersive and realistic…

Machine Learning · Computer Science 2025-03-12 Soonwoo Kwon, Jin-Young Kim, Hyojun Go, Kyungjune Baek

Training-Free Location-Aware Text-to-Image Synthesis

Current large-scale generative models have impressive efficiency in generating high-quality images based on text prompts. However, they lack the ability to precisely control the size and position of objects in the generated image. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-04-27 Jiafeng Mao, Xueting Wang

ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in…

Computer Vision and Pattern Recognition · Computer Science 2018-07-09 Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi

StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation

Recent advances in large reconstruction and generative models have significantly improved scene reconstruction and novel view generation. However, due to compute limitations, each inference with these large models is confined to a small…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Shangjin Zhai, Zhichao Ye, Jialin Liu, Weijian Xie, Jiaqi Hu, Zhen Peng, Hua Xue, Danpeng Chen, Xiaomeng Wang, Lei Yang, Nan Wang, Haomin Liu, Guofeng Zhang

StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework

Thanks to the powerful generative capacity of diffusion models, recent years have witnessed rapid progress in human motion generation. Existing diffusion-based methods employ disparate network architectures and training strategies. The…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng

A Framework of Full-Process Generation Design for Park Green Spaces Based on Remote Sensing Segmentation-GAN-Diffusion

The development of generative design driven by artificial intelligence algorithms is speedy. There are two research gaps in the current research: 1) Most studies only focus on the relationship between design elements and pay little…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Ran Chen, Xingjian Yi, Jing Zhao, Yueheng He, Bainian Chen, Xueqi Yao, Fangjun Liu, Haoran Li, Zeke Lian

DiffuGen: Adaptable Approach for Generating Labeled Image Datasets using Stable Diffusion Models

Generating high-quality labeled image datasets is crucial for training accurate and robust machine learning models in the field of computer vision. However, the process of manually labeling real images is often time-consuming and costly. To…

Computer Vision and Pattern Recognition · Computer Science 2023-09-04 Michael Shenoda, Edward Kim

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu

STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation

In layout-to-image (L2I) synthesis, controlled complex scenes are generated from coarse information like bounding boxes. Such a task is exciting to many downstream applications because the input layouts offer strong guidance to the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Ruyu Wang, Xuefeng Hou, Sabrina Schmedding, Marco F. Huber

Learning Diffusion Policies from Demonstrations For Compliant Contact-rich Manipulation

Robots hold great promise for performing repetitive or hazardous tasks, but achieving human-like dexterity, especially in contact-rich and dynamic environments, remains challenging. Rigid robots, which rely on position or velocity control,…

Robotics · Computer Science 2024-10-28 Malek Aburub, Cristian C. Beltran-Hernandez, Tatsuya Kamijo, Masashi Hamaya

Generating Driving Scenes with Diffusion

In this paper we describe a learned method of traffic scene generation designed to simulate the output of the perception system of a self-driving car. In our "Scene Diffusion" system, inspired by latent diffusion, we use a novel combination…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Ethan Pronovost, Kai Wang, Nick Roy

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

In this paper, we present a novel paradigm to enhance the ability of object detector, e.g., expanding categories or improving detection performance, by training on synthetic dataset generated from diffusion models. Specifically, we…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Chengjian Feng, Yujie Zhong, Zequn Jie, Weidi Xie, Lin Ma

Diffusion Scattering Transforms on Graphs

Stability is a key aspect of data analysis. In many applications, the natural notion of stability is geometric, as illustrated for example in computer vision. Scattering transforms construct deep convolutional representations which are…

Machine Learning · Computer Science 2018-11-28 Fernando Gama, Alejandro Ribeiro, Joan Bruna

Interactive Visual Learning for Stable Diffusion

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex internal structures and operations often pose challenges for non-experts to grasp. We introduce…

Human-Computer Interaction · Computer Science 2024-04-26 Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Polo Chau

Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects

Perceiving the surrounding environment in terms of objects is useful for any general purpose intelligent agent. In this paper, we investigate a fundamental mechanism making object perception possible, namely the identification of…

Artificial Intelligence · Computer Science 2018-10-12 Nicolas Le Hir, Olivier Sigaud, Alban Laflaquière

SPREAD: Spatial-Physical REasoning via geometry Aware Diffusion

Automated 3D scene generation is pivotal for applications spanning virtual reality, digital content creation, and Embodied AI. While computer graphics prioritizes aesthetic layouts, vision and robotics demand scenes that mirror real-world…

Graphics · Computer Science 2026-03-31 Minzhang Li, Kuixiang Shao, Xuebing Li, Yuyang Jiao, Yinuo Bai, Hengan Zhou, Sixian Shen, Jiayuan Gu, Jingyi Yu