Related papers: Diffusion-based Document Layout Generation

Trajectory-Guided Diffusion for Foreground-Preserving Background Generation in Multi-Layer Documents

We present a diffusion-based framework for document-centric background generation that achieves foreground preservation and multi-page stylistic consistency through latent-space design rather than explicit constraints. Instead of…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 Taewon Kang

LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models

Creating graphic layouts is a fundamental step in graphic designs. In this work, we present a novel generative model named LayoutDiffusion for automatic layout generation. As layout is typically represented as a sequence of discrete tokens,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Junyi Zhang, Jiaqi Guo, Shizhao Sun, Jian-Guang Lou, Dongmei Zhang

Unifying Layout Generation with a Decoupled Diffusion Model

Layout generation aims to synthesize realistic graphic scenes consisting of elements with different attributes including category, size, position, and between-element relation. It is a crucial task for reducing the burden on heavy-duty…

Computer Vision and Pattern Recognition · Computer Science 2023-03-10 Mude Hui, Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yuwang Wang, Yan Lu

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

Controllable layout generation aims at synthesizing plausible arrangement of element bounding boxes with optional constraints, such as type or position of a specific element. In this work, we try to solve a broad range of layout generation…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Naoto Inoue, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi

Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation

Diffusion models have recently gained recognition for generating diverse and high-quality content, especially in image synthesis. These models excel not only in creating fixed-size images but also in producing panoramic images. However,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-08 Xiaoyu Zhang, Teng Zhou, Xinlong Zhang, Jia Wei, Yongchuan Tang

LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer

Layout generation is a foundation task of graphic design, which requires the integration of visual aesthetics and harmonious expression of content delivery. However, existing methods still face challenges in generating precise and visually…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Yu Li, Yifan Chen, Gongye Liu, Fei Yin, Qingyan Bai, Jie Wu, Hongfa Wang, Ruihang Chu, Yujiu Yang

LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts

Thanks to the rapid development of diffusion models, unprecedented progress has been witnessed in image synthesis. Prior works mostly rely on pre-trained linguistic models, but a text is often too abstract to properly specify all the…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Binbin Yang, Yi Luo, Ziliang Chen, Guangrun Wang, Xiaodan Liang, Liang Lin

Panoptic Diffusion Models: co-generation of images and segmentation maps

Recently, diffusion models have demonstrated impressive capabilities in text-guided and image-conditioned image generation. However, existing diffusion models cannot simultaneously generate an image and a panoptic segmentation of objects…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Yinghan Long, Kaushik Roy

STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation

In layout-to-image (L2I) synthesis, controlled complex scenes are generated from coarse information like bounding boxes. Such a task is exciting to many downstream applications because the input layouts offer strong guidance to the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Ruyu Wang, Xuefeng Hou, Sabrina Schmedding, Marco F. Huber

Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models

Diffusion models have shown promise in text generation, but often struggle with generating long, coherent, and contextually accurate text. Token-level diffusion doesn't model word-order dependencies explicitly and operates on short, fixed…

Computation and Language · Computer Science 2025-05-27 Xiaochen Zhu, Georgi Karadzhov, Chenxi Whitehouse, Andreas Vlachos

Context Diffusion: In-Context Aware Image Generation

We propose Context Diffusion, a diffusion-based framework that enables image generation models to learn from visual examples presented in context. Recent work tackles such in-context learning for image generation, where a query image is…

Computer Vision and Pattern Recognition · Computer Science 2025-07-24 Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan, Vignesh Ramanathan, Filip Radenovic

Human Motion Diffusion Model

Natural and expressive human motion generation is the holy grail of computer animation. It is a challenging task, due to the diversity of possible motion, human perceptual sensitivity to it, and the difficulty of accurately describing it.…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, Amit H. Bermano

DiffusionPhase: Motion Diffusion in Frequency Domain

In this study, we introduce a learning-based method for generating high-quality human motion sequences from text descriptions (e.g., "A person walks forward"). Existing techniques struggle with motion diversity and smooth transitions in…

Computer Vision and Pattern Recognition · Computer Science 2023-12-08 Weilin Wan, Yiming Huang, Shutong Wu, Taku Komura, Wenping Wang, Dinesh Jayaraman, Lingjie Liu

DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models

In this paper, we present DesignDiffusion, a simple yet effective framework for the novel task of synthesizing design images from textual descriptions. A primary challenge lies in generating accurate and style-consistent textual and visual…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Zhendong Wang, Jianmin Bao, Shuyang Gu, Dong Chen, Wengang Zhou, Houqiang Li

Diffusion Modulation via Environment Mechanism Modeling for Planning

Diffusion models have shown promising capabilities in trajectory generation for planning in offline reinforcement learning (RL). However, conventional diffusion-based planning methods often fail to account for the fact that generating…

Artificial Intelligence · Computer Science 2026-02-25 Hanping Zhang, Yuhong Guo

EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation

We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality human motion generation. Current state-of-the-art generative diffusion models have produced impressive results but struggle to achieve fast generation without…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Wenyang Zhou, Zhiyang Dou, Zeyu Cao, Zhouyingcheng Liao, Jingbo Wang, Wenjia Wang, Yuan Liu, Taku Komura, Wenping Wang, Lingjie Liu

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions…

Computer Vision and Pattern Recognition · Computer Science 2022-09-01 Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu

DogLayout: Denoising Diffusion GAN for Discrete and Continuous Layout Generation

Layout Generation aims to synthesize plausible arrangements from given elements. Currently, the predominant methods in layout generation are Generative Adversarial Networks (GANs) and diffusion models, each presenting its own set of…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Zhaoxing Gan, Guangnan Ye

Layout Agnostic Scene Text Image Synthesis with Diffusion Models

While diffusion models have significantly advanced the quality of image generation, their capability to accurately and coherently render text within these images remains a substantial challenge. Conventional diffusion-based methods for scene…

Computer Vision and Pattern Recognition · Computer Science 2024-09-17 Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation

Layout-to-image generation refers to the task of synthesizing photo-realistic images based on semantic layouts. In this paper, we propose LayoutDiffuse that adapts a foundational diffusion model pretrained on large-scale image or text-image…

Computer Vision and Pattern Recognition · Computer Science 2023-02-20 Jiaxin Cheng, Xiao Liang, Xingjian Shi, Tong He, Tianjun Xiao, Mu Li