Related papers: Multi-Garment Customized Model Generation

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

Generating sewing patterns in garment design is receiving increasing attention due to its CG-friendly and flexible-editing nature. Previous sewing pattern generation methods have been able to produce exquisite clothing, but struggle to…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Shengqi Liu, Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Lincheng Li, Mengxiao Bi, Xiaokang Yang, Yichao Yan

Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models

Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Krishna Sri Ipsit Mantri, Nevasini Sasikumar

Fine-Grained Controllable Apparel Showcase Image Generation via Garment-Centric Outpainting

In this paper, we propose a novel garment-centric outpainting (GCO) framework based on the latent diffusion model (LDM) for fine-grained controllable apparel showcase image generation. The proposed framework aims at customizing a fashion…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Rong Zhang, Jingnan Wang, Zhiwen Zuo, Jianfeng Dong, Wei Li, Chi Wang, Weiwei Xu, Xun Wang

Magic Clothing: Controllable Garment-Driven Image Synthesis

We propose Magic Clothing, a latent diffusion model (LDM)-based network architecture for an unexplored garment-driven image synthesis task. Aiming at generating customized characters wearing the target garments with diverse text prompts,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-25 Weifeng Chen, Tao Gu, Yuhao Xu, Chengcai Chen

Unifying Layout Generation with a Decoupled Diffusion Model

Layout generation aims to synthesize realistic graphic scenes consisting of elements with different attributes including category, size, position, and between-element relation. It is a crucial task for reducing the burden on heavy-duty…

Computer Vision and Pattern Recognition · Computer Science 2023-03-10 Mude Hui, Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yuwang Wang, Yan Lu

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

Recent advances in garment-centric image generation from text and image prompts based on diffusion models are impressive. However, existing methods lack support for various combinations of attire, and struggle to preserve the garment…

Computer Vision and Pattern Recognition · Computer Science 2025-01-07 Xinghui Li, Qichao Sun, Pengze Zhang, Fulong Ye, Zhichao Liao, Wanquan Feng, Songtao Zhao, Qian He

Cross-Cultural Fashion Design via Interactive Large Language Models and Diffusion Models

Fashion content generation is an emerging area at the intersection of artificial intelligence and creative design, with applications ranging from virtual try-on to culturally diverse design prototyping. Existing methods often struggle with…

Computation and Language · Computer Science 2025-01-28 Spencer Ramsey, Amina Grant, Jeffrey Lee

Layered-Garment Net: Generating Multiple Implicit Garment Layers from a Single Image

Recent research has focused on generating human models and garments from their 2D images. However, state-of-the-art work focuses either on only a single layer of the garment on a human model or on generating multiple garment…

Computer Vision and Pattern Recognition · Computer Science 2022-11-23 Alakh Aggarwal, Jikai Wang, Steven Hogue, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo

Controllable Human Image Generation with Personalized Multi-Garments

We present BootComp, a novel framework based on text-to-image diffusion models for controllable human image generation with multiple reference garments. Here, the main bottleneck is data acquisition for training: collecting a large-scale…

Computer Vision and Pattern Recognition · Computer Science 2025-04-02 Yisol Choi, Sangkyung Kwak, Sihyun Yu, Hyungwon Choi, Jinwoo Shin

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Fashion illustration is used by designers to communicate their vision and to bring the design idea from conceptualization to realization, showing how clothes interact with the human body. In this context, computer vision can thus be used to…

Computer Vision and Pattern Recognition · Computer Science 2023-08-24 Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara

DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder

Diffusion models for garment-centric human generation from text or image prompts have garnered emerging attention for their great application potential. However, existing methods often face a dilemma: lightweight approaches, such as…

Computer Vision and Pattern Recognition · Computer Science 2025-01-22 Ente Lin, Xujie Zhang, Fuwei Zhao, Yuxuan Luo, Xin Dong, Long Zeng, Xiaodan Liang

StableGarment: Garment-Centric Generation via Stable Diffusion

In this paper, we introduce StableGarment, a unified framework to tackle garment-centric (GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on. The main…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li

Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation

Language-guided image generation has achieved great success using diffusion models. However, text can lack the detail needed to describe highly specific subjects such as a particular dog or a certain car, which makes pure…

Computer Vision and Pattern Recognition · Computer Science 2023-03-17 Yiyang Ma, Huan Yang, Wenjing Wang, Jianlong Fu, Jiaying Liu

Generating Images with Multimodal Language Models

We propose a method to fuse frozen text-only large language models (LLMs) with pre-trained image encoder and decoder models, by mapping between their embedding spaces. Our model demonstrates a wide suite of multimodal capabilities: image…

Computation and Language · Computer Science 2023-10-16 Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov

How to Train Your Latent Diffusion Language Model Jointly With the Latent Space

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent…

Computation and Language · Computer Science 2026-05-11 Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion

The rapid evolution of the fashion industry increasingly intersects with technological advancements, particularly through the integration of generative AI. This study introduces a novel generative pipeline designed to transform the fashion…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Abhishek Kumar Singh, Ioannis Patras

GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers

Garment sewing patterns are fundamental design elements that bridge the gap between design concepts and practical manufacturing. The generative modeling of sewing patterns is crucial for creating diversified garments. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Xinyu Li, Qi Yao, Yuanda Wang

TryOffAnyone: Tiled Cloth Generation from a Dressed Person

The fashion industry is increasingly leveraging computer vision and deep learning technologies to enhance online shopping experiences and operational efficiencies. In this paper, we address the challenge of generating high-fidelity tiled…

Computer Vision and Pattern Recognition · Computer Science 2025-01-06 Ioannis Xarchakos, Theodoros Koukopoulos

FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting

Outfit generation is a challenging task in the field of fashion technology, in which the aim is to create a collocated set of fashion items that complement a given set of items. Previous studies in this area have been limited to generating…

Computer Vision and Pattern Recognition · Computer Science 2025-02-04 Dongliang Zhou, Haijun Zhang, Jianghong Ma, Jicong Fan, Zhao Zhang

End-to-End Training for Unified Tokenization and Latent Denoising

Latent diffusion models (LDMs) enable high-fidelity synthesis by operating in learned latent spaces. However, training state-of-the-art LDMs requires complex staging: a tokenizer must be trained first, before the diffusion model can be…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Shivam Duggal, Xingjian Bai, Zongze Wu, Richard Zhang, Eli Shechtman, Antonio Torralba, Phillip Isola, William T. Freeman