Related papers: DiffLM: Controllable Synthetic Data Generation via…

DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents

Diffusion probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks but lack a low-dimensional, interpretable latent space, and are slow at generation. On the other hand,…

Machine Learning · Computer Science 2022-11-30 Kushagra Pandey, Avideep Mukherjee, Piyush Rai, Abhishek Kumar

Multimodal Latent Language Modeling with Next-Token Diffusion

Multimodal generative models require a unified approach to handle both discrete data (e.g., text and code) and continuous data (e.g., image, audio, video). In this work, we propose Latent Language Modeling (LatentLM), which seamlessly…

Computation and Language · Computer Science 2024-12-12 Yutao Sun, Hangbo Bao, Wenhui Wang, Zhiliang Peng, Li Dong, Shaohan Huang, Jianyong Wang, Furu Wei

Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way

Diffusion-based large language models (dLLMs) have exhibited substantial potential for parallel text generation, which may enable more efficient generation compared to autoregressive models. However, current dLLMs suffer from fixed…

Computation and Language · Computer Science 2025-10-29 Yicun Yang, Cong Wang, Shaobo Wang, Zichen Wen, Biqing Qi, Hanlin Xu, Linfeng Zhang

Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models

Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty arises from the count nature of gene expression…

Machine Learning · Statistics 2025-11-06 Giovanni Palla, Sudarshan Babu, Payam Dibaeinia, James D. Pearce, Donghui Li, Aly A. Khan, Theofanis Karaletsos, Jakub M. Tomczak

Multi-Source Music Generation with Latent Diffusion

Most music generation models directly generate a single music mixture. To allow for more flexible and controllable generation, the Multi-Source Diffusion Model (MSDM) has been proposed to model music as a mixture of multiple instrumental…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-18 Zhongweiyang Xu, Debottam Dutta, Yu-Lin Wei, Romit Roy Choudhury

TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion

Synthetic tabular data generation has attracted growing attention due to its importance for data augmentation, foundation models, and privacy. However, real-world tabular datasets increasingly contain free-form text fields (e.g., reviews or…

Machine Learning · Computer Science 2026-05-13 Donghong Cai, Jiarui Feng, Yanbo Wang, Da Zheng, Yixin Chen, Muhan Zhang

DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling

In this paper, we present an effective data augmentation framework leveraging the Large Language Model (LLM) and Diffusion Model (DM) to tackle the challenges inherent in data-scarce scenarios. Recently, DMs have opened up the possibility…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Kyuheon Jung, Yongdeuk Seo, Seongwoo Cho, Jaeyoung Kim, Hyun-seok Min, Sungchul Choi

DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation

Diffusion probabilistic models (DPMs) have shown remarkable results on various image synthesis tasks such as text-to-image generation and image inpainting. However, compared to other generative methods like VAEs and GANs, DPMs lack a…

Computer Vision and Pattern Recognition · Computer Science 2023-07-13 Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang

TextLDM: Language Modeling with Continuous Latent Diffusion

Diffusion Transformers (DiT) trained with flow matching in a VAE latent space have unified visual generation across images and videos. A natural next step toward a single architecture for both generation (visual synthesis) and understanding…

Computation and Language · Computer Science 2026-05-11 Jiaxiu Jiang, Jingjing Ren, Wenbo Li, Bo Wang, Haoze Sun, Yijun Yang, Jianhui Liu, Yanbing Zhang, Shenghe Zheng, Yuan Zhang, Haoyang Huang, Nan Duan, Wangmeng Zuo

DataGen: Unified Synthetic Dataset Generation via Large Language Models

Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets. Despite this, challenges…

Computation and Language · Computer Science 2025-11-18 Yue Huang, Siyuan Wu, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Jianfeng Gao, Chaowei Xiao, Lichao Sun, Xiangliang Zhang

High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a…

Computer Vision and Pattern Recognition · Computer Science 2022-04-14 Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability to generate creative content, specialize to user data through few-shot fine-tuning, and condition their output on other modalities, such as semantic…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc Van Gool, Konrad Schindler, Anton Obukhov

Diffusion-LM Improves Controllable Text Generation

Controlling the behavior of language models (LMs) without re-training is a major open problem in natural language generation. While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there…

Computation and Language · Computer Science 2022-05-31 Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto

Unveiling the Potential of Diffusion Large Language Model in Controllable Generation

Controllable generation is a fundamental task in NLP with many applications, providing a basis for everything from function calling to agentic communication. However, even state-of-the-art autoregressive Large Language Models (LLMs) today exhibit…

Computation and Language · Computer Science 2025-09-29 Zhen Xiong, Yujun Cai, Zhecheng Li, Yiwei Wang

Representation Learning with Diffusion Models

Diffusion models (DMs) have achieved state-of-the-art results for image synthesis tasks as well as density estimation. Applied in the latent space of a powerful pretrained autoencoder (LDM), their immense computational requirements can be…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Jeremias Traub

Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Federated semi-supervised learning (FSSL) is primarily challenged by two factors: the scarcity of labeled data across clients and the non-independent and identically distributed (non-IID) nature of data among clients. In this paper, we…

Machine Learning · Computer Science 2025-01-07 Zhongwei Wang, Tong Wu, Zhiyong Chen, Liang Qian, Yin Xu, Meixia Tao

Discrete Diffusion in Large Language and Multimodal Models: A Survey

In this work, we provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel…

Machine Learning · Computer Science 2025-09-22 Runpeng Yu, Qi Li, Xinchao Wang

On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey

Within the evolving landscape of deep learning, the dilemma of data quantity and quality has been a long-standing problem. The recent advent of Large Language Models (LLMs) offers a data-centric solution to alleviate the limitations of…

Computation and Language · Computer Science 2024-06-24 Lin Long, Rui Wang, Ruixuan Xiao, Junbo Zhao, Xiao Ding, Gang Chen, Haobo Wang

Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models

Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Krishna Sri Ipsit Mantri, Nevasini Sasikumar

LayoutDM: Transformer-based Diffusion Model for Layout Generation

Automatic layout generation that can synthesize high-quality layouts is an important tool for graphic design in many applications. Though existing methods based on generative models such as Generative Adversarial Networks (GANs) and…

Computer Vision and Pattern Recognition · Computer Science 2023-05-05 Shang Chai, Liansheng Zhuang, Fengying Yan