Related papers: Reasoning with Autoregressive-Diffusion Collaborat…

Diffuse Thinking: Exploring Diffusion Language Models as Efficient Thought Proposers for Reasoning

In recent years, large language models (LLMs) have witnessed remarkable advancements, with the test-time scaling law consistently enhancing their reasoning capabilities. Through systematic evaluation and exploration of a diverse spectrum of…

Computation and Language · Computer Science 2025-11-03 Chenyang Shao, Sijian Ren, Fengli Xu, Yong Li

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

Recently, diffusion models have garnered significant interest in the field of text processing due to their many potential advantages compared to conventional autoregressive models. In this work, we propose Diffusion-of-Thought (DoT), a…

Computation and Language · Computer Science 2024-12-06 Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong

The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents

Diffusion models have achieved success in high-fidelity data synthesis, yet their capacity for more complex, structured reasoning like text following tasks remains constrained. While advances in language models have leveraged strategies…

Computer Vision and Pattern Recognition · Computer Science 2026-04-29 Yuwei Sun, Yuxuan Yao, Hui Li, Siyu Zhu

Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

Autoregressive language models, despite their impressive capabilities, struggle with complex reasoning and long-term planning tasks. We introduce discrete diffusion models as a novel solution to these challenges. Through the lens of subgoal…

Computation and Language · Computer Science 2025-02-19 Jiacheng Ye, Jiahui Gao, Shansan Gong, Lin Zheng, Xin Jiang, Zhenguo Li, Lingpeng Kong

Autoregressive Models Rival Diffusion Models at ANY-ORDER Generation

Diffusion language models enable any-order generation and bidirectional conditioning, offering appealing flexibility for tasks such as infilling, rewriting, and self-correction. However, their formulation - predicting one part of a sequence…

Computation and Language · Computer Science 2026-01-21 Tianqi Du, Lizhe Fang, Weijie Yang, Chenheng Zhang, Zeming Wei, Yifei Wang, Yisen Wang

Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Humans excel at discovering regular structures from limited samples and applying inferred rules to novel settings. We investigate whether modern generative models can similarly learn underlying rules from finite samples and perform…

Machine Learning · Computer Science 2024-11-13 Binxu Wang, Jiaqi Shang, Haim Sompolinsky

Self-Reflective Reinforcement Learning for Diffusion-based Image Reasoning Generation

Diffusion models have recently demonstrated exceptional performance in image generation tasks. However, existing image generation methods still significantly suffer from the dilemma of image reasoning, especially in logic-centered image…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Jiadong Pan, Zhiyuan Ma, Kaiyan Zhang, Ning Ding, Bowen Zhou

When Diffusion Breaks Constraints: Sequential Autoregressive Generation with RL and MCTS

Data-driven generative models excel in language and vision, but diffusion models often fail in constrained planning and design tasks, exhibiting severe constraint violations in engineering inverse design, molecular generation, multi-robot…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Zirui Zhao, Boye Niu, Harold Soh, David Hsu, Wee Sun Lee

Evaluating Latent Generative Paradigms for High-Fidelity 3D Shape Completion from a Single Depth Image

While generative models have seen significant adoption across a wide range of data modalities, including 3D data, a consensus on which model is best suited for which task has yet to be reached. Further, conditional information such as text…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Matthias Humt, Ulrich Hillenbrand, Rudolph Triebel

PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model

Autoregressive models for text sometimes generate repetitive and low-quality output because errors accumulate during the steps of generation. This issue is often attributed to exposure bias - the difference between how a model is trained,…

Computation and Language · Computer Science 2024-03-26 Yizhe Zhang, Jiatao Gu, Zhuofeng Wu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly

Collaborative Diffusion for Multi-Modal Face Generation and Editing

Diffusion models have recently emerged as a powerful generative tool. Despite this great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

Parallel Thinking, Sequential Answering: Bridging NAR and AR for Efficient Reasoning

We study reasoning tasks through a framework that integrates auto-regressive (AR) and non-autoregressive (NAR) language models. AR models, which generate text sequentially, excel at producing coherent outputs but often suffer from slow…

Artificial Intelligence · Computer Science 2025-09-26 Qihang Ai, Haiyun Jiang

Diffusion Models in Low-Level Vision: A Survey

Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising…

Computer Vision and Pattern Recognition · Computer Science 2025-02-26 Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, Wangmeng Zuo, Zhenhua Guo, Xiu Li

CAR: Controllable Autoregressive Modeling for Visual Generation

Controllable generation, which enables fine-grained control over generated outputs, has emerged as a critical focus in visual generative models. Currently, there are two primary technical approaches in visual generation: diffusion models…

Computer Vision and Pattern Recognition · Computer Science 2024-10-08 Ziyu Yao, Jialin Li, Yifeng Zhou, Yong Liu, Xi Jiang, Chengjie Wang, Feng Zheng, Yuexian Zou, Lei Li

Reasoning with Latent Tokens in Diffusion Language Models

Discrete diffusion models have recently become competitive with autoregressive models for language modeling, even outperforming them on reasoning tasks requiring planning and global coherence, but they require more computation at inference…

Machine Learning · Computer Science 2026-02-04 Andre He, Sean Welleck, Daniel Fried

Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation

While diffusion models have shown exceptional capabilities in aesthetic image synthesis, they often struggle with complex spatial understanding and reasoning. Existing approaches resort to Multimodal Large Language Models (MLLMs) to enhance…

Computer Vision and Pattern Recognition · Computer Science 2026-02-13 Wei Chen, Yancheng Long, Mingqiao Liu, Haojie Ding, Yankai Yang, Hongyang Wei, Yi-Fan Zhang, Bin Wen, Fan Yang, Tingting Gao, Han Li, Long Chen

Cosmos: Compressed and Smooth Latent Space for Text Diffusion Modeling

Autoregressive language models dominate modern text generation, yet their sequential nature introduces fundamental limitations: decoding is slow, and maintaining global coherence remains challenging. Diffusion models offer a promising…

Computation and Language · Computer Science 2026-01-06 Viacheslav Meshchaninov, Egor Chimbulatov, Alexander Shabalin, Aleksandr Abramov, Dmitry Vetrov

DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability

Generative models, such as diffusion models, excel at capturing high-dimensional distributions with diverse input modalities, e.g. robot trajectories, but are less effective at multi-step constraint reasoning. Task and Motion Planning (TAMP)…

Robotics · Computer Science 2024-10-15 Xiaolin Fang, Caelan Reed Garrett, Clemens Eppner, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Dieter Fox

Planning with Diffusion for Flexible Behavior Synthesis

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers. While conceptually simple,…

Machine Learning · Computer Science 2022-12-22 Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

Unified Multimodal Discrete Diffusion

Multimodal generative models that can understand and generate across multiple modalities are dominated by autoregressive (AR) approaches, which process tokens sequentially from left to right, or top to bottom. These models jointly handle…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Alexander Swerdlow, Mihir Prabhudesai, Siddharth Gandhi, Deepak Pathak, Katerina Fragkiadaki