Related papers: Large Language Diffusion Models

200 papers

In this work, we introduce LLaDA-V, a purely diffusion-based Multimodal Large Language Model (MLLM) that integrates visual instruction tuning with masked diffusion models, representing a departure from the autoregressive paradigms dominant…

Machine Learning · Computer Science 2025-06-05 Zebin You, Shen Nie, Xiaolu Zhang, Jun Hu, Jun Zhou, Zhiwu Lu, Ji-Rong Wen, Chongxuan Li

This paper presents LLaDA2.0 -- a tuple of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establishing a new paradigm for frontier-scale…

Masked diffusion models (MDMs) have shown promise in language modeling, yet their scalability and effectiveness in core language tasks, such as text generation and language understanding, remain underexplored. This paper establishes the…

Artificial Intelligence · Computer Science 2025-03-03 Shen Nie, Fengqi Zhu, Chao Du, Tianyu Pang, Qian Liu, Guangtao Zeng, Min Lin, Chongxuan Li

Recent large language models (LLMs) have demonstrated strong reasoning capabilities that benefit from online reinforcement learning (RL). These capabilities have primarily been demonstrated within the left-to-right autoregressive (AR)…

Computation and Language · Computer Science 2025-06-04 Siyan Zhao, Devaansh Gupta, Qinqing Zheng, Aditya Grover

Diffusion-based large language models (DLLMs) have recently attracted growing interest as an alternative to autoregressive decoders. In this work, we present an empirical study on using the diffusion-based large language model LLaDA for…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-02 Mengqi Wang, Zhan Liu, Zengrui Jin, Guangzhi Sun, Chao Zhang, Philip C. Woodland

Large language model (LLM)-based text-to-speech (TTS) systems achieve remarkable naturalness via autoregressive (AR) decoding, but require N sequential steps to generate N speech tokens. We present LLaDA-TTS, which replaces the AR LLM with…

Sound · Computer Science 2026-03-30 Xiaoyu Fan, Huizhi Xie, Wei Zou, Yunzhang Chen

Large Language Models (LLMs) have been transformative. They are pre-trained foundational models that are self-supervised and can be adapted with fine tuning to a wide range of natural language tasks, each of which previously would have…

Computation and Language · Computer Science 2023-02-22 Terrence Sejnowski

Embedding models are a fundamental component of modern AI systems such as semantic search and retrieval-augmented generation. Recent advances in large foundation models have substantially accelerated the development of embedding models,…

Multimedia · Computer Science 2026-02-09 Zihang Wang, Siyue Zhang, Yilun Zhao, Jingyi Yang, Tingyu Song, Anh Tuan Luu, Chen Zhao

Large Language Models (LLMs) have significantly advanced molecular discovery, but existing multimodal molecular architectures fundamentally rely on autoregressive (AR) backbones. This strict left-to-right inductive bias is sub-optimal for…

Artificial Intelligence · Computer Science 2026-04-08 Seohyeon Shin, HanJun Choi, Jun-Hyung Park, Hong Kook Kim, Mansu Kim

Autoregressive models (ARMs) have long dominated the landscape of biomedical vision-language models (VLMs). Recently, masked diffusion models such as LLaDA have emerged as promising alternatives, yet their application in the biomedical…

Computer Vision and Pattern Recognition · Computer Science 2026-02-26 Xuanzhao Dong, Wenhui Zhu, Xiwen Chen, Zhipeng Wang, Peijie Qiu, Shao Tang, Xin Li, Yalin Wang

Large language model (LLM)-based embedding models, benefiting from large scale pre-training and post-training, have begun to surpass BERT and T5-based models on general-purpose text embedding tasks such as document retrieval. However, a…

Computation and Language · Computer Science 2025-05-22 Siyue Zhang, Yilun Zhao, Liyuan Geng, Arman Cohan, Anh Tuan Luu, Chen Zhao

In the rapidly evolving field of machine learning, adversarial attacks present a significant challenge to model robustness and security. Decision-based attacks, which only require feedback on the decision of a model rather than detailed…

Cryptography and Security · Computer Science 2024-05-24 Ping Guo, Fei Liu, Xi Lin, Qingchuan Zhao, Qingfu Zhang

Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling, potentially addressing limitations of autoregressive (AR) models. However, current DLMs have been studied at a smaller scale compared to…

Computation and Language · Computer Science 2025-06-03 Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, Hao Peng, Lingpeng Kong

Diffusion large language models (dLLMs) have emerged as a new architecture following autoregressive models. Their denoising process offers a powerful generative advantage, but they present significant challenges in learning and…

Machine Learning · Computer Science 2025-09-24 Ranfei Chen, Ming Chen

Large language models (LLMs) are often used in environments where facts evolve, yet factual knowledge updates via fine-tuning on unstructured text often suffer from 1) reliance on compute-heavy paraphrasing augmentation and 2) the reversal…

Computation and Language · Computer Science 2026-05-07 Xu Pan, Ely Hahami, Jingxuan Fan, Ziqian Xie, Haim Sompolinsky

Large Language Models (LLMs) are known for their expensive and time-consuming training. Thus, oftentimes, LLMs are fine-tuned to address a specific task, given the pretrained weights of a pre-trained LLM considered a foundation model. In…

Computation and Language · Computer Science 2025-12-05 Eshed Gal, Moshe Eliasof, Javier Turek, Uri Ascher, Eran Treister, Eldad Haber

Diffusion language models (DLMs) have recently emerged as competitive alternatives to autoregressive (AR) language models, yet differences in their activation dynamics remain poorly understood. We characterize these dynamics in LLaDA-8B and…

Machine Learning · Computer Science 2026-05-12 Alexander Conzelmann, Albert Catalan-Tatjer, Shiwei Liu

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

Recent advances in large language models (LLMs) have shown remarkable capabilities across textual and multimodal domains. In parallel, diffusion-based language models have emerged as a promising alternative to the autoregressive paradigm,…

Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising…

Software Engineering · Computer Science 2025-10-07 Jingyao Zhang, Tianlin Li, Xiaoyu Zhang, Qiang Hu, Bin Shi