Related papers: Generative Neural Operators through Diffusion Last…

Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in…

Machine Learning · Computer Science 2024-08-08 Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

Diffusion models as probabilistic neural operators for recovering unobserved states of dynamical systems

This paper explores the efficacy of diffusion-based generative models as neural operators for partial differential equations (PDEs). Neural operators are neural networks that learn a mapping from the parameter space to the solution space of…

Machine Learning · Computer Science 2024-12-17 Katsiaryna Haitsiukevich, Onur Poyraz, Pekka Marttinen, Alexander Ilin

Latent Uncertainty Representations for Video-based Driver Action and Intention Recognition

Deep neural networks (DNNs) are increasingly applied to safety-critical tasks in resource-constrained environments, such as video-based driver action and intention recognition. While last layer probabilistic deep learning (LL-PDL) methods…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Koen Vellenga, H. Joe Steinhauer, Jonas Andersson, Anders Sjögren

3D-Learning: Diffusion-Augmented Distributionally Robust Decision-Focused Learning

Predict-then-Optimize (PTO) pipelines are widely employed in computing and networked systems, where Machine Learning (ML) models are used to predict critical contextual information for downstream decision-making tasks such as cloud LLM…

Machine Learning · Computer Science 2026-02-04 Jiaqi Wen, Lei Fan, Jianyi Yang

DiLO: Decoupling Generative Priors and Neural Operators via Diffusion Latent Optimization for Inverse Problems

Diffusion models have emerged as powerful generative priors for solving PDE-constrained inverse problems. Compared to end-to-end approaches relying on massive paired datasets, explicitly decoupling the prior distribution of physical…

Numerical Analysis · Mathematics 2026-04-23 Haibo Liu, Guang Lin

Data-regularized Reinforcement Learning for Diffusion Models at Scale

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are often vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Machine Learning · Computer Science 2025-12-25 Haotian Ye, Kaiwen Zheng, Jiashu Xu, Puheng Li, Huayu Chen, Jiaqi Han, Sheng Liu, Qinsheng Zhang, Hanzi Mao, Zekun Hao, Prithvijit Chattopadhyay, Dinghao Yang, Liang Feng, Maosheng Liao, Junjie Bai, Ming-Yu Liu, James Zou, Stefano Ermon

A Comparative analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

Autoregressive (AR) language models build representations incrementally via left-to-right prediction, while diffusion language models (dLLMs) are trained through full-sequence denoising. Although recent dLLMs match AR performance, whether…

Computation and Language · Computer Science 2026-05-11 Raghavv Goel, Risheek Garrepalli, Sudhanshu Agrawal, Chris Lott, Mingu Lee, Fatih Porikli

Diffusion-DFL: Decision-focused Diffusion Models for Stochastic Optimization

Decision-focused learning (DFL) integrates predictive modeling and optimization by training predictors to optimize the downstream decision target rather than merely minimizing prediction error. To date, existing DFL methods typically rely…

Machine Learning · Computer Science 2025-10-14 Zihao Zhao, Christopher Yeh, Lingkai Kong, Kai Wang

End-to-End Diffusion Latent Optimization Improves Classifier Guidance

Classifier guidance -- using the gradients of an image classifier to steer the generations of a diffusion model -- has the potential to dramatically expand the creative control over image generation and editing. However, currently…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik

DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation

End-to-end backpropagation requires storing activations throughout all layers, creating memory bottlenecks that limit model scalability. Existing block-wise training methods offer means to alleviate this problem, but they rely on ad-hoc…

Machine Learning · Computer Science 2026-02-19 Makoto Shing, Masanori Koyama, Takuya Akiba

DiRL: An Efficient Post-Training Framework for Diffusion Language Models

Diffusion Language Models (dLLMs) have emerged as promising alternatives to Auto-Regressive (AR) models. While recent efforts have validated their pre-training potential and accelerated inference speeds, the post-training landscape for…

Machine Learning · Computer Science 2026-01-07 Ying Zhu, Jiaxin Wan, Xiaoran Liu, Siyang He, Qiqi Wang, Xu Guo, Tianyi Liang, Zengfeng Huang, Ziwei He, Xipeng Qiu

Diffusion-based Generative Prior for Low-Complexity MIMO Channel Estimation

This work proposes a novel channel estimator based on diffusion models (DMs), one of the currently top-rated generative models. Contrary to related works utilizing generative priors, a lightweight convolutional neural network (CNN) with…

Signal Processing · Electrical Eng. & Systems 2024-03-07 Benedikt Fesl, Michael Baur, Florian Strasser, Michael Joham, Wolfgang Utschick

A Linear Algebraic Approach to Model Parallelism in Deep Learning

Training deep neural networks (DNNs) in large-cluster computing environments is increasingly necessary, as networks grow in size and complexity. Local memory and processing limitations require robust data and model parallelism for crossing…

Machine Learning · Computer Science 2020-06-08 Russell J. Hewett, Thomas J. Grady

Modeling the Second Player in Distributionally Robust Optimization

Distributionally robust optimization (DRO) provides a framework for training machine learning models that are able to perform well on a collection of related data distributions (the "uncertainty set"). This is done by solving a min-max…

Machine Learning · Computer Science 2021-04-01 Paul Michel, Tatsunori Hashimoto, Graham Neubig

Neural Diffusion Models

Diffusion models have shown remarkable performance on many generative tasks. Despite recent success, most diffusion models are restricted in that they only allow linear transformation of the data distribution. In contrast, a broader family of…

Machine Learning · Computer Science 2024-06-04 Grigory Bartosh, Dmitry Vetrov, Christian A. Naesseth

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

In the context of classification problems, Deep Learning (DL) approaches represent the state of the art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks.…

Machine Learning · Computer Science 2023-11-21 Andrea Apicella, Francesco Isgrò, Roberto Prevete

Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models

Disentangled representation learning (DRL) aims to break down observed data into core intrinsic factors for a profound understanding of the data. In real-world scenarios, manually defining and labeling these factors are non-trivial, making…

Machine Learning · Computer Science 2024-11-01 Youngjun Jun, Jiwoo Park, Kyobin Choo, Tae Eun Choi, Seong Jae Hwang

Diffusion Probabilistic Fields

Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains. However, to adapt diffusion generative modeling to these domains the denoising network needs to be…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question…

Computation and Language · Computer Science 2021-09-06 Paul Michel

In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications

We address the problem of building agents whose goal is to learn to execute out-of-distribution (OOD) multi-task instructions expressed in temporal logic (TL) by using deep reinforcement learning (DRL). Recent works provided evidence that…

Artificial Intelligence · Computer Science 2022-02-25 Borja G. León, Murray Shanahan, Francesco Belardinelli