Related papers: Generative Neural Operators through Diffusion Last…

200 papers

Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in…

Machine Learning · Computer Science 2024-08-08 Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

This paper explores the efficacy of diffusion-based generative models as neural operators for partial differential equations (PDEs). Neural operators are neural networks that learn a mapping from the parameter space to the solution space of…

Machine Learning · Computer Science 2024-12-17 Katsiaryna Haitsiukevich, Onur Poyraz, Pekka Marttinen, Alexander Ilin

Deep neural networks (DNNs) are increasingly applied to safety-critical tasks in resource-constrained environments, such as video-based driver action and intention recognition. While last layer probabilistic deep learning (LL-PDL) methods…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Koen Vellenga, H. Joe Steinhauer, Jonas Andersson, Anders Sjögren

Predict-then-Optimize (PTO) pipelines are widely employed in computing and networked systems, where Machine Learning (ML) models are used to predict critical contextual information for downstream decision-making tasks such as cloud LLM…

Machine Learning · Computer Science 2026-02-04 Jiaqi Wen, Lei Fan, Jianyi Yang

Diffusion models have emerged as powerful generative priors for solving PDE-constrained inverse problems. Compared to end-to-end approaches relying on massive paired datasets, explicitly decoupling the prior distribution of physical…

Numerical Analysis · Mathematics 2026-04-23 Haibo Liu, Guang Lin

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Autoregressive (AR) language models build representations incrementally via left-to-right prediction, while diffusion language models (dLLMs) are trained through full-sequence denoising. Although recent dLLMs match AR performance, whether…

Computation and Language · Computer Science 2026-05-11 Raghavv Goel, Risheek Garrepalli, Sudhanshu Agrawal, Chris Lott, Mingu Lee, Fatih Porikli

Decision-focused learning (DFL) integrates predictive modeling and optimization by training predictors to optimize the downstream decision target rather than merely minimizing prediction error. To date, existing DFL methods typically rely…

Machine Learning · Computer Science 2025-10-14 Zihao Zhao, Christopher Yeh, Lingkai Kong, Kai Wang

Classifier guidance -- using the gradients of an image classifier to steer the generations of a diffusion model -- has the potential to dramatically expand the creative control over image generation and editing. However, currently…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik

End-to-end backpropagation requires storing activations throughout all layers, creating memory bottlenecks that limit model scalability. Existing block-wise training methods offer means to alleviate this problem, but they rely on ad-hoc…

Machine Learning · Computer Science 2026-02-19 Makoto Shing, Masanori Koyama, Takuya Akiba

Diffusion Language Models (dLLMs) have emerged as promising alternatives to Auto-Regressive (AR) models. While recent efforts have validated their pre-training potential and accelerated inference speeds, the post-training landscape for…

Machine Learning · Computer Science 2026-01-07 Ying Zhu, Jiaxin Wan, Xiaoran Liu, Siyang He, Qiqi Wang, Xu Guo, Tianyi Liang, Zengfeng Huang, Ziwei He, Xipeng Qiu

This work proposes a novel channel estimator based on diffusion models (DMs), one of the currently top-rated generative models. Contrary to related works utilizing generative priors, a lightweight convolutional neural network (CNN) with…

Signal Processing · Electrical Eng. & Systems 2024-03-07 Benedikt Fesl, Michael Baur, Florian Strasser, Michael Joham, Wolfgang Utschick

Training deep neural networks (DNNs) in large-cluster computing environments is increasingly necessary, as networks grow in size and complexity. Local memory and processing limitations require robust data and model parallelism for crossing…

Machine Learning · Computer Science 2020-06-08 Russell J. Hewett, Thomas J. Grady

Distributionally robust optimization (DRO) provides a framework for training machine learning models that are able to perform well on a collection of related data distributions (the "uncertainty set"). This is done by solving a min-max…

Machine Learning · Computer Science 2021-04-01 Paul Michel, Tatsunori Hashimoto, Graham Neubig

Diffusion models have shown remarkable performance on many generative tasks. Despite recent success, most diffusion models are restricted in that they only allow linear transformation of the data distribution. In contrast, a broader family of…

Machine Learning · Computer Science 2024-06-04 Grigory Bartosh, Dmitry Vetrov, Christian A. Naesseth

In the context of classification problems, Deep Learning (DL) approaches represent the state of the art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks.…

Machine Learning · Computer Science 2023-11-21 Andrea Apicella, Francesco Isgrò, Roberto Prevete

Disentangled representation learning (DRL) aims to break down observed data into core intrinsic factors for a profound understanding of the data. In real-world scenarios, manually defining and labeling these factors is non-trivial, making…

Machine Learning · Computer Science 2024-11-01 Youngjun Jun, Jiwoo Park, Kyobin Choo, Tae Eun Choi, Seong Jae Hwang

Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains. However, to adapt diffusion generative modeling to these domains the denoising network needs to be…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question…

Computation and Language · Computer Science 2021-09-06 Paul Michel

We address the problem of building agents whose goal is to learn to execute out-of-distribution (OOD) multi-task instructions expressed in temporal logic (TL) by using deep reinforcement learning (DRL). Recent works provided evidence that…

Artificial Intelligence · Computer Science 2022-02-25 Borja G. León, Murray Shanahan, Francesco Belardinelli