Related papers: Transformer Neural Processes - Kernel Regression

Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes

Neural Processes (NPs) are a rapidly evolving class of models designed to directly model the posterior predictive distribution of stochastic processes. While early architectures were developed primarily as a scalable alternative to Gaussian…

Machine Learning · Computer Science 2026-04-28 Daniel Jenson, Jhonathan Navott, Piotr Grynfelder, Mengyan Zhang, Makkunda Sharma, Elizaveta Semenova, Seth Flaxman

Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling

Neural Processes (NPs) are a popular class of approaches for meta-learning. Similar to Gaussian Processes (GPs), NPs define distributions over functions and can estimate uncertainty in their predictions. However, unlike GPs, NPs and their…

Machine Learning · Computer Science 2023-02-09 Tung Nguyen, Aditya Grover

Exploring Pseudo-Token Approaches in Transformer Neural Processes

Neural Processes (NPs) have gained attention in meta-learning for their ability to quantify uncertainty, together with their rapid prediction and adaptability. However, traditional NPs are prone to underfitting. Transformer Neural Processes…

Machine Learning · Computer Science 2025-04-22 Jose Lara-Rangel, Nanze Chen, Fengzhe Zhang

Rényi Neural Processes

Neural Processes (NPs) are deep probabilistic models that represent stochastic processes by conditioning their prior distributions on a set of context points. Despite their advantages in uncertainty estimation for complex distributions, NPs…

Machine Learning · Computer Science 2025-06-04 Xuesong Wang, He Zhao, Edwin V. Bonilla

Neural Processes

A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision. A Gaussian process (GP), on the other hand, is a probabilistic model that defines a…

Machine Learning · Computer Science 2018-07-05 Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, Yee Whye Teh

Gridded Transformer Neural Processes for Large Unstructured Spatio-Temporal Data

Many important problems require modelling large-scale spatio-temporal datasets, with one prevalent example being weather forecasting. Recently, transformer-based approaches have shown great promise in a range of weather forecasting…

Machine Learning · Statistics 2024-10-11 Matthew Ashman, Cristiana Diaconu, Eric Langezaal, Adrian Weller, Richard E. Turner

Latent Bottlenecked Attentive Neural Processes

Neural Processes (NPs) are popular methods in meta-learning that can estimate predictive uncertainty on target datapoints by conditioning on a context dataset. Previous state-of-the-art method Transformer Neural Processes (TNPs) achieve…

Machine Learning · Computer Science 2023-03-03 Leo Feng, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

The Gaussian Neural Process

Neural Processes (NPs; Garnelo et al., 2018a,b) are a rich class of models for meta-learning that map data sets directly to predictive stochastic processes. We provide a rigorous analysis of the standard maximum-likelihood objective used to…

Machine Learning · Statistics 2021-01-12 Wessel P. Bruinsma, James Requeima, Andrew Y. K. Foong, Jonathan Gordon, Richard E. Turner

Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel

Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the…

Machine Learning · Computer Science 2019-11-13 Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

RWKV: Reinventing RNNs for the Transformer Era

Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit…

Computation and Language · Computer Science 2023-12-12 Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Stella Biderman, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Jiaju Lin, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Guangyu Song, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu

Spectral Transformer Neural Processes

Time series, spatial data, and images are natural applications of Neural Processes. However, when such data exhibit strong periodicity and quasi-periodicity, existing methods often suffer from underfitting and generalise poorly beyond the…

Machine Learning · Computer Science 2026-05-12 Xianhe Chen, Hao Chen, Yingzhen Li

Learning with Neural Tangent Kernels in Near Input Sparsity Time

The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely wide neural nets trained under least squares loss by gradient descent. However, despite its importance, the super-quadratic runtime of kernel methods limits the use of…

Machine Learning · Computer Science 2021-07-28 Amir Zandieh

Reversible Recurrent Neural Networks

Recurrent neural networks (RNNs) provide state-of-the-art performance in processing sequential data but are memory intensive to train, limiting the flexibility of RNN models which can be trained. Reversible RNNs---RNNs for which the…

Machine Learning · Computer Science 2018-10-26 Matthew MacKay, Paul Vicol, Jimmy Ba, Roger Grosse

Neural Diffusion Processes

Neural network approaches for meta-learning distributions over functions have desirable properties such as increased flexibility and a reduced complexity of inference. Building on the successes of denoising diffusion models for generative…

Machine Learning · Statistics 2023-06-08 Vincent Dutordoir, Alan Saul, Zoubin Ghahramani, Fergus Simpson

Revisiting Kernel Attention with Correlated Gaussian Process Representation

Transformers have increasingly become the de facto method to model sequential data with state-of-the-art performance. Due to its widespread use, being able to estimate and calibrate its modeling uncertainty is important to understand and…

Machine Learning · Computer Science 2025-03-03 Long Minh Bui, Tho Tran Huu, Duy Dinh, Tan Minh Nguyen, Trong Nghia Hoang

Linear Self-Attention Approximation via Trainable Feedforward Kernel

In pursuit of faster computation, Efficient Transformers demonstrate an impressive variety of approaches -- models attaining sub-quadratic attention complexity can utilize a notion of sparsity or a low-rank approximation of inputs to reduce…

Machine Learning · Computer Science 2022-11-09 Uladzislau Yorsh, Alexander Kovalenko

Spectral Convolutional Conditional Neural Processes

Neural Processes (NPs) are meta-learning models that learn to map sets of observations to approximations of the corresponding posterior predictive distributions. By accommodating variable-sized, unstructured collections of observations and…

Machine Learning · Computer Science 2026-02-10 Peiman Mohseni, Nick Duffield

Incremental Transformer Neural Processes

Neural Processes (NPs), and specifically Transformer Neural Processes (TNPs), have demonstrated remarkable performance across tasks ranging from spatiotemporal forecasting to tabular data modelling. However, many of these applications are…

Machine Learning · Computer Science 2026-02-24 Philip Mortimer, Cristiana Diaconu, Tommy Rochussen, Bruno Mlodozeniec, Richard E. Turner

Kervolutional Neural Networks

Convolutional neural networks (CNNs) have enabled the state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly leverage on the…

Computer Vision and Pattern Recognition · Computer Science 2020-05-25 Chen Wang, Jianfei Yang, Lihua Xie, Junsong Yuan

Recurrent Attentive Neural Process for Sequential Data

Neural processes (NPs) learn stochastic processes and predict the distribution of target output adaptively conditioned on a context set of observed input-output pairs. Furthermore, Attentive Neural Process (ANP) improved the prediction…

Machine Learning · Computer Science 2019-10-22 Shenghao Qin, Jiacheng Zhu, Jimmy Qin, Wenshuo Wang, Ding Zhao