Related papers: Self-Attention through Kernel-Eigen Pair Sparse Va…

200 papers

Recently, a new line of works has emerged to understand and improve self-attention in Transformers by treating it as a kernel machine. However, existing works apply the methods for symmetric kernels to the asymmetric self-attention,…

Machine Learning · Computer Science 2023-12-06 Yingyi Chen, Qinghua Tao, Francesco Tonin, Johan A. K. Suykens

Transformers have increasingly become the de facto method to model sequential data with state-of-the-art performance. Due to their widespread use, being able to estimate and calibrate their modeling uncertainty is important to understand and…

Machine Learning · Computer Science 2025-03-03 Long Minh Bui, Tho Tran Huu, Duy Dinh, Tan Minh Nguyen, Trong Nghia Hoang

Transformer models have achieved profound success in prediction tasks in a wide range of applications in natural language processing, speech recognition and computer vision. Extending Transformer's success to safety-critical domains…

Machine Learning · Computer Science 2025-09-11 Wenlong Chen, Yingzhen Li

Choosing a proper set of kernel functions is an important problem in learning Gaussian Process (GP) models since each kernel structure has different model complexity and data fitness. Recently, automatic kernel composition methods provide…

Machine Learning · Computer Science 2021-02-25 Anh Tong, Toan Tran, Hung Bui, Jaesik Choi

Kolmogorov-Arnold Networks have emerged as interpretable alternatives to traditional multi-layer perceptrons. However, standard implementations lack principled uncertainty quantification capabilities essential for many scientific…

Machine Learning · Computer Science 2025-12-10 Y. Sungtaek Ju

Large, multi-dimensional spatio-temporal datasets are omnipresent in modern science and engineering. An effective framework for handling such data is Gaussian process deep generative models (GP-DGMs), which employ GP priors over the latent…

Machine Learning · Statistics 2020-10-26 Matthew Ashman, Jonathan So, Will Tebbutt, Vincent Fortuin, Michael Pearce, Richard E. Turner

Gaussian processes (GPs) are powerful probabilistic models that define flexible priors over functions, offering strong interpretability and uncertainty quantification. However, GP models often rely on simple, stationary kernels which can…

Machine Learning · Computer Science 2025-05-20 Nima Negarandeh, Carlos Mora, Ramin Bostanabad

While much research effort has been dedicated to scaling up sparse Gaussian process (GP) models based on inducing variables for big data, little attention is afforded to the other less explored class of low-rank GP approximations that…

Machine Learning · Statistics 2016-11-21 Quang Minh Hoang, Trong Nghia Hoang, Kian Hsiang Low

Kolmogorov-Arnold Networks (KANs) offer a promising alternative to Multi-Layer Perceptron (MLP) by placing learnable univariate functions on network edges, enhancing interpretability. However, standard KANs lack probabilistic outputs,…

Machine Learning · Computer Science 2025-12-02 Y. Sungtaek Ju

Gaussian Processes (GP) have become popular machine-learning methods for kernel-based learning on datasets with complicated covariance structures. In this paper, we present a novel extension to the GP framework using a contaminated normal…

Machine Learning · Computer Science 2024-07-03 Daniel Iong, Matthew McAnear, Yuezhou Qu, Shasha Zou, Gabor Toth, Yang Chen

Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost and it is difficult to design nonstationary GP priors in practice. In this paper, we propose…

Machine Learning · Computer Science 2013-03-15 Yuan Qi, Bo Dai, Yao Zhu

This paper presents a variational Bayesian kernel selection (VBKS) algorithm for sparse Gaussian process regression (SGPR) models. In contrast to existing GP kernel selection algorithms that aim to select only one kernel with the highest…

Machine Learning · Computer Science 2019-12-06 Tong Teng, Jie Chen, Yehong Zhang, Kian Hsiang Low

A central challenge in Bayesian inference is efficiently approximating posterior distributions. Stein Variational Gradient Descent (SVGD) is a popular variational inference method which transports a set of particles to approximate a target…

Machine Learning · Statistics 2025-12-05 Moritz Melcher, Simon Weissmann, Ashia C. Wilson, Jakob Zech

Gaussian processes (GPs) stand as crucial tools in machine learning and signal processing, with their effectiveness hinging on kernel design and hyper-parameter optimization. This paper presents a novel GP linear multiple kernel (LMK) and a…

Machine Learning · Computer Science 2025-01-17 Richard Cornelius Suwandi, Zhidi Lin, Feng Yin, Zhiguo Wang, Sergios Theodoridis

Gaussian processes (GPs) offer appealing properties but are costly to train at scale. Sparse variational GP (SVGP) approximations reduce cost yet still rely on Cholesky decompositions of kernel matrices, ill-suited to low-precision,…

Machine Learning · Statistics 2026-04-02 Stefano Cortinovis, Laurence Aitchison, Stefanos Eleftheriadis, Mark van der Wilk

Gaussian processes (GPs) provide a probabilistic nonparametric representation of functions in regression, classification, and other problems. Unfortunately, exact learning with GPs is intractable for large datasets. A variety of approximate…

Machine Learning · Computer Science 2010-02-23 Yuan Qi, Ahmed H. Abdel-Gawad, Thomas P. Minka

Gaussian processes (GPs) provide a principled Bayesian framework for uncertainty estimation, but their computational complexity severely limits scalability to large datasets. We propose SIKA-GP, which accelerates GP inference using sparse…

Machine Learning · Computer Science 2026-05-27 Wenyuan Zhao, Rui Tuo, Chao Tian

A Gaussian Process (GP) is a prominent mathematical framework for stochastic function approximation in science and engineering applications. This success is largely attributed to the GP's analytical tractability, robustness, non-parametric…

Machine Learning · Statistics 2022-05-19 Marcus M. Noack, Harinarayan Krishnan, Mark D. Risser, Kristofer G. Reyes

Deep Gaussian process models typically employ discrete hierarchies, but recent advancements in differential Gaussian processes (DiffGPs) have extended these models to infinite depths. However, existing DiffGP approaches often overlook the…

Machine Learning · Computer Science 2025-12-16 Jian Xu, Zhiqi Lin, Min Chen, Junmei Yang, Delu Zeng, John Paisley

We investigate the connections between sparse approximation methods for making kernel methods and Gaussian processes (GPs) scalable to large-scale data, focusing on the Nyström method and the Sparse Variational Gaussian Processes (SVGP).…

Machine Learning · Statistics 2023-02-09 Veit Wild, Motonobu Kanagawa, Dino Sejdinovic