Related papers: Learning with Spectral Kernels and Heavy-Tailed Da…

Learning to Embed Distributions via Maximum Kernel Entropy

Empirical data can often be considered as samples from a set of probability distributions. Kernel methods have emerged as a natural approach for learning to classify these distributions. Although numerous kernels between distributions have…

Machine Learning · Computer Science 2024-12-02 Oleksii Kachaiev, Stefano Recanatesi

Statistical Learning Theory for Distributional Classification

In supervised learning with distributional inputs in the two-stage sampling setup, relevant to applications like learning-based medical screening or causal learning, the inputs (which are probability distributions) are not accessible in the…

Machine Learning · Computer Science 2026-01-22 Christian Fiedler

High-performance Kernel Machines with Implicit Distributed Optimization and Randomization

In order to fully utilize "big data", it is often required to use "big models". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions upfront on the nature of the…

Machine Learning · Statistics 2015-04-17 Vikas Sindhwani, Haim Avron

Diffusion-based Semi-supervised Spectral Algorithm for Regression on Manifolds

We introduce a novel diffusion-based spectral algorithm to tackle regression analysis on high-dimensional data, particularly data embedded within lower-dimensional manifolds. Traditional spectral algorithms often fall short in such…

Machine Learning · Statistics 2024-10-21 Weichun Xia, Jiaxin Jiang, Lei Shi

Learning to Approximate Adaptive Kernel Convolution on Graphs

Various Graph Neural Networks (GNNs) have been successful in analyzing data in non-Euclidean spaces; however, they have limitations such as oversmoothing, i.e., information becomes excessively averaged as the number of hidden layers…

Machine Learning · Computer Science 2024-01-23 Jaeyoon Sim, Sooyeon Jeon, InJun Choi, Guorong Wu, Won Hwa Kim

Distribution Regression for Sequential Data

Distribution regression refers to the supervised learning problem where labels are only available for groups of inputs instead of individual inputs. In this paper, we develop a rigorous mathematical framework for distribution regression…

Machine Learning · Computer Science 2021-09-30 Maud Lemercier, Cristopher Salvi, Theodoros Damoulas, Edwin V. Bonilla, Terry Lyons

Long-Tailed Recognition on Binary Networks by Calibrating A Pre-trained Model

Deploying deep models in real-world scenarios entails a number of challenges, including computational efficiency and real-world (e.g., long-tailed) data distributions. We address the combined challenge of learning long-tailed distributions…

Computer Vision and Pattern Recognition · Computer Science 2024-04-02 Jihun Kim, Dahyun Kim, Hyungrok Jung, Taeil Oh, Jonghyun Choi

Learning Sets with Separating Kernels

We consider the problem of learning a set from random samples. We show how relevant geometric and topological properties of a set can be studied analytically using concepts from the theory of reproducing kernel Hilbert spaces. A new kind of…

Machine Learning · Statistics 2014-11-26 Ernesto De Vito, Lorenzo Rosasco, Alessandro Toigo

Learning Theory for Distribution Regression

We focus on the distribution regression problem: regressing to vector-valued outputs from probability measures. Many important machine learning and statistical tasks fit into this framework, including multi-instance learning and point…

Statistics Theory · Mathematics 2016-10-24 Zoltan Szabo, Bharath Sriperumbudur, Barnabas Poczos, Arthur Gretton

Deep Kernel Learning for Clustering

We propose a deep learning approach for discovering kernels tailored to identifying clusters over sample data. Our neural network produces sample embeddings that are motivated by--and are at least as expressive as--spectral clustering. Our…

Machine Learning · Computer Science 2020-01-03 Chieh Wu, Zulqarnain Khan, Yale Chang, Stratis Ioannidis, Jennifer Dy

How isotropic kernels perform on simple invariants

We investigate how the training curve of isotropic kernel methods depends on the symmetry of the task to be learned, in several settings. (i) We consider a regression task, where the target function is a Gaussian random field that depends…

Machine Learning · Computer Science 2020-12-16 Jonas Paccolat, Stefano Spigler, Matthieu Wyart

Distribution-Dependent Sample Complexity of Large Margin Learning

We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L2 regularization: We introduce the margin-adapted dimension, which is a simple function of the second order statistics of…

Machine Learning · Statistics 2013-09-19 Sivan Sabato, Nathan Srebro, Naftali Tishby

Learning Mixtures of Discrete Product Distributions using Spectral Decompositions

We study the problem of learning a distribution from samples, when the underlying distribution is a mixture of product distributions over discrete domains. This problem is motivated by several practical applications such as crowd-sourcing,…

Machine Learning · Statistics 2014-05-20 Prateek Jain, Sewoong Oh

Learning with invariances in random features and kernel models

A number of machine learning tasks entail a high degree of invariance: the data distribution does not change if we act on the data with a certain group of transformations. For instance, labels of images are invariant under translations of…

Machine Learning · Statistics 2021-03-01 Song Mei, Theodor Misiakiewicz, Andrea Montanari

Two-stage Sampled Learning Theory on Distributions

We focus on the distribution regression problem: regressing to a real-valued response from a probability distribution. Although there exist a large number of similarity measures between distributions, very little is known about their…

Statistics Theory · Mathematics 2015-01-28 Zoltan Szabo, Arthur Gretton, Barnabas Poczos, Bharath Sriperumbudur

A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

Learning to sample from intractable distributions over discrete sets without relying on corresponding training data is a central problem in a wide range of fields, including Combinatorial Optimization. Currently, popular deep learning-based…

Machine Learning · Computer Science 2025-08-25 Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner

Scalable Kernel Learning via the Discriminant Information

Kernel approximation methods create explicit, low-dimensional kernel feature maps to deal with the high computational and memory complexity of standard techniques. This work studies a supervised kernel learning methodology to optimize such…

Machine Learning · Computer Science 2020-02-17 Mert Al, Zejiang Hou, Sun-Yuan Kung

Generative Learning of Densities on Manifolds

A generative modeling framework is proposed that combines diffusion models and manifold learning to efficiently sample data densities on manifolds. The approach utilizes Diffusion Maps to uncover possible low-dimensional underlying (latent)…

Machine Learning · Computer Science 2025-04-22 Dimitris G. Giovanis, Ellis Crabtree, Roger G. Ghanem, Ioannis G. Kevrekidis

Dimension-free Concentration Bounds on Hankel Matrices for Spectral Learning

Learning probabilistic models over strings is an important issue for many applications. Spectral methods propose elegant solutions to the problem of inferring weighted automata from finite samples of variable-length strings drawn from an…

Machine Learning · Computer Science 2013-12-24 François Denis, Mattias Gybels, Amaury Habrard

Ties, Tails and Spectra: On Rank-Based Dependency Measures in High Dimensions

This work is concerned with the limiting spectral distribution of rank-based dependency measures in high dimensions. We provide distribution-free results for multivariate empirical versions of Kendall's $\tau$ and Spearman's $\rho$ in a…

Statistics Theory · Mathematics 2025-08-22 Nina Dörnemann, Michael Fleermann, Johannes Heiny