Related papers: Distribution Regression for Sequential Data

200 papers

We focus on the distribution regression problem: regressing to vector-valued outputs from probability measures. Many important machine learning and statistical tasks fit into this framework, including multi-instance learning and point…

Statistics Theory · Mathematics 2016-10-24 Zoltan Szabo, Bharath Sriperumbudur, Barnabas Poczos, Arthur Gretton

This paper addresses the problem of distributed learning under communication constraints, motivated by distributed signal processing in wireless sensor networks and data mining with distributed databases. After formalizing a general model…

Machine Learning · Computer Science 2016-11-15 Joel B. Predd, Sanjeev R. Kulkarni, H. Vincent Poor

We focus on the distribution regression problem: regressing to a real-valued response from a probability distribution. Although there exist a large number of similarity measures between distributions, very little is known about their…

Statistics Theory · Mathematics 2015-01-28 Zoltan Szabo, Arthur Gretton, Barnabas Poczos, Bharath Sriperumbudur

We present a novel framework for kernel learning with sequential data of any kind, such as time series, sequences of graphs, or strings. Our approach is based on signature features which can be seen as an ordered variant of sample…

Machine Learning · Statistics 2016-02-01 Franz J Király, Harald Oberhauser

Distribution Regression (DR) on stochastic processes describes the learning task of regression on collections of time series. Path signatures, a technique prevalent in stochastic analysis, have been used to solve the DR problem. Recent…

Machine Learning · Computer Science 2024-10-15 Andrew Alden, Carmine Ventre, Blanka Horvath

Distributed learning is an effective way to analyze big data. In distributed regression, a typical approach is to divide the big data into multiple blocks, apply a base regression algorithm on each of them, and then simply average the…

Machine Learning · Computer Science 2017-08-08 Zhengchu Guo, Lei Shi, Qiang Wu

In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of…

Machine Learning · Computer Science 2018-10-16 Otkrist Gupta, Ramesh Raskar

In supervised learning with distributional inputs in the two-stage sampling setup, relevant to applications like learning-based medical screening or causal learning, the inputs (which are probability distributions) are not accessible in the…

Machine Learning · Computer Science 2026-01-22 Christian Fiedler

This work studies the problem of learning under both large datasets and large-dimensional feature space scenarios. The feature information is assumed to be spread across agents in a network, where each agent observes some of the features.…

Multiagent Systems · Computer Science 2020-05-26 Bicheng Ying, Kun Yuan, Ali H. Sayed

Distributed learning is the problem of inferring a function in the case where training data is distributed among multiple geographically separated sources. Particularly, the focus is on designing learning strategies with low computational…

Machine Learning · Statistics 2016-07-22 Simone Scardapane

Many machine learning algorithms have been developed under the assumption that data sets are already available in batch form. Yet in many application domains data is only available sequentially over time via compute nodes in different…

Optimization and Control · Mathematics 2020-09-10 Alfredo Garcia, Luochao Wang, Jeff Huang, Lingzhou Hong

In the problem of domain generalization (DG), there are labeled training data sets from several related prediction problems, and the goal is to make accurate predictions on future unlabeled data sets that are not known to the learner. This…

Machine Learning · Statistics 2021-01-08 Gilles Blanchard, Aniket Anand Deshmukh, Urun Dogan, Gyemin Lee, Clayton Scott

Distributed statistical learning problems arise commonly when dealing with large datasets. In this setup, datasets are partitioned over machines, which compute locally, and communicate short messages. Communication is often the bottleneck.…

Statistics Theory · Mathematics 2022-10-25 Edgar Dobriban, Yue Sheng

Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level. Current approaches, however, do not…

Machine Learning · Statistics 2021-01-18 Ho Chung Leon Law, Danica J. Sutherland, Dino Sejdinovic, Seth Flaxman

We consider a family of problems that are concerned about making predictions for the majority of unlabeled, graph-structured data samples based on a small proportion of labeled samples. Relational information among the data samples, often…

Machine Learning · Computer Science 2019-11-05 Jiaqi Ma, Weijing Tang, Ji Zhu, Qiaozhu Mei

There is growing evidence that converting targets to soft targets in supervised learning can provide considerable gains in performance. Much of this work has considered classification, converting hard zero-one values to soft labels---such…

Machine Learning · Statistics 2018-06-13 Ehsan Imani, Martha White

This work proposes a novel method for semi-supervised learning from partially labeled massive network-structured datasets, i.e., big data over networks. We model the underlying hypothesis, which relates data points to labels, as a graph…

Machine Learning · Computer Science 2017-05-16 Alexander Jung, Alfred O. Hero, Alexandru Mara, Saeed Jahromi

In this paper, we aim at establishing an approximation theory and a learning theory of distribution regression via a fully connected neural network (FNN). In contrast to the classical regression methods, the input variables of distribution…

Machine Learning · Statistics 2023-07-10 Zhongjie Shi, Zhan Yu, Ding-Xuan Zhou

Standard supervised machine learning assumes that the distribution of the source samples used to train an algorithm is the same as the one of the target samples on which it is supposed to make predictions. However, as any data scientist…

Machine Learning · Computer Science 2020-02-12 Pirmin Lemberger, Ivan Panico

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…

Machine Learning · Statistics 2021-11-03 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén