Related papers: Multiple--Instance Learning: Christoffel Function …
For the distribution regression problem, where a bag of $x$--observations is mapped to a single $y$ value, a one--step solution is proposed. The problem of mapping a random distribution to a random value is transformed into that of mapping a random vector to a random value by…
We show that the empirical Christoffel function associated with a cloud of finitely many points sampled from a distribution can provide a simple tool for supervised classification in data analysis, with good generalization properties.
Christoffel polynomials are classical tools from approximation theory. They can be used to estimate the (compact) support of a measure $\mu$ on $\mathbb{R}^d$ based on its low-degree moments. Recently, they have been applied to problems in…
We illustrate the potential applications in machine learning of the Christoffel function, or more precisely, its empirical counterpart associated with a counting measure uniformly supported on a finite set of points. Firstly, we provide a…
An important mathematical tool in the analysis of dynamical systems is the approximation of the reach set, i.e., the set of states reachable after a given time from a given initial state. This set is difficult to compute for complex systems…
Making informed decisions about model adequacy has been an outstanding issue for regression models with discrete outcomes. Standard assessment tools for such outcomes (e.g. deviance residuals) often show a large discrepancy from the…
We study the distribution regression problem assuming the distribution of distributions has a doubling measure larger than one. First, we explore the geometry of any distribution that has a doubling measure larger than one and build a small…
Distribution regression refers to the supervised learning problem where labels are only available for groups of inputs instead of individual inputs. In this paper, we develop a rigorous mathematical framework for distribution regression…
We focus on the distribution regression problem: regressing to a real-valued response from a probability distribution. Although there exist a large number of similarity measures between distributions, very little is known about their…
The use of neural networks has been very successful in a wide variety of applications. However, it has recently been observed that it is difficult to generalize the performance of neural networks under the condition of distributional shift.…
The increased availability of massive data sets provides a unique opportunity to discover subtle patterns in their distributions, but also imposes overwhelming computational challenges. To fully utilize the information contained in big…
We consider the problem of estimating the support of a measure from a finite, independent, sample. The estimators which are considered are constructed based on the empirical Christoffel function. Such estimators have been proposed for the…
The limitations resulting from the dichotomisation of continuous outcomes have been extensively described. However, the need to present results based on binary outcomes, particularly in health science, remains. Alternatives based on the…
We propose a novel method for estimating nonseparable selection models. We show that, for a given selection function, the potential outcome distributions are nonparametrically identified from the selected outcome distributions and can be…
We focus on the distribution regression problem: regressing to vector-valued outputs from probability measures. Many important machine learning and statistical tasks fit into this framework, including multi-instance learning and point…
We formulate and solve a regression problem with time-stamped distributional data. Distributions are considered as points in the Wasserstein space of probability measures, metrized by the 2-Wasserstein metric, and may represent images,…
Large Reasoning Models have demonstrated remarkable performance with the advancement of test-time scaling techniques, which enhance prediction accuracy by generating multiple candidate responses and selecting the most reliable answer.…
We present an algorithm for data-driven reachability analysis that estimates finite-horizon forward reachable sets for general nonlinear systems using level sets of a certain class of polynomials known as Christoffel functions. The level…
We introduce a statistical physics inspired supervised machine learning algorithm for classification and regression problems. The method is based on the invariances or stability of predicted results when known data is represented as…
In this paper, we explore a method for treating survival analysis as a classification problem. The method uses a "stacking" idea that collects the features and outcomes of the survival data in a large data frame, and then treats it as a…