机器学习
Detecting anomalies in datasets is a longstanding problem in machine learning. In this context, anomalies are defined as a sample that significantly deviates from the remaining data. Meanwhile, optimal transport (OT) is a field of…
Transformers have achieved significant success in various fields, notably excelling in tasks involving sequential data like natural language processing. Despite these achievements, the theoretical understanding of transformers' capabilities…
With the emergence of large-scale pre-trained neural networks, methods to adapt such "foundation" models to data-limited downstream tasks have become a necessity. Fine-tuning, preference optimization, and transfer learning have all been…
The explicit regularization and optimality of deep neural networks estimators from independent data have made considerable progress recently. The study of such properties on dependent data is still a challenge. In this paper, we carry out…
We study the Pareto Set Identification (PSI) problem in a structured multi-output linear bandit model. In this setting, each arm is associated a feature vector belonging to $\mathbb{R}^h$, and its mean vector in $\mathbb{R}^d$ linearly…
First-order methods in convex optimization offer low per-iteration cost but often suffer from slow convergence, while second-order methods achieve fast local convergence at the expense of costly Hessian inversions. In this paper, we…
Theoretical works on supervised transfer learning (STL) -- where the learner has access to labeled samples from both source and target distributions -- have for the most part focused on statistical aspects of the problem, while efficient…
Deep reinforcement learning (RL) has gained widespread adoption in recent years but faces significant challenges, particularly in unknown and complex environments. Among these, high-dimensional action selection stands out as a critical…
This thesis focuses on the discovery of stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs) from noisy and discrete time series. A major challenge is selecting the simplest possible correct model…
In the famous Two Cultures paper, Leo Breiman provided a visionary perspective on the cultures of ''data models'' (modeling with consideration of data generation) versus ''algorithmic models'' (vanilla machine learning models). I provide a…
The success of denoising diffusion models raises important questions regarding their generalisation behaviour, particularly in high-dimensional settings. Notably, it has been shown that when training and sampling are performed perfectly,…
In this work, we propose a novel methodology for robustly estimating particle size distributions from optical scattering measurements using constrained Gaussian process regression. The estimation of particle size distributions is commonly…
Causal forest methods are powerful tools in causal inference. Similar to traditional random forest in machine learning, causal forest independently considers each causal tree. However, this independence consideration increases the…
The rise of generative AI search engines is disrupting traditional SEO, with Gartner predicting 25% reduction in conventional search usage by 2026. This necessitates new approaches for web content visibility in AI-driven search…
State space models, such as Mamba, have recently garnered attention in time series forecasting due to their ability to capture sequence patterns. However, in electricity consumption benchmarks, Mamba forecasts exhibit a mean error of…
In data mining, when binary prediction rules are used to predict a binary outcome, many performance measures are used in a vast array of literature for the purposes of evaluation and comparison. Some examples include classification…
Analyzing relationships between objects is a pivotal problem within data science. In this context, Dimensionality reduction (DR) techniques are employed to generate smaller and more manageable data representations. This paper proposes a new…
This paper extends the universal approximation property of single-hidden-layer feedforward neural networks beyond compact domains, which is of particular interest for the approximation within weighted $C^k$-spaces and weighted Sobolev…
In this paper, we prove that optimizability of any function F using Gradient Flow from all initializations implies a Poincar\'e Inequality for Gibbs measures mu_{beta} = e^{-beta F}/Z at low temperature. In particular, under mild regularity…
We consider the problem of estimating graph limits, known as graphons, from observations of sequences of sparse finite graphs. In this paper we show a simple method that can shed light on a subset of sparse graphs. The method involves…