Machine Learning
Offline reinforcement learning is important in many settings where observational data are available but new policies cannot be deployed online due to safety, cost, and other concerns. Many recent advances in causal inference and machine…
Feature importance scores are ubiquitous tools for understanding the predictions of machine learning models. However, many popular attribution methods suffer from high instability due to random sampling. Leveraging novel ideas from…
In this paper, we introduce Randomized Q-learning (RandQL), a novel randomized model-free algorithm for regret minimization in episodic Markov Decision Processes (MDPs). To the best of our knowledge, RandQL is the first tractable model-free…
Tukey's depth (or halfspace depth) is a widely used measure of centrality for multivariate data. However, exact computation of Tukey's depth is known to be a hard problem in high dimensions. As a remedy, randomized approximations of Tukey's…
The increasing availability of granular and big data on various objects of interest has made it necessary to develop methods for condensing this information into a representative and intelligible map. Financial regulation is a field that…
Representation multi-task learning (MTL) has achieved tremendous success in practice. However, the theoretical understanding of these methods is still lacking. Most existing theoretical works focus on cases where all tasks share the same…
The separation of performance metrics from gradient-based loss functions may not always give optimal results and may miss vital aggregate information. This paper investigates incorporating a performance metric alongside differentiable loss…
Discriminative analysis in neuroimaging by means of deep/machine learning techniques is usually tested with validation techniques, whereas the associated statistical significance remains largely under-developed due to their computational…
Inducing-point-based sparse variational Gaussian processes have become the standard workhorse for scaling up GP models. Recent advances show that these methods can be improved by introducing a diagonal scaling matrix to the conditional…
In this paper, we explore knowledge transfer in the setting of matrix completion, which aims to enhance the estimation of a low-rank target matrix using available auxiliary data. We propose a transfer learning procedure given prior…
The adaptive Iterative Soft-Thresholding Algorithm (ISTA) has been a popular algorithm for finding a desirable solution to the LASSO problem without explicitly tuning the regularization parameter $\lambda$. Although the adaptive ISTA is…
We present the OmniMatch algorithm for seeded multiple graph matching. In the setting of $d$-dimensional Random Dot Product Graphs (RDPG), we prove that under mild assumptions, OmniMatch with $s$ seeds asymptotically and efficiently…
Regularized models are often sensitive to the scales of the features in the data and it has therefore become standard practice to normalize (center and scale) the features before fitting the model. But there are many different ways to…
Machine learning models are often brittle under distribution shift, i.e., when data distributions at test time differ from those during training. Understanding this failure mode is central to identifying and mitigating safety risks of mass…
In the field of operational modal analysis (OMA), obtained modal information is frequently used to assess the current state of aerospace, mechanical, offshore and civil structures. However, the stochasticity of operational systems and the…
In this paper, we propose a novel self-supervised transfer learning method called Distribution Matching (DM), which drives the representation distribution toward a predefined reference distribution…
This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$. We establish a…
Merging datasets across institutions is a lengthy and costly procedure, especially when it involves private information. Data hosts may therefore want to prospectively gauge which datasets are most beneficial to merge with, without…
We introduce a novel capacity measure 2sED for statistical models based on the effective dimension. The new quantity provably bounds the generalization error under mild assumptions on the model. Furthermore, simulations on standard data…
Empirical studies have widely demonstrated that neural networks are highly sensitive to small, adversarial perturbations of the input. The worst-case robustness against these so-called adversarial examples can be quantified by the Lipschitz…