机器学习
Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high…
Models based on recursive adaptive partitioning such as decision trees and their ensembles are popular for high-dimensional regression as they can potentially avoid the curse of dimensionality. Because empirical risk minimization (ERM) is…
We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing…
Recent advances in neuroimaging analysis have enabled accurate decoding of mental state from brain activation patterns during functional magnetic resonance imaging scans. A commonly applied tool for this purpose is principal components…
Nested logit (NL) has been commonly used for discrete choice analysis, including a wide range of applications such as travel mode choice, automobile ownership, or location decisions. However, the classical NL models are restricted by their…
Survival analysis is a fundamental tool for modeling time-to-event outcomes in healthcare. Recent advances have introduced flexible neural network approaches for improved predictive performance. However, most of these models do not provide…
In this paper, we consider a score-based Integer Programming (IP) approach for solving the Bayesian Network Structure Learning (BNSL) problem. State-of-the-art BNSL IP formulations suffer from the exponentially large number of variables and…
Dropout is a regularization technique widely used in training artificial neural networks to mitigate overfitting. It consists of dynamically deactivating subsets of the network during training to promote more robust representations. Despite…
We begin by briefly surveying some results on the convergence of the Stochastic Gradient Descent (SGD) Method, proved in a companion paper by the present authors. These results are based on viewing SGD as a version of Stochastic…
Since their introduction by Kipf and Welling in $2017$, a primary use of graph convolutional networks is transductive node classification, where missing labels are inferred within a single observed graph and its feature matrix. Despite the…
We propose a novel randomized framework for the estimation problem of large-scale linear statistical models, namely Sequential Least-Squares Estimators with Fast Randomized Sketching (SLSE-FRS), which integrates Sketch-and-Solve and…
Electronic Health Records (EHRs), comprising diverse clinical data such as diagnoses, medications, and laboratory results, hold great promise for translational research. EHR-derived data have advanced disease prevention, improved clinical…
Representation-based multi-task learning (MTL) improves efficiency by learning a shared structure across tasks, but its practical application is often hindered by contamination, outliers, or adversarial tasks. Most existing methods and…
We propose a new inference framework, named MOSAIC, for change-point detection in dynamic networks with the simultaneous low-rank and sparse-change structure. We establish the minimax rate of detection boundary, which relies on the sparsity…
Ranking and selection (R&S) aims to identify the alternative with the best mean performance among $k$ simulated alternatives. The practical value of R&S depends on accurate simulation input modeling, which often suffers from the curse of…
Motivated by the need for rigorous and scalable evaluation of large language models, we study contextual preference inference for pairwise comparison functionals of context-dependent preference score functions across domains. Focusing on…
Cryo-electron microscopy (Cryo-EM) enables high-resolution imaging of biomolecules, but structural heterogeneity remains a major challenge in 3D reconstruction. Traditional methods assume a discrete set of conformations, limiting their…
In-context operator networks (ICON) are a class of operator learning methods based on the novel architectures of foundation models. Trained on a diverse set of datasets of initial and boundary conditions paired with corresponding solutions…
Comparing probability distributions is a core challenge across the natural, social, and computational sciences. Existing methods, such as Maximum Mean Discrepancy (MMD), struggle in high-dimensional and non-compact domains. Here we…
Knowledge Distillation (KD) seeks to transfer the knowledge of a teacher, towards a student neural net. This process is often done by matching the networks' predictions (i.e., their output), but, recently several works have proposed to…