Promit Ghosal
Protein-protein interactions (PPIs) govern nearly all cellular processes, yet computational methods for identifying binding partners typically produce ranked predictions without mechanistic justification. This creates a fundamental barrier…
We derive finite-particle rates for the regularized Stein variational gradient descent (R-SVGD) algorithm introduced by He et al. (2024), which corrects the constant-order bias of SVGD by applying a resolvent-type preconditioner to the…
A critical step toward the reliable use of large language models (LLMs) in healthcare is attributing predictions to their training data, akin to a medical case study. This requires token-level precision: pinpointing not just which training examples…
Single-cell RNA sequencing (scRNA-seq) enables the study of cellular heterogeneity. Yet, clustering accuracy, and with it downstream analyses based on cell labels, remain challenging due to measurement noise and biological variability. In…
Stochastic growth models in the Kardar-Parisi-Zhang (KPZ) universality class exhibit remarkable fluctuation phenomena. While a variety of powerful methods have led to a detailed understanding of their typical fluctuations or large…
Efficient matrix trace estimation is essential for scalable computation of log-determinants, matrix norms, and distributional divergences. In many large-scale applications, the matrices involved are too large to store or access in full,…
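To illustrate trace estimation when a matrix can only be accessed through matrix-vector products, here is a minimal sketch of the classical Hutchinson estimator with Rademacher probes. This is a standard textbook technique, not necessarily the method of the paper above; the names `hutchinson_trace` and `matvec`, and the 3x3 test matrix, are assumptions chosen for illustration.

```python
import random

def hutchinson_trace(matvec, dim, n_probes=2000, seed=0):
    """Estimate trace(A) using only products z -> A z (A is never stored here)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_probes):
        # Rademacher probe: independent +/-1 entries, so E[z z^T] = I
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)
        # z^T A z is an unbiased estimate of trace(A)
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / n_probes

# Hypothetical small symmetric matrix, used only to check the estimator.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

def matvec(z):
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

estimate = hutchinson_trace(matvec, dim=3)
exact = sum(A[i][i] for i in range(3))
```

The estimate is random, so it only approximates `exact`; its error shrinks at the usual Monte Carlo rate as `n_probes` grows.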
Stochastic gradient descent (SGD) has emerged as the quintessential method in a data scientist's toolbox. Using SGD for high-stakes applications requires, however, careful quantification of the associated uncertainty. Towards that end, in…
Stochastic Gradient Descent (SGD) has become a cornerstone method in modern data science. However, deploying SGD in high-stakes applications necessitates rigorous quantification of its inherent uncertainty. In this work, we establish…
We investigate the small regularization limit of entropic optimal transport when the cost function is the Euclidean distance in dimensions $d > 1$, and the marginal measures are absolutely continuous with respect to the Lebesgue measure.…
Distinguishing cause and effect from bivariate observational data is a foundational problem in many disciplines, but challenging without additional assumptions. Additive noise models (ANMs) are widely used to enable sample-efficient…
Inferring causal relationships from observational data is crucial when experiments are costly or infeasible. Additive noise models (ANMs) enable unique directed acyclic graph (DAG) identification, but existing sample-efficient ANM methods…
We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernelized Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is that the time derivative of the…
In-context learning (ICL), the ability of transformer-based models to perform new tasks from examples provided at inference time, has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms…
We study Sinkhorn's algorithm for solving the entropically regularized optimal transport problem. Its iterate $\pi_{t}$ is shown to satisfy $H(\pi_{t}|\pi_{*})+H(\pi_{*}|\pi_{t})=O(t^{-1})$ where $H$ denotes relative entropy and $\pi_{*}$…
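For readers unfamiliar with the iteration, here is a minimal sketch of Sinkhorn's algorithm for discrete marginals, assuming the standard alternating row and column rescaling of the Gibbs kernel $K_{ij} = e^{-c_{ij}/\varepsilon}$. The function name `sinkhorn`, the toy points, and the choice `eps=0.5` are illustrative assumptions, not taken from the paper.

```python
import math

def sinkhorn(mu, nu, cost, eps, n_iters=500):
    """Return the coupling obtained after n_iters Sinkhorn iterations."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K_ij = exp(-c_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match mu, then columns to match nu
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Coupling pi_ij = u_i * K_ij * v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example: uniform marginals on points of the real line, distance cost.
xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5]
cost = [[abs(x - y) for y in ys] for x in xs]
mu = [1 / 3] * 3
nu = [1 / 2] * 2

plan = sinkhorn(mu, nu, cost, eps=0.5)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(2)]
```

Because each pass ends with the column rescaling, the column marginal matches `nu` up to rounding after every iteration, while the row marginal approaches `mu` only as the iteration count grows.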
Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose…
In this paper we propose and study a class of nonparametric, yet interpretable measures of association between two random vectors $X$ and $Y$ taking values in $\mathbb{R}^{d_1}$ and $\mathbb{R}^{d_2}$ respectively ($d_1, d_2\ge 1$). These…
We investigate the probability that a random polynomial with independent, mean-zero and finite variance coefficients has no real zeros. Specifically, we consider a random polynomial of degree $2n$ with coefficients given by an i.i.d.…
In 1986, Zamolodchikov conjectured an exponential structure for the semi-classical limit of conformal blocks on a sphere. This paper provides a rigorous proof of the analog of Zamolodchikov's conjecture for Liouville conformal blocks on a…
We derive high-dimensional scaling limits and fluctuations for the online least-squares Stochastic Gradient Descent (SGD) algorithm by taking the properties of the data generating model explicitly into consideration. Our approach treats the…
Virasoro conformal blocks are a family of important functions defined as power series via the Virasoro algebra. They are a fundamental input to the conformal bootstrap program for 2D conformal field theory (CFT) and are closely related to…