Statistical Methodology
Count data with an excessive number of zeros frequently arise in fields such as economics, medicine, and public health. Traditional count models often fail to adequately handle such data, especially when the relationship between the…
Narratives about economic events and policies are widely recognised as influential drivers of economic and business behaviour. Yet the statistical identification of narrative emergence remains underdeveloped. Narratives evolve gradually,…
We introduce and develop a general paradigm for combining information across diverse data sources. In broad terms, suppose $\phi$ is a parameter of interest, built up via components $\psi_1,\ldots,\psi_k$ from data sources $1,\ldots,k$. The…
This is the guest editors' general introduction to a Special Issue of the Journal of Statistical Planning and Inference, dedicated to confidence distributions and related themes. Confidence distributions (CDs) are distributions for…
Driven by the recent surge in neural-inspired modeling, point processes have gained significant traction in systems and control. While the Hawkes process is the standard model for characterizing random event sequences with memory,…
We provide the first regression framework that simultaneously accommodates responses taking values in a general metric space and predictors lying on a general torus. We propose intrinsic local constant and local linear estimators that…
We construct exact confidence intervals for the average treatment effect in randomized experiments with binary outcomes using sequences of randomization tests. Our approach does not rely on large-sample approximations and is valid for all…
For large model spaces, the potential entrapment of Markov chain Monte Carlo (MCMC) based methods with spike-and-slab priors poses significant challenges in posterior computation in regression models. On the other hand, maximum a posteriori…
Heterogeneous treatment effects (HTEs) are increasingly estimated using machine learning models that produce highly personalized predictions of treatment effects. In practice, however, predicted treatment effects are rarely interpreted,…
We propose an optimal algorithm for estimating conditional average treatment effects (CATEs) when response functions lie in a reproducing kernel Hilbert space (RKHS). We study settings in which the contrast function is structurally simpler…
We develop a version of variational inference for Bayesian count response regression-type models that possesses attractive attributes such as convexity and closed form updates. The convex solution aspect entails numerically stable fitting…
Estimation in exploratory factor analysis often yields estimates on the boundary of the parameter space. Such occurrences, known as Heywood cases, are characterised by non-positive variance estimates and can cause issues in numerical…
When selecting from a list of potential candidates, it is important to ensure not only that the selected items are of high quality, but also that they are sufficiently dissimilar, both to avoid redundancy and to capture a broader range…
Latent class models are widely used for identifying unobserved subgroups from multivariate categorical data in social sciences, with binary data as a particularly popular example. However, accurately recovering individual latent class…
Advances in artificial intelligence (AI) and deep learning have led to neural networks being used to generate lightning-speed answers to complex science questions, paintings in the style of Monet, or stories like those of Twain. Leveraging…
Recent advances in multiplex imaging have enabled researchers to locate different types of cells within a tissue sample. This is especially relevant for tumor immunology, as clinical regimes corresponding to different stages of disease or…
We introduce a generic estimator for the false discovery rate of any model selection procedure, in common statistical modeling settings including the Gaussian linear model, Gaussian graphical model, and model-X setting. We prove that our…
A key output of network meta-analysis (NMA) is the relative ranking of treatments; nevertheless, it has attracted substantial criticism. Existing ranking methods often lack clear interpretability and fail to adequately account for…
Markov state modeling has gained popularity in various scientific fields since it reduces complex time-series data sets into transitions between a few states. Yet common Markov state modeling frameworks assume a single Markov chain…
Heterogeneity is a fundamental characteristic of cancer. To accommodate heterogeneity, subgroup identification has been extensively studied, and existing approaches are broadly categorized into unsupervised and supervised analysis. Compared to unsupervised…