Applied Statistics
Debates about juridical proof are often framed as a conflict between probabilistic approaches and relative plausibility theory (RPT). This paper argues that this opposition rests on a level-of-analysis error. Drawing on Marr's distinction…
Process capability indices such as $C_{pk}$ are widely used for manufacturing decisions, yet are typically applied via deterministic thresholding of finite-sample estimates, ignoring uncertainty and leading to unstable outcomes near the…
As demand-side flexibility becomes increasingly necessary to integrate variable renewable energy, understanding electricity demand composition across different grid levels is essential. However, at regional and national scales, visibility…
The linked micromaps approach was originally developed as an improvement to choropleth maps for displaying statistical summaries connected with spatial areal units, such as countries, states, and counties. Two R packages to create linked…
This study investigates the relationship between longitudinal serum creatinine measurements and the risk of adverse kidney outcomes in paediatric patients with auto-immune disorders at Great Ormond Street Hospital for Children NHS…
The menstrual cycle influences numerous physiological and psychological outcomes, yet standardised, open-source statistical methods for quantifying these cyclic effects remain lacking. We developed mcanalysis, an open-source package in R…
Reliable analysis of migration is critically dependent on the quality and consistency of the underlying data. Indian migration data, primarily derived from decennial census records, are affected by systematic gaps arising from uneven…
Racial differences in authoritarianism are widely used to explain variation in political attitudes, yet it is unclear whether they reflect true latent differences or measurement artifacts. Using anchor-based multi-group confirmatory factor…
Background: Randomized controlled trials (RCTs) are costly, time-consuming, and often infeasible, while treatment-effect estimation from observational data is limited by unobserved confounding. Methods: We developed a three-step framework…
This memo compares two methods, Powerwise (PWR) and the NCAA Power Index (NPI), that aim to rank NCAA Division I, II, and III teams by how much they deserve an invitation to end-of-season championship tournaments. It finds that while the…
Hourly consumption from multiple providers displays pronounced intra-day, intra-week, and annual seasonalities, as well as strong cross-sectional correlations. We introduce a novel approach for forecasting high-dimensional U.S. electricity…
In many problems of data-driven modeling for dynamical systems, the governing equations are not known a priori and must be selected phenomenologically from a large set of candidate interactions and basis functions. In such situations, point…
This study applies the Causal Fairness Analysis (CFA) framework of Plecko and Bareinboim (2024) to decompose the total variation in STEM outcomes attributable to ADHD status into direct, indirect, and spurious components using Pearl's…
Sociological research has framed collective action in science, innovation, and culture as tripartite networks connecting teams of actors, lists of prior works, and sets of labels (e.g., keywords, topics). While methods for multipartite…
Existing studies indicate that complex system degradation is characterized by degradation of multiple dependent parameters. Capturing the dependencies is crucial for accurate degradation modeling and effective degradation control. This work…
Routinely collected data from electronic health records (EHR) provide opportunities to study effects of longitudinal treatment strategies in real-world clinical settings. A challenge presented by EHR data is that frequency of covariate…
An extension of the Ising model is proposed as a viable alternative for data with values $-1$, $0$ and $+1$ in the inverse problem, i.e., estimation of the parameters. This model is called the Blume-Capel (BC) model, adapted from physics…
The summer of 2023 was the second hottest on record, with numerous extreme heatwaves across the globe. Using the Spherical Fourier Neural Operator machine learning (ML) weather model, we generated a massive ensemble of 7,424 weather…
Take a look around you -- in your family, your school or workplace, in the streets -- and you see boys and girls in about equal proportion, with no easily visible gender patterns among siblings. So, to the famous first order of…
We propose a mathematical formalisation of the ``wave model'' originally developed in historical linguistics but with further applications in human sciences. This model assumes new traits appear in a population and spread to nearby…