Applied Statistics
We present a framework for the scale-invariance characterization of stochastic processes in reconstructed finite-dimensional phase spaces. This framework analyses the structural and dynamical properties of the phase space and is based on a…
Large-scale portfolio choice is highly sensitive to estimation error, making preliminary asset selection essential in empirical implementation. Existing selection rules typically rely on scalar returns or low-dimensional high-frequency…
Operational hazards in Manufacturing Industrial Internet (MII) systems generate severe data outliers that cripple traditional statistical analysis. This paper proposes a novel robust regression method, DPD-Lasso, which integrates Density…
Clinical assessments for neuromuscular disorders, such as Spinal Muscular Atrophy (SMA) and Duchenne Muscular Dystrophy (DMD), continue to rely on subjective measures to monitor treatment response and disease progression. We introduce a…
The Global Carbon Budget, maintained by the Global Carbon Project, summarizes Earth's global carbon cycle through four annual time series beginning in 1959: atmospheric CO$_2$ concentrations, anthropogenic CO$_2$ emissions, and CO$_2$…
Digital technologies (e.g., mobile phones) can be used to obtain objective, frequent, and real-world digital phenotypes from individuals. However, modeling these data poses substantial challenges since observational data are subject to…
A typical power calculation is performed by replacing unknown population-level quantities in the power function with what is observed in external studies. Many authors and practitioners view this as an assumed value of power and offer the…
This paper discusses wrongful convictions in a medical setting, focusing on nurses. A common feature is the lack of strong direct evidence: the nurse was never seen doing anything wrong. There is no DNA evidence of tampering with apparatus or…
Wind-speed processes exhibit substantial temporal variability and spatial dependence, yet volatility dynamics across monitoring networks remain relatively unexplored. This study investigates the spatiotemporal behaviour of wind-speed…
K-means clustering is widely used in psychological and psychometric research to identify profiles, subgroups, and potential typologies, yet its classical formulation does not test whether such groups exist as latent psychological…
This paper studies the propagation of finite-sample uncertainty under nonlinear transformations commonly used in statistical decision systems. In particular, we consider process capability indices, which are widely used in manufacturing…
A C library for random number generation, Randompack, is presented. The library implements several modern random number generators (engines), including xoshiro256, PCG64, Philox, ranlux++, and sfc64; 14 continuous distributions including…
The link between age and migration propensity is long established, but existing models of country-level net migration ignore the effect of population age distribution on past and projected migration rates. We propose a method to estimate…
The Plackett-Luce model is widely used to deal with probabilities in discrete choice settings. This paper introduces a novel two-level Plackett-Luce model combined with a multinomial logistic scheme that provides the basis for the route…
Weekly healthcare activity data are typically non-negative counts with temporal dependence and occasional system-wide disruptions, settings in which Gaussian time-series models may be inadequate. Solid organ transplant (SOT) activity…
In routine care, individuals identified a priori as high risk are usually tested for conditions more frequently. Protected attributes, such as sex or ethnicity, may also determine testing frequency. Such heterogeneous detection rates across…
Standard cardiovascular risk calculators, including the Framingham Risk Score and the ACC/AHA Pooled Cohort Equations, estimate the conditional probability P(CHD | SysBP = s) rather than the interventional quantity P(CHD | do(SysBP = s)).…
Sampling geographically dispersed minority populations poses substantial challenges when individual group membership cannot be directly observed. Although stratified sampling can offer efficiency gains, these gains are typically modest…
Droughts and flash droughts (rapidly developing droughts; FDs) remain impactful events that are known to desiccate landscapes and destroy crops. In particular, droughts in Africa are often more impactful than in other locations, such as the…
We consider statistical inference for errors-in-variables regression models with dependent observations under the high dimensionality of the error covariance matrix. It is tempting to prewhiten the model and data that had led to efficient…