Related papers: Delicatessen: M-Estimation in Python
We present an open source Python 3 library aimed at practitioners of molecular simulation, especially Monte Carlo simulation. The aims of the library are to facilitate the generation of simulation data for a wide range of problems; and to…
M-estimation, or estimating equation, methods are widely applicable for point estimation and asymptotic inference. In this paper, we present an R package that can find roots and compute the empirical sandwich variance estimator for any set…
Performance assessment is a key issue in the process of proposing new machine learning/statistical estimators. A possible method to complete such a task is by using simulation studies, which can be defined as the procedure of estimating and…
The direpack package aims to establish a set of modern statistical dimension reduction techniques into the Python universe as a single, consistent package. The dimension reduction methods included fall into three categories: projection…
We introduce an algorithm that simplifies the construction of efficient estimators, making them accessible to a broader audience. 'Dimple' takes as input computer code representing a parameter of interest and outputs an efficient estimator.…
Dealing with biased data samples is a common task across many statistical fields. In survey sampling, bias often occurs due to unrepresentative samples. In causal studies with observational data, the treated versus untreated group…
Machine learning applications, especially in the fields of medicine and social sciences, are slowly being subjected to increasing scrutiny. Similarly to sample size planning performed in clinical and social studies, lawmakers and…
We mainly study the M-estimation method for the high-dimensional linear regression model, and discuss the properties of the M-estimator when the penalty term is the local linear approximation. In fact, the M-estimation method is a framework, which…
TextDescriptives is a Python package for calculating a large variety of metrics from text. It is built on top of spaCy and can be easily integrated into existing workflows. The package has already been used for analysing the linguistic…
DeeProb-kit is a unified library written in Python consisting of a collection of deep probabilistic models (DPMs) that are tractable and exact representations for the modelled probability distributions. The availability of a representative…
Mechanistic models are important tools to describe and understand biological processes. However, they typically rely on unknown parameters, the estimation of which can be challenging for large and complex systems. We present pyPESTO, a…
Surveys are an important research tool, providing unique measurements on subjective experiences such as sentiment and opinions that cannot be measured by other means. However, because survey data is collected from a self-selected group of…
We present dynesty, a public, open-source, Python package to estimate Bayesian posteriors and evidences (marginal likelihoods) using Dynamic Nested Sampling. By adaptively allocating samples based on posterior structure, Dynamic Nested…
The classification problem's complexity assessment is an essential element of many topics in the supervised learning domain. It plays a significant role in meta-learning -- becoming the basis for determining meta-attributes or…
Large language models (LLMs) have been explored in a variety of reasoning tasks including solving of mathematical problems. Each math dataset typically includes its own specially designed evaluation script, which, while suitable for its…
Meta-analysis is a data aggregation method that establishes an overall and objective level of evidence based on the results of several studies. It is necessary to maintain a high level of homogeneity in the aggregation of data collected…
The fast-paced development of machine learning (ML) methods coupled with its increasing adoption in research poses challenges for researchers without extensive training in ML. In neuroscience, for example, ML can help understand…
A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…
anesthetic is a Python package for processing nested sampling runs, and will be useful for any scientist or statistician who uses nested sampling software. anesthetic unifies many existing tools and techniques in an extensible framework…
We introduce a Python framework designed to automate the most common tasks associated with the extraction and upscaling of the statistics of single-impact crater functions to inform coefficients of continuum equations describing surface…