Related papers: Sampling Issues in Bibliometric Analysis

Comparing People with Bibliometrics

Bibliometric indicators, citation counts and/or download counts are increasingly being used to inform personnel decisions such as hiring or promotions. These statistics are very often misused. Here we provide a guide to the factors which…

Physics and Society · Physics 2018-08-08 Michael J. Kurtz

A Bibliometrics Analysis on 28 years of Authentication and Threat Model Area

The large volume of publications in any research area can make it difficult for researchers to track their research areas' trends, challenges, and characteristics. Bibliometrics solves this problem by bringing statistical tools to help the…

Cryptography and Security · Computer Science 2022-09-28 Wesley dos Reis Bezerra, Cristiano Antônio de Souza, Carla Merkle Westphall, Carlos Becker Westphall

Statistical sensitiveness for science

Research often necessitates samples, yet obtaining large enough samples is not always possible. When it is, the researcher may use one of two methods for deciding upon the required sample size: rules-of-thumb, quick yet uncertain, and…

Methodology · Statistics 2016-04-08 Jose D. Perezgonzalez

Computation of statistical power and sample size for in vivo research models

Sample size calculation is crucial in biomedical in vivo research investigations mainly for two reasons: to design the most resource-efficient studies and to safeguard ethical issues when live animals are subjects of testing. In this…

Applications · Statistics 2025-05-27 Hasan Al-Nashash, Jiajin Wei, Ke Yang, Ayman Alzaatreh, Mohsen Adeli, Tiejun Tong, Angelo All

Efficient and flexible simulation-based sample size determination for clinical trials with multiple design parameters

Simulation offers a simple and flexible way to estimate the power of a clinical trial when analytic formulae are not available. The computational burden of using simulation has, however, restricted its application to only the simplest of…

Methodology · Statistics 2020-12-04 Duncan T. Wilson, Rebecca E. A. Walwyn, Richard Hooper, Julia Brown, Amanda J. Farrin

The use of percentiles and percentile rank classes in the analysis of bibliometric data: Opportunities and limits

Percentiles have been established in bibliometrics as an important alternative to mean-based indicators for obtaining a normalized citation impact of publications. Percentiles have a number of advantages over standard bibliometric…

Digital Libraries · Computer Science 2012-11-05 Lutz Bornmann, Loet Leydesdorff, Ruediger Mutz

A note on efficient audit sample selection

Auditing is a widely used method for quality improvement, and many guidelines are available advising on how to draw samples for auditing. However, researchers or auditors sometimes find themselves in situations that are not straightforward…

Methodology · Statistics 2021-05-25 Laura Boeschoten, Sander Scholtus, Arnout van Delden

Designing efficient randomized trials: power and sample size calculation when using semiparametric efficient estimators

Trials enroll a large number of subjects in order to attain power, making them expensive and time-consuming. Sample size calculations are often performed with the assumption of an unadjusted analysis, even if the trial analysis plan…

Methodology · Statistics 2021-07-06 Alejandro Schuler

Usage Bibliometrics

Scholarly usage data provides unique opportunities to address the known shortcomings of citation analysis. However, the collection, processing and analysis of usage data remains an area of active research. This article provides a review of…

Digital Libraries · Computer Science 2015-05-27 Michael J. Kurtz, Johan Bollen

Do We Really Sample Right In Model-Based Diagnosis?

Statistical samples, in order to be representative, have to be drawn from a population in a random and unbiased way. Nevertheless, it is common practice in the field of model-based diagnosis to make estimations from (biased) best-first…

Artificial Intelligence · Computer Science 2022-08-05 Patrick Rodler, Fatima Elichanova

Importance Sampling: Intrinsic Dimension and Computational Cost

The basic idea of importance sampling is to use independent samples from a proposal measure in order to approximate expectations with respect to a target measure. It is key to understand how many samples are required in order to guarantee…

Computation · Statistics 2017-01-17 S. Agapiou, O. Papaspiliopoulos, D. Sanz-Alonso, A. M. Stuart

Power and Sample Size Calculations for Rerandomization

Power analyses are an important aspect of experimental design, because they help determine how experiments are implemented in practice. It is common to specify a desired level of power and compute the sample size necessary to obtain that…

Methodology · Statistics 2022-12-09 Zach Branson, Xinran Li, Peng Ding

Feature Selection from High-Dimensional Data with Very Low Sample Size: A Cautionary Tale

In classification problems, the purpose of feature selection is to identify a small, highly discriminative subset of the original feature set. In many applications, the dataset may have thousands of features and only a few dozen samples…

Machine Learning · Computer Science 2020-08-28 Ludmila I. Kuncheva, Clare E. Matthews, Álvar Arnaiz-González, Juan J. Rodríguez

A review of Bayesian perspectives on sample size derivation for confirmatory trials

Sample size derivation is a crucial element of the planning phase of any confirmatory trial. A sample size is typically derived based on constraints on the maximal acceptable type I error rate and a minimal desired power. Here, power…

Applications · Statistics 2023-04-17 Kevin Kunzmann, Michael J. Grayling, Kim May Lee, David S. Robertson, Kaspar Rufibach, James M. S. Wason

Usage Bibliometrics as a Tool to Measure Research Activity

Measures for research activity and impact have become an integral ingredient in the assessment of a wide range of entities (individual researchers, organizations, instruments, regions, disciplines). Traditional bibliometric indicators, like…

Digital Libraries · Computer Science 2020-04-22 Edwin A. Henneken, Michael J. Kurtz

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency. While improved network architecture and inference algorithms have been shown to effectively boost sampling efficiency of diffusion…

Computer Vision and Pattern Recognition · Computer Science 2024-12-11 Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar

Counting publications and citations: Is more always better?

Is more always better? We address this question in the context of bibliometric indices that aim to assess the scientific impact of individual researchers by counting their number of highly cited publications. We propose a simple model in…

Digital Libraries · Computer Science 2013-04-03 Ludo Waltman, Nees Jan van Eck, Paul Wouters

Choosing good subsamples for regression modelling

A common problem in health research is that we have a large database with many variables measured on a large number of individuals. We are interested in measuring additional variables on a subsample; these measurements may be newly…

Methodology · Statistics 2022-03-22 Thomas Lumley, Tong Chen

Subsampling and Jackknifing: A Practically Convenient Solution for Large Data Analysis with Limited Computational Resources

Modern statistical analysis often encounters datasets with large sizes. For these datasets, conventional estimation methods can hardly be used immediately because practitioners often suffer from limited computational resources. In most…

Methodology · Statistics 2023-04-14 Shuyuan Wu, Xuening Zhu, Hansheng Wang

A Survey on Sampling and Profiling over Big Data (Technical Report)

Due to the development of internet technology and computer science, data is exploding at an exponential rate. Big data brings us new opportunities and challenges. On the one hand, we can analyze and mine big data to discover hidden…

Databases · Computer Science 2020-05-12 Zhicheng Liu, Aoqian Zhang