Related papers: Measuring Data

What Do Learned Models Measure?

In many scientific and data-driven applications, machine learning models are increasingly used as measurement instruments, rather than merely as predictors of predefined labels. When the measurement function is learned from data, the…

Machine Learning · Computer Science 2026-01-27 Indrė Žliobaitė

Position: Measure Dataset Diversity, Don't Just Claim It

Machine learning (ML) datasets, often perceived as neutral, inherently encapsulate abstract and disputed social constructs. Dataset curators frequently employ value-laden terms such as diversity, bias, and quality to characterize datasets.…

Machine Learning · Computer Science 2024-07-12 Dora Zhao, Jerone T. A. Andrews, Orestis Papakyriakopoulos, Alice Xiang

Data Quality Measures and Efficient Evaluation Algorithms for Large-Scale High-Dimensional Data

Machine learning has been proven to be effective in various application areas, such as object and speech recognition on mobile systems. Since a critical key to machine learning success is the availability of large training data, many…

Machine Learning · Computer Science 2021-01-06 Hyeongmin Cho, Sangkyun Lee

Measures in Visualization Space

Measurement is an integral part of modern science, providing the fundamental means for evaluation, comparison, and prediction. In the context of visualization, several different types of measures have been proposed, ranging from approaches…

Graphics · Computer Science 2019-09-13 Fabian Bolte, Stefan Bruckner

Insights into Performance Fitness and Error Metrics for Machine Learning

Machine learning (ML) is the field of training machines to achieve a high level of cognition and perform human-like analysis. Since ML is a data-driven approach, it seemingly fits into our daily lives and operations as well as complex and…

Machine Learning · Computer Science 2021-11-25 M. Z. Naser, Amir Alavi

Metrology for AI: From Benchmarks to Instruments

In this paper we present the first steps towards hardening the science of measuring AI systems, by adopting metrology, the science of measurement and its application, and applying it to human (crowd) powered evaluations. We begin with the…

Artificial Intelligence · Computer Science 2019-11-06 Chris Welty, Praveen Paritosh, Lora Aroyo

A Novel Metric for Measuring Data Quality in Classification Applications (extended version)

Data quality is a key element for building and optimizing good learning models. Despite many attempts to characterize data quality, there is still a need for rigorous formalization and an efficient measure of the quality from available…

Machine Learning · Computer Science 2023-12-14 Jouseau Roxane, Salva Sébastien, Samir Chafik

Benchmarks for Detecting Measurement Tampering

When training powerful AI systems to perform complex tasks, it may be challenging to provide training signals which are robust to optimization. One concern is measurement tampering, where the AI system manipulates multiple…

Machine Learning · Computer Science 2023-10-02 Fabien Roger, Ryan Greenblatt, Max Nadeau, Buck Shlegeris, Nate Thomas

Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities

The emergence and continued reliance on the Internet and related technologies has resulted in the generation of large amounts of data that can be made available for analyses. However, humans do not possess the cognitive capabilities to…

Machine Learning · Computer Science 2021-01-12 MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami

A Survey on Large-scale Machine Learning

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang, Weijie Fu, Xiangnan He, Shijie Hao, Xindong Wu

Unfolding Data Quality Dimensions in Practice: A Survey

Data quality describes the degree to which data meet specific requirements and are fit for use by humans and/or downstream tasks (e.g., artificial intelligence). Data quality can be assessed across multiple high-level concepts called…

Databases · Computer Science 2025-07-24 Vasileios Papastergios, Lisa Ehrlinger, Anastasios Gounaris

Measuring the Data

Measuring the Data analytically finds the intrinsic manifold in big data. First, Optimal Transport generates the tangent space at each data point from which the intrinsic dimension is revealed. Then, the Koopman Dimensionality Reduction…

Machine Learning · Computer Science 2025-04-04 Ido Cohen

Data Cleaning and Machine Learning: A Systematic Literature Review

Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML model is highly dependent on the quality of the data it has been trained on, there is a growing…

Machine Learning · Computer Science 2024-06-03 Pierre-Olivier Côté, Amin Nikanjam, Nafisa Ahmed, Dmytro Humeniuk, Foutse Khomh

An Analytical Survey on Recent Trends in High Dimensional Data Visualization

Data visualization is the process by which data of any size or dimensionality is processed to produce an understandable set of data in a lower dimensionality, allowing it to be manipulated and understood more easily by people. The goal of…

Graphics · Computer Science 2021-07-06 Alexander Kiefer, Md. Khaledur Rahman

Data Collection and Labeling Techniques for Machine Learning

Data collection and labeling are critical bottlenecks in the deployment of machine learning applications. With the increasing complexity and diversity of applications, the need for efficient and scalable data collection and labeling…

Databases · Computer Science 2024-07-19 Qianyu Huang, Tongfang Zhao

Data Representativity for Machine Learning and AI Systems

Data representativity is crucial when drawing inference from data through machine learning models. Scholars have increased focus on unraveling the bias and fairness in models, also in relation to inherent biases in the input data. However,…

Machine Learning · Statistics 2023-02-06 Line H. Clemmensen, Rune D. Kjærsgaard

Data Excellence for AI: Why Should You Care

The efficacy of machine learning (ML) models depends on both algorithms and data. Training data defines what we want our models to learn, and testing data provides the means by which their empirical progress is measured. Benchmark datasets…

Machine Learning · Computer Science 2021-11-23 Lora Aroyo, Matthew Lease, Praveen Paritosh, Mike Schaekermann

Towards Measurement Theory for Artificial Intelligence

We motivate and outline a programme for a formal theory of measurement of artificial intelligence. We argue that formalising measurement for AI will allow researchers, practitioners, and regulators to: (i) make comparisons between systems…

Artificial Intelligence · Computer Science 2025-07-09 Elija Perrier

Hybrid data regression modelling in measurement

Measurement involves the determination of quantitative estimates of physical quantities from experiment, along with estimates of their associated uncertainties. Herewith an experimental system model is the key to extracting information from…

Applications · Statistics 2008-09-01 Vladimir B. Bokov

Absolute Evaluation Measures for Machine Learning: A Survey

Machine Learning is a diverse field applied across various domains such as computer science, social sciences, medicine, chemistry, and finance. This diversity results in varied evaluation approaches, making it difficult to compare models…

Machine Learning · Computer Science 2025-07-08 Silvia Beddar-Wiesing, Alice Moallemy-Oureh, Marie Kempkes, Josephine M. Thomas