Related papers: First Study on Data Readiness Level

200 papers

In addition to the technology readiness level (TRL), the scientific readiness level (SRL) has been introduced as a more authentic and adequate tool for determining the status quo of scientific and scientific-technical projects of fundamental…

Digital Libraries · Computer Science 2024-10-15 Eldar Knar

An ever shorter technology lifecycle engendered the need for assessing new technologies w.r.t. their market readiness. Knowing the Technology readiness level (TRL) of a given target technology proved to be useful to mitigate risks such as…

Computers and Society · Computer Science 2022-01-04 Archana Kumari, Stefan Schiffner, Sandra Schmitz

Application of models to data is fraught. Data-generating collaborators often only have a very basic understanding of the complications of collating, processing and curating data. Challenges include: poor data collection practices, missing…

Databases · Computer Science 2017-05-08 Neil D. Lawrence

Ranking evaluation metrics are a fundamental element of design and improvement efforts in information retrieval. We observe that most popular metrics disregard information portrayed in the scores used to derive rankings, when available.…

Information Retrieval · Computer Science 2016-12-20 Nuno Moniz, Luís Torgo, João Vinagre

Lack of reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production…

Machine Learning · Statistics 2020-02-14 Stephanie C. Y. Chan, Samuel Fishman, John Canny, Anoop Korattikara, Sergio Guadarrama

Current trends in pre-training Large Language Models (LLMs) primarily focus on the scaling of model and dataset size. While the quality of pre-training data is considered an important factor for training powerful LLMs, it remains a nebulous…

Computation and Language · Computer Science 2025-07-04 Brando Miranda, Alycia Lee, Sudharsan Sundar, Allison Casasola, Rylan Schaeffer, Elyas Obbad, Sanmi Koyejo

Reliability quantification of deep reinforcement learning (DRL)-based control is a significant challenge for the practical application of artificial intelligence (AI) in safety-critical systems. This study proposes a method for quantifying…

Systems and Control · Electrical Eng. & Systems 2024-07-22 Hitoshi Yoshioka, Hirotada Hashimoto

Relational databases (RDBs) are widely regarded as the gold standard for storing structured information. Consequently, predictive tasks leveraging this data format hold significant application promise. Recently, Relational Deep Learning…

Machine Learning · Computer Science 2025-12-15 Jakub Peleška, Gustav Šír

Data quality describes the degree to which data meet specific requirements and are fit for use by humans and/or downstream tasks (e.g., artificial intelligence). Data quality can be assessed across multiple high-level concepts called…

Databases · Computer Science 2025-07-24 Vasileios Papastergios, Lisa Ehrlinger, Anastasios Gounaris

The development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. The lack of diligence can lead to technical debt, scope creep and misaligned…

Software Engineering · Computer Science 2020-12-17 Alexander Lavin, Gregory Renard

In this paper, we investigate the retrievability of datasets and publications in a real-life Digital Library (DL). The measure of retrievability was originally developed to quantify the influence that a retrieval system has on the access to…

Information Retrieval · Computer Science 2022-07-22 Dwaipayan Roy, Zeljko Carevic, Philipp Mayr

The development and deployment of machine learning (ML) systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. The lack of diligence can lead to technical debt, scope creep and misaligned…

Causal representation learning (CRL) models aim to transform high-dimensional data into a latent space, enabling interventions to generate counterfactual samples or modify existing data based on the causal relationships among latent…

Machine Learning · Computer Science 2026-03-19 Alireza Sadeghi, Wael AbdAlmageed

Understanding the effect of uncertainty and noise in data on machine learning models (MLM) is crucial in developing trust and measuring performance. In this paper, a new model is proposed to quantify uncertainties and noise in data on MLMs.…

Machine Learning · Computer Science 2024-12-10 Usman Anjum, Chris Trentman, Elrod Caden, Justin Zhan

Selecting the optimal resolution for discretizing high-dimensional data is a central problem in physics and data analysis, particularly in unsupervised settings where the underlying distribution is unknown. The Relevance-Resolution…

Statistical Mechanics · Physics 2026-03-06 Margherita Mele, Daniel Campos Moreno, Raffaello Potestio

The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently. In this paper, we present a dynamic, multiparameter…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-27 Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar

Data Linkage is an important step that can provide valuable insights for evidence-based decision making, especially for crucial events. Performing sensible queries across heterogeneous databases containing millions of records is a complex…

Databases · Computer Science 2015-10-09 Mohammed Gollapalli

We provide a definition for class density that can be used to measure the aggregate similarity of the samples within each of the classes in a high-dimensional, unstructured dataset. We then put forth several candidate methods for…

Machine Learning · Computer Science 2022-02-09 Adam Byerly, Tatiana Kalganova

We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a…

Machine Learning · Computer Science 2017-11-28 Jacob Steinhardt, Moses Charikar, Gregory Valiant

Understanding geometric properties of natural language processing models' latent spaces allows the manipulation of these properties for improved performance on downstream tasks. One such property is the amount of data spread in a model's…

Machine Learning · Computer Science 2023-08-02 Anna C. Marbut, Katy McKinney-Bock, Travis J. Wheeler