Related papers: Reprowd: Crowdsourced Data Processing Made Reprodu…
Many research groups aspire to make data and code FAIR and reproducible, yet struggle because the data and code life cycles are disconnected, executable environments are often missing from published work, and technical skill requirements…
The field of human computation and crowdsourcing has historically studied how tasks can be outsourced to humans. However, many tasks previously distributed to human crowds can today be completed by generative AI with human-level abilities,…
Crowdsourcing is the primary means to generate training data at scale, and when combined with sophisticated machine learning algorithms, crowdsourcing is an enabler for a variety of emergent automated applications impacting all spheres of…
Many data mining tasks, such as sentiment analysis and image classification, cannot be completely addressed by automated processes. Crowdsourcing is an effective way to harness the human cognitive ability to process these machine-hard…
Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base construction and protein structure determination all benefit from human input. In…
With the development of mobile social networks, more and more crowdsourced data are generated on the Web or collected from real-world sensing. The fragmented, heterogeneous, and noisy nature of online/offline crowdsourced data, however, makes…
We present CrowdHub, a tool for running systematic evaluations of task designs on top of crowdsourcing platforms. The goal is to support the evaluation process, avoiding potential experimental biases that, according to our empirical…
Data fusion has played an important role in data mining because high-quality data is required in a lot of applications. As on-line data may be out-of-date and errors in the data may propagate with copying and referring between sources, it…
Reproducibility is widely acknowledged as a fundamental principle in scientific research. Currently, the scientific community grapples with numerous challenges associated with reproducibility, often referred to as the "reproducibility…
Big data have the characteristics of enormous volume, high velocity, diversity, value-sparsity, and uncertainty, which make learning knowledge from them challenging. With the emergence of crowdsourcing, versatile information can…
The reproduction and replication of research results have become a major issue for a number of scientific disciplines. In computer science and related computational disciplines such as systems biology, the challenges closely revolve around…
Crowdsourcing is rapidly evolving and applied in situations where ideas, labour, opinion or expertise of large groups of people are used. Crowdsourcing is now used in various policy-making initiatives; however, this use has usually focused…
Reproducibility in the computational sciences has been stymied because of the complex and rapidly changing computational environments in which modern research takes place. While many will espouse reproducibility as a value, the challenge of…
Recent reproducibility case studies have raised concerns showing that much of the deposited research has not been reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate…
Crowd-sourcing is a powerful solution for finding correct answers to expensive and unanswered queries in databases, including those with uncertain and incomplete data. Attempts to use crowd-sourcing to exploit human abilities to process…
The drive for reproducibility in the computational sciences has provoked discussion and effort across a broad range of perspectives: technological, legislative/policy, education, and publishing. Discussion on these topics is not new, but…
The reproducibility of scientific findings is an important hallmark of quality and integrity in research. The scientific method requires hypotheses to be subjected to the most crucial tests, and for the results to be consistent across…
High-quality and large-scale data are key to success for AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly annotation interface; (2) training enough…
In recent years, imitation learning from large-scale human demonstrations has emerged as a promising paradigm for training robot policies. However, the burden of collecting large quantities of human demonstrations is significant in terms of…
Reproducibility has been consistently identified as an important component of scientific research. Although there is a general consensus on the importance of reproducibility along with the other commonly used 'R' terminology (i.e.,…