Related papers: The potential and perils of preprocessing: Buildin…

200 papers

The continuous increase of data generated provides enormous possibilities for both public and private companies. The management of this mass of data, or big data, will play a crucial role in the society of the future, as it finds applications…

Computers and Society · Computer Science · 2015-01-15 · Fatima El Jamiy, Abderrahmane Daif, Mohamed Azouazi, Abdelaziz Marzak

Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-06-26 · Jan S. Rellermeyer, Sobhan Omranian Khorasani, Dan Graur, Apourva Parthasarathy

In microarray technology, a number of critical steps are required to convert the raw measurements into the data relied upon by biologists and clinicians. These data manipulations, referred to as preprocessing, influence the quality of the…

Applications · Statistics · 2009-09-29 · Zhijin Wu, Rafael A. Irizarry

Machine learning has the potential to fuel further advances in data science, but it is greatly hindered by an ad hoc design process, poor data hygiene, and a lack of statistical rigor in model evaluation. Recently, these issues have begun…

Machine Learning · Computer Science · 2021-08-19 · Stella Biderman, Walter J. Scheirer

Data pre-processing is a significant step in machine learning that improves the performance of the model and decreases the running time. This might include dealing with missing values, outlier detection and removal, data augmentation,…

Machine Learning · Computer Science · 2024-09-04 · Ahmed M Salih

Data science requires time-consuming iterative manual activities. In particular, activities such as data selection, preprocessing, transformation, and mining, highly depend on iterative trial-and-error processes that could be sped-up…

Data analysis focuses on harnessing advanced statistics, programming, and machine learning techniques to extract valuable insights from vast datasets. An increasing volume and variety of research emerged, addressing datasets of diverse…

Databases · Computer Science · 2025-01-06 · Chen Liang, Donghua Yang, Zheng Liang, Zhiyu Liang, Tianle Zhang, Boyu Xiao, Yuqing Yang, Wenqi Wang, Hongzhi Wang

A significant portion of the effort involved in advanced process control, process analytics, and machine learning involves acquiring and preparing data. Literature often emphasizes increasingly complex modelling techniques with incremental…

Systems and Control · Electrical Eng. & Systems · 2023-04-07 · Lim C. Siang, Shams Elnawawi, Lee D. Rippon, Daniel L. O'Connor, R. Bhushan Gopaluni

Few major commercial or economic decisions are made today which are not underpinned by analysis using spreadsheets. It is virtually impossible to avoid making mistakes during their drafting and some of these errors remain, unseen and…

Human-Computer Interaction · Computer Science · 2009-08-07 · Angus Dunn

We live in a world where data generation is omnipresent. Innovations in computer hardware in the last few decades coupled with increasingly reliable connectivity among them have fueled this phenomenon. We are constantly creating and…

Human-Computer Interaction · Computer Science · 2018-03-02 · Gourab Mitra

Data science is creating very exciting trends as well as significant controversy. A critical matter for the healthy development of data science in its early stages is to deeply understand the nature of data and data science, and to discuss…

Computers and Society · Computer Science · 2020-07-01 · Longbing Cao

Progress in many domains increasingly benefits from our ability to view the systems through a computational lens, i.e., using computational abstractions of the domains; and our ability to acquire, share, integrate, and analyze disparate…

Data mining is about obtaining new knowledge from existing datasets. However, the data in the existing datasets can be scattered, noisy, and even incomplete. Although lots of effort is spent on developing or fine-tuning data mining models…

Machine Learning · Computer Science · 2019-06-21 · Canchen Li

Statistics is sometimes described as the science of reasoning under uncertainty. Statistical models provide one view of this uncertainty, but what is frequently neglected is the 'invisible' portion of uncertainty: that assumed not to exist…

Methodology · Statistics · 2026-03-18 · Oliver L. Pescott, Robin J. Boyd, Gary D. Powney, Gavin B. Stewart

Advances in AI, and especially machine learning, are increasingly drawing research interest and efforts towards predictive process monitoring, the subfield of process mining (PM) that concerns predicting next events, process outcomes and…

Artificial Intelligence · Computer Science · 2021-07-06 · Hans Weytjens, Jochen De Weerdt

As artificial intelligence and machine learning tools become more accessible, and scientists face new obstacles to data collection (e.g., rising costs, declining survey response rates), researchers increasingly use predictions from…

Machine Learning · Statistics · 2025-12-08 · Stephen Salerno, Kentaro Hoffman, Awan Afiaz, Anna Neufeld, Tyler H. McCormick, Jeffrey T. Leek

The amount of data in the world is expanding rapidly. Every day, huge amounts of data are created by scientific experiments, companies, and end users' activities. These large data sets have been labeled as "Big Data", and their storage,…

Databases · Computer Science · 2020-04-29 · Mahdi Bohlouli, Frank Schulz, Lefteris Angelis, David Pahor, Ivona Brandic, David Atlan, Rosemary Tate

The large amounts of data continuously generated online offer opportunities to identify and analyse trends in various aspects of society. For instance, data from online social media are frequently used as a means of analysing informal…

Social and Information Networks · Computer Science · 2025-03-03 · James Nevin, Salvatore Flavio Pileggi, Michael Lees, Paul Groth

Process mining enables the reconstruction and evaluation of business processes based on digital traces in IT systems. An increasingly important technique in this context is process prediction. Given a sequence of events of an ongoing trace,…

Machine Learning · Computer Science · 2021-06-09 · Dominic A. Neu, Johannes Lahann, Peter Fettke

Bad statistics make research papers unreproducible and misleading. For the most part, the reasons for such misuse of numerical data were found and addressed years ago by experts, and proper practical solutions have been presented…

Other Statistics · Statistics · 2020-10-26 · Farzan Shenavarmasouleh, Hamid R. Arabnia