Related papers: A PCA-based Data Prediction Method

200 papers

For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a…

Machine Learning · Statistics 2018-06-13 Laura Balzano, Yuejie Chi, Yue M. Lu

Missing data theory deals with the statistical methods in the occurrence of missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Classical Principal Component Analysis (PCA) approximates data in terms of projections on a small number of orthogonal vectors. There are simple procedures to efficiently compute various functions of the data from the PCA approximation. The…

Machine Learning · Statistics 2019-07-26 Guihong Wan, Crystal Maung, Haim Schweitzer

Principal Component Analysis (PCA) is one of the most important methods to handle high dimensional data. However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in…

Machine Learning · Computer Science 2019-03-19 Kai Liu, Qiuwei Li, Hua Wang, Gongguo Tang

The real-time crash likelihood prediction has been an important research topic. Various classifiers, such as support vector machine (SVM) and tree-based boosting algorithms, have been proposed in traffic safety studies. However, few…

Machine Learning · Computer Science 2018-02-13 Jintao Ke, Shuaichao Zhang, Hai Yang, Xiqun Chen

Many techniques for handling missing data have been proposed in the literature. Most of these techniques are overly complex. This paper explores an imputation technique based on rough set computations. In this paper, characteristic…

Computer Vision and Pattern Recognition · Computer Science 2007-05-23 Fulufhelo Vincent Nelwamondo, Tshilidzi Marwala

Accurate predictions of pollutant concentrations at new locations are often of interest in air pollution studies on fine particulate matters (PM$_{2.5}$), in which data is usually not measured at all study locations. PM$_{2.5}$ is also a…

Applications · Statistics 2020-05-19 Phuong T. Vu, Timothy V. Larson, Adam A. Szpiro

The estimation of missing input vector elements in real time processing applications requires a system that possesses the knowledge of certain characteristics such as correlations between variables, which are inherent in the input space.…

Applications · Statistics 2007-05-23 Fulufhelo V. Nelwamondo, Shakir Mohamed, Tshilidzi Marwala

This paper introduces a novel paradigm to impute missing data that combines a decision tree with an auto-associative neural network (AANN) based model and a principal component analysis-neural network (PCA-NN) based model. For each model,…

Applications · Statistics 2007-09-12 George Ssali, Tshilidzi Marwala

When working with tabular data, missingness is always one of the most painful problems. Throughout many years, researchers have continuously explored better and better ways to impute missing data. Recently, with the rapid development…

Machine Learning · Computer Science 2025-09-09 Tin Luu, Binh Nguyen, Man Ngo

Decision trees are a popular family of models due to their attractive properties such as interpretability and ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders performance of machine…

Machine Learning · Computer Science 2020-07-01 Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang, Guy Van den Broeck

While data are the primary fuel for machine learning models, they often suffer from missing values, especially when collected in real-world scenarios. However, many off-the-shelf machine learning models, including artificial neural network…

Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principal component analysis or…

Artificial Intelligence · Computer Science 2022-05-11 Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal

This paper proposes a novel dynamic forecasting method using a new supervised Principal Component Analysis (PCA) when a large number of predictors are available. The new supervised PCA provides an effective way to bridge the gap between…

Econometrics · Economics 2024-06-14 Zhaoxing Gao, Ruey S. Tsay

This tutorial aims to provide signal processing (SP) and machine learning (ML) practitioners with vital tools, in an accessible way, to answer the question: How to deal with missing data? There are many strategies to handle incomplete…

Signal Processing · Electrical Eng. & Systems 2026-01-06 Alexandre Hippert-Ferrer, Aude Sportisse, Amirhossein Javaheri, Mohammed Nabil El Korso, Daniel P. Palomar

In health-pollution cohort studies, accurate predictions of pollutant concentrations at new locations are needed, since the locations of fixed monitoring sites and study participants are often spatially misaligned. For multi-pollution data,…

Applications · Statistics 2022-01-24 Phuong T. Vu, Adam A. Szpiro, Noah Simon

For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the…

Machine Learning · Computer Science 2019-03-05 Anna C. Gilbert, Rishi Sonthalia

We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the…

Methodology · Statistics 2015-08-20 Vincent Audigier, François Husson, Julie Josse

Missing data arises when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Missing data is a commonly occurring problem in practice. Many imputation methods have been developed to fill in the missing entries. However, not all of them can scale to high-dimensional data, especially the multiple imputation…

Machine Learning · Computer Science 2023-03-21 Thu Nguyen, Hoang Thien Ly, Michael Alexander Riegler, Pål Halvorsen, Hugo L. Hammer