Related papers: Introducing Partial Matching Approach in Associati…
The quality of training data for knowledge discovery in databases (KDD) and data mining depends upon many factors, but handling missing values is considered to be a crucial factor in overall data quality. Today real world datasets contains…
This paper presents algorithm for missing values imputation in categorical data. The algorithm is based on using association rules and is presented in three variants. Experimental shows better accuracy of missing values imputation using the…
Imputation of missing attribute values in medical datasets for extracting hidden knowledge from medical datasets is an interesting research topic of interest which is very challenging. One cannot eliminate missing values in medical records.…
Missing attribute values are quite common in the datasets available in the literature. Missing values are also possible because all attributes values may not be recorded and hence unavailable due to several practical reasons. For all these…
We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching…
Techniques such as clusterization, neural networks and decision making usually rely on algorithms that are not well suited to deal with missing values. However, real world data frequently contains such cases. The simplest solution is to…
Association Rule mining is one of the most important fields in data mining and knowledge discovery. This paper proposes an algorithm that combines the simple association rules derived from basic Apriori Algorithm with the multiple minimum…
In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…
When tackling real-life datasets, it is common to face the existence of scrambled missing values within data. Considered as 'dirty data', usually it is removed during a pre-processing step. Starting from the fact that 'making up this…
Item nonresponse is frequently encountered in practice. Ignoring missing data can lose efficiency and lead to misleading inference. Fractional imputation is a frequentist approach of imputation for handling missing data. However, the…
Many datasets suffer from missing values due to various reasons,which not only increases the processing difficulty of related tasks but also reduces the accuracy of classification. To address this problem, the mainstream approach is to use…
Missing value imputation is an important practical problem. There is a large body of work on it, but there does not exist any work that formulates the problem in a structured output setting. Also, most applications have constraints on the…
Decision making from data involves identifying a set of attributes that contribute to effective decision making through computational intelligence. The presence of missing values greatly influences the selection of right set of attributes…
Statistical matching is a technique for integrating two or more data sets when information available for matching records for individual participants across data sets is incomplete. Statistical matching can be viewed as a missing data…
Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or imputation. In this…
In classification of incomplete pattern, the missing values can either play a crucial role in the class determination, or have only little influence (or eventually none) on the classification results according to the context. We propose a…
BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for…
Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…
Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…
This paper proposes a new framework of algorithmic recourse (AR) that works even in the presence of missing values. AR aims to provide a recourse action for altering the undesired prediction result given by a classifier. Existing AR methods…