Related papers: Efficient Web-based Data Imputation with Graph Mod…

GIG: Graph Data Imputation With Graph Differential Dependencies

Data imputation addresses the challenge of imputing missing values in database instances, ensuring consistency with the overall semantics of the dataset. Although several heuristics that rely on statistical methods and ad-hoc rules have…

Artificial Intelligence · Computer Science 2024-10-22 Jiang Hua, Michael Bewong, Selasi Kwashie, MD Geaur Rahman, Junwei Hu, Xi Guo, Zaiwen Fen

GEDI: A Graph-based End-to-end Data Imputation Framework

Data imputation is an effective way to handle missing data, which is common in practical applications. In this study, we propose and test a novel data imputation process that achieves two important goals: (1) preserve the row-wise…

Machine Learning · Computer Science 2023-09-13 Katrina Chen, Xiuqin Liang, Zheng Ma, Zhibin Zhang

Explainable Data Imputation using Constraints

Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principal components analysis or…

Artificial Intelligence · Computer Science 2022-05-11 Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal

Certain and Approximately Certain Models for Statistical Learning

Real-world data is often incomplete and contains missing values. To train accurate models over real-world datasets, users need to spend a substantial amount of time and resources imputing and finding proper values for missing data items. In…

Machine Learning · Statistics 2024-03-05 Cheng Zhen, Nischal Aryal, Arash Termehchy, Alireza Aghasi, Amandeep Singh Chabada

Handling Missing Data with Graph Representation Learning

Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values, and label prediction where downstream labels are learned…

Machine Learning · Computer Science 2020-11-02 Jiaxuan You, Xiaobai Ma, Daisy Yi Ding, Mykel Kochenderfer, Jure Leskovec

FCMI: Feature Correlation based Missing Data Imputation

Processed data are insightful, and crude data are obtuse. A serious threat to data reliability is missing values. Such data leads to inaccurate analysis and wrong predictions. We propose an efficient technique to impute the missing value in…

Machine Learning · Computer Science 2021-07-02 Prateek Mishra, Kumar Divya Mani, Prashant Johri, Dikhsa Arya

Graph-theoretic autofill

Imagine a website that asks the user to fill in a web form and -- based on the input values -- derives a relevant figure, for instance an expected salary, a medical diagnosis or the market value of a house. How to deal with missing input…

Human-Computer Interaction · Computer Science 2015-12-11 Michael Mayer, Dominic van der Zypen

Out-of-Vocabulary Embedding Imputation with Grounded Language Information by Graph Convolutional Networks

Due to the ubiquitous use of embeddings as input representations for a wide range of natural language tasks, imputation of embeddings for rare and unseen words is a critical problem in language processing. Embedding imputation involves…

Computation and Language · Computer Science 2020-06-09 Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve

Confidence-Based Feature Imputation for Graphs with Partially Known Features

This paper investigates a missing feature imputation problem for graph learning tasks. Several methods have previously addressed learning tasks on graphs with missing features. However, in cases of high rates of missing features, they were…

Machine Learning · Computer Science 2023-05-30 Daeho Um, Jiwoong Park, Seulki Park, Jin Young Choi

Predicting feature imputability in the absence of ground truth

Data imputation is the most popular method of dealing with missing values, but in most real-life applications, large missing data can occur and it is difficult or impossible to evaluate whether data has been imputed accurately (lack of…

Methodology · Statistics 2020-07-15 Niamh McCombe, Xuemei Ding, Girijesh Prasad, David P. Finn, Stephen Todd, Paula L. McClean, KongFatt Wong-Lin

Flexible Imputation of Incomplete Network Data

Sampled network data are widely used in empirical research because collecting complete network information is costly. However, empirical analyses based on sampled networks may lead to biased estimators. We propose a nonparametric imputation…

Econometrics · Economics 2026-05-12 Ge Sun, Weisheng Zhang

Data Imputation by Pursuing Better Classification: A Supervised Kernel-Based Method

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang, Fan He, Mingzhen He, Kaijie Wang, Xiaolin Huang

Efficient Discovery of Ontology Functional Dependencies

Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Constraint based data cleaning techniques rely on integrity constraints as a benchmark to identify and correct errors. Data values…

Databases · Computer Science 2017-05-25 Sridevi Baskaran, Alexander Keller, Fei Chiang, Golab Lukasz, Jaroslaw Szlichta

Reasoning on Property Graphs with Graph Generating Dependencies

Graph Generating Dependencies (GGDs) informally express constraints between two (possibly different) graph patterns which enforce relationships on both the graph's data (via property value constraints) and its structure (via topological…

Databases · Computer Science 2022-11-02 Larissa C. Shimomura, Nikolay Yakovets, George Fletcher

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

Temporal Graph Functional Dependencies [Extended Version]

Data dependencies have been extended to graphs to characterize topological and value constraints. Existing data dependencies are defined to capture inconsistencies in static graphs. Nevertheless, inconsistencies may occur over evolving…

Databases · Computer Science 2022-07-27 Morteza Alipourlangouri, Adam Mansfield, Fei Chiang, Yinghui Wu

Data Imputation with Iterative Graph Reconstruction

Effective data imputation demands rich latent "structure" discovery capabilities from "plain" tabular data. Recent advances in graph neural networks-based data imputation solutions show their strong structure learning potential by…

Machine Learning · Computer Science 2024-04-16 Jiajun Zhong, Weiwei Ye, Ning Gui

Enhancing Missing Data Imputation through Combined Bipartite Graph and Complete Directed Graph

In this paper, we aim to address a significant challenge in the field of missing data imputation: identifying and leveraging the interdependencies among features to enhance missing data imputation for tabular data. We introduce a novel…

Machine Learning · Computer Science 2024-11-08 Zhaoyang Zhang, Hongtu Zhu, Ziqi Chen, Yingjie Zhang, Hai Shu

Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

Missing data arises when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Methodology · Statistics 2023-08-15 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms

The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data. In this paper, we systematically explore the algorithmic implications of using this measure for optimization. We…

Artificial Intelligence · Computer Science 2018-09-17 Panagiotis Mandros, Mario Boley, Jilles Vreeken