Related papers: Table Transformers for Imputing Textual Attributes

Missing Data Imputation using Neural Cellular Automata

When working with tabular data, missingness is always one of the most painful problems. Over many years, researchers have continually explored better ways to impute missing data. Recently, with the rapid development…

Machine Learning · Computer Science 2025-09-09 Tin Luu, Binh Nguyen, Man Ngo

TabImpute: Universal Zero-Shot Imputation for Tabular Data

Missing data is a widespread problem in tabular settings. Existing solutions range from simple averaging to complex generative adversarial networks, but due to each method's large variance in performance across real-world domains and…

Machine Learning · Computer Science 2026-02-18 Jacob Feitelberg, Dwaipayan Saha, Kyuseong Choi, Zaid Ahmad, Anish Agarwal, Raaz Dwivedi

CACTI: Leveraging Copy Masking and Contextual Information to Improve Tabular Data Imputation

We present CACTI, a masked autoencoding approach for imputing tabular data that leverages the structure in missingness patterns and contextual information. Our approach employs a novel median truncated copy masking training strategy that…

Machine Learning · Computer Science 2025-06-04 Aditya Gorla, Ryan Wang, Zhengtong Liu, Ulzee An, Sriram Sankararaman

GEDI: A Graph-based End-to-end Data Imputation Framework

Data imputation is an effective way to handle missing data, which is common in practical applications. In this study, we propose and test a novel data imputation process that achieves two important goals: (1) preserve the row-wise…

Machine Learning · Computer Science 2023-09-13 Katrina Chen, Xiuqin Liang, Zheng Ma, Zhibin Zhang

Tabular Transfer Learning via Prompting LLMs

Learning with a limited number of labeled data is a central problem in real-world applications of machine learning, as it is often expensive to obtain annotations. To deal with the scarcity of labeled data, transfer learning is a…

Computation and Language · Computer Science 2024-08-22 Jaehyun Nam, Woomin Song, Seong Hyeon Park, Jihoon Tack, Sukmin Yun, Jaehyung Kim, Kyu Hwan Oh, Jinwoo Shin

TUTA: Tree-based Transformers for Generally Structured Table Pre-training

Tables are widely used with various structures to organize and present data. Recent attempts at table understanding mainly focus on relational tables, yet overlook other common table structures. In this paper, we propose TUTA, a unified…

Information Retrieval · Computer Science 2021-07-21 Zhiruo Wang, Haoyu Dong, Ran Jia, Jia Li, Zhiyi Fu, Shi Han, Dongmei Zhang

TFWT: Tabular Feature Weighting with Transformer

In this paper, we propose a novel feature weighting method to address the limitation of existing feature processing methods for tabular data. Typically the existing methods assume equal importance across all samples and features in one…

Machine Learning · Computer Science 2024-05-20 Xinhao Zhang, Zaitian Wang, Lu Jiang, Wanfu Gao, Pengfei Wang, Kunpeng Liu

Multiple Imputation with Denoising Autoencoder using Metamorphic Truth and Imputation Feedback

Although data may be abundant, complete data is less so, due to missing columns or rows. This missingness undermines the performance of downstream data products that either omit incomplete cases or create derived completed data for…

Machine Learning · Computer Science 2020-06-26 Haw-minn Lu, Giancarlo Perrone, José Unpingco

Not Another Imputation Method: A Transformer-based Model for Missing Values in Tabular Datasets

Handling missing values in tabular datasets presents a significant challenge in training and testing artificial intelligence models, an issue usually addressed using imputation techniques. Here we introduce "Not Another Imputation Method"…

Machine Learning · Computer Science 2026-03-13 Camillo Maria Caruso, Paolo Soda, Valerio Guarrasi

DiffImpute: Tabular Data Imputation With Denoising Diffusion Probabilistic Model

Tabular data plays a crucial role in various domains but often suffers from missing values, thereby curtailing its potential utility. Traditional imputation techniques frequently yield suboptimal results and impose substantial computational…

Machine Learning · Computer Science 2024-03-22 Yizhu Wen, Kai Yi, Jing Ke, Yiqing Shen

DeepIFSAC: Deep Imputation of Missing Values Using Feature and Sample Attention within Contrastive Framework

Missing values of varying patterns and rates in real-world tabular data pose a significant challenge in developing reliable data-driven models. The most commonly used statistical and machine learning methods for missing value imputation may…

Machine Learning · Computer Science 2025-03-26 Ibna Kowsar, Shourav B. Rabbani, Yina Hou, Manar D. Samad

ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation

Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem attracts many studies to contribute to data-driven solutions. Existing imputation solutions mainly…

Machine Learning · Computer Science 2024-07-26 Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jian Sun

Tabular Incremental Inference

Tabular data is a fundamental form of data structure. The evolution of table analysis tools reflects humanity's continuous progress in data acquisition, management, and processing. The dynamic changes in table columns arise from…

Artificial Intelligence · Computer Science 2026-01-28 Xinda Chen, Zhen Xing, Hanyu Zhang, Weimin Tan, Bo Yan

Imputation-free Learning of Tabular Data with Missing Values using Incremental Feature Partitions in Transformer

Tabular data sets with varying missing values are prepared for machine learning using an arbitrary imputation strategy. Synthetic values generated by imputation models often raise concerns regarding data quality and the reliability of…

Machine Learning · Computer Science 2026-01-28 Manar D. Samad, Kazi Fuad B. Akhter, Shourav B. Rabbani, Ibna Kowsar

Basis Transformers for Multi-Task Tabular Regression

Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number…

Machine Learning · Computer Science 2025-06-10 Wei Min Loh, Jiaqi Shang, Pascal Poupart

LDI: Localized Data Imputation for Text-Rich Tables

Missing values are pervasive in real-world tabular data and can significantly impair downstream analysis. Imputing them is especially challenging in text-rich tables, where dependencies are implicit, complex, and dispersed across long…

Databases · Computer Science 2026-05-12 Soroush Omidvartehrani, Davood Rafiei

Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification

Imputation methods play a critical role in enhancing the quality of practical time-series data, which often suffer from pervasive missing values. Recently, diffusion-based generative imputation methods have demonstrated remarkable success…

Machine Learning · Computer Science 2025-10-03 Zeqi Ye, Minshuo Chen

TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation

Tabular data builds the basis for a wide range of applications, yet real-world datasets are frequently incomplete due to collection errors, privacy restrictions, or sensor failures. As missing values degrade the performance or hinder the…

Machine Learning · Computer Science 2025-10-02 Vincent Ochs, Florentin Bieder, Sidaty el Hadramy, Paul Friedrich, Stephanie Taha-Mehlitz, Anas Taha, Philippe C. Cattin

Adversarial Attacks on Tables with Entity Swap

The capabilities of large language models (LLMs) have been successfully applied in the context of table representation learning. The recently proposed tabular language models have reported state-of-the-art results across various tasks for…

Computation and Language · Computer Science 2023-09-19 Aneta Koleva, Martin Ringsquandl, Volker Tresp

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos, Rafael Valle