Related papers: Tabular Incremental Inference

Tabular Data Augmentation for Machine Learning: Progress and Prospects of Embracing Generative AI

Machine learning (ML) on tabular data is ubiquitous, yet obtaining abundant high-quality tabular data for model training remains a significant obstacle. Numerous works have focused on tabular data augmentation (TDA) to enhance the original…

Machine Learning · Computer Science 2024-08-01 Lingxi Cui, Huan Li, Ke Chen, Lidan Shou, Gang Chen

Towards Data-Centric AI: A Comprehensive Survey of Traditional, Reinforcement, and Generative Approaches for Tabular Data Transformation

Tabular data is one of the most widely used formats across industries, driving critical applications in areas such as finance, healthcare, and marketing. In the era of data-centric AI, improving data quality and representation has become…

Machine Learning · Computer Science 2025-01-22 Dongjie Wang, Yanyong Huang, Wangyang Ying, Haoyue Bai, Nanxu Gong, Xinyuan Wang, Sixun Dong, Tao Zhe, Kunpeng Liu, Meng Xiao, Pengfei Wang, Pengyang Wang, Hui Xiong, Yanjie Fu

TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition

Table Structure Recognition is an essential part of end-to-end tabular data extraction in document images. The recent success of deep learning model architectures in computer vision remains to be non-reflective in table structure…

Computer Vision and Pattern Recognition · Computer Science 2021-05-18 Umar Khan, Sohaib Zahid, Muhammad Asad Ali, Adnan ul Hassan, Faisal Shafait

TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation

Tabular data builds the basis for a wide range of applications, yet real-world datasets are frequently incomplete due to collection errors, privacy restrictions, or sensor failures. As missing values degrade the performance or hinder the…

Machine Learning · Computer Science 2025-10-02 Vincent Ochs, Florentin Bieder, Sidaty el Hadramy, Paul Friedrich, Stephanie Taha-Mehlitz, Anas Taha, Philippe C. Cattin

TabNet: Attentive Interpretable Tabular Learning

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and…

Machine Learning · Computer Science 2020-12-10 Sercan O. Arik, Tomas Pfister

Table2Image: Interpretable Tabular Data Classification with Realistic Image Transformations

Recent advancements in deep learning for tabular data have shown promise, but challenges remain in achieving interpretable and lightweight models. This paper introduces Table2Image, a novel framework that transforms tabular data into…

Machine Learning · Computer Science 2025-01-24 Seungeun Lee, Il-Youp Kwak, Kihwan Lee, Subin Bae, Sangjun Lee, Seulbin Lee, Seungsang Oh

Representation Learning for Tabular Data: A Comprehensive Survey

Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Models for learning from tabular data have continuously evolved, with Deep Neural Networks…

Machine Learning · Computer Science 2025-04-24 Jun-Peng Jiang, Si-Yang Liu, Hao-Run Cai, Qile Zhou, Han-Jia Ye

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

The long-standing dominance of gradient-boosted decision trees on tabular data is currently challenged by tabular foundation models using In-Context Learning (ICL): setting the training data as context for the test data and predicting in a…

Machine Learning · Computer Science 2025-05-27 Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan

Realistic Data Augmentation Framework for Enhancing Tabular Reasoning

Existing approaches to constructing training data for Natural Language Inference (NLI) tasks, such as for semi-structured table reasoning, are either via crowdsourcing or fully automatic methods. However, the former is expensive and…

Computation and Language · Computer Science 2022-10-25 Dibyakanti Kumar, Vivek Gupta, Soumya Sharma, Shuo Zhang

Leveraging Data Recasting to Enhance Tabular Reasoning

Creating challenging tabular inference data is essential for learning complex reasoning. Prior work has mostly relied on two data generation strategies. The first is human annotation, which yields linguistically diverse data but is…

Computation and Language · Computer Science 2022-11-24 Aashna Jena, Vivek Gupta, Manish Shrivastava, Julian Martin Eisenschlos

iTBLS: A Dataset of Interactive Conversations Over Tabular Information

This paper introduces Interactive Tables (iTBLS), a dataset of interactive conversations that focuses on natural-language manipulation of tabular information sourced from academic pre-prints on ArXiv. The iTBLS dataset consists of three…

Computation and Language · Computer Science 2025-08-20 Anirudh Sundar, Christopher Richardson, Adar Avsian, Larry Heck

Deep Learning with Tabular Data: A Self-supervised Approach

We have described a novel approach for training tabular data using the TabTransformer model with self-supervised learning. Traditional machine learning models for tabular data, such as GBDT, are widely used, though our paper examines…

Machine Learning · Computer Science 2024-01-30 Tirth Kiranbhai Vyas

Embeddings for Tabular Data: A Survey

Tabular data, comprising rows (samples) with the same set of columns (attributes), is one of the most widely used data types across various industries, including financial services, health care, research, retail, and logistics, to name a few.…

Machine Learning · Computer Science 2023-02-24 Rajat Singh, Srikanta Bedathur

TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting

Tabular data is prevalent in many critical domains, yet it is often challenging to acquire in large quantities. This scarcity usually results in poor performance of machine learning models on such data. Data augmentation, a common strategy…

Machine Learning · Computer Science 2024-07-30 Andrei Margeloiu, Adrián Bazaga, Nikola Simidjievski, Pietro Liò, Mateja Jamnik

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

We propose TabTransformer, a novel deep tabular data modeling architecture for supervised and semi-supervised learning. The TabTransformer is built upon self-attention based Transformers. The Transformer layers transform the embeddings of…

Machine Learning · Computer Science 2020-12-15 Xin Huang, Ashish Khetan, Milan Cvitkovic, Zohar Karnin

Stable and Interpretable Deep Learning for Tabular Data: Introducing InterpreTabNet with the Novel InterpreStability Metric

As Artificial Intelligence (AI) integrates deeper into diverse sectors, the quest for powerful models has intensified. While significant strides have been made in boosting model capabilities and their applicability across domains, a glaring…

Machine Learning · Computer Science 2023-10-05 Shiyun Wa, Xinai Lu, Minjuan Wang

A Systematic Framework for Tabular Data Disentanglement

Tabular data, widely used in various applications such as industrial control systems, finance, and supply chain, often contains complex interrelationships among its attributes. Data disentanglement seeks to transform such data into latent…

Machine Learning · Computer Science 2026-04-10 Ivan Tjuawinata, Andre Gunawan, Anh Quan Tran, Nitish Kumar, Payal Pote, Harsh Bansal, Chu-Hung Chi, Kwok-Yan Lam, Parventanis Murthy

Semantic Annotation for Tabular Data

Detecting semantic concept of columns in tabular data is of particular interest to many applications ranging from data integration, cleaning, search to feature engineering and model building in machine learning. Recently, several works have…

Artificial Intelligence · Computer Science 2020-12-17 Udayan Khurana, Sainyam Galhotra

Deep Learning within Tabular Data: Foundations, Challenges, Advances and Future Directions

Tabular data remains one of the most prevalent data types across a wide range of real-world applications, yet effective representation learning for this domain poses unique challenges due to its irregular patterns, heterogeneous feature…

Machine Learning · Computer Science 2025-01-08 Weijieying Ren, Tianxiang Zhao, Yuqing Huang, Vasant Honavar

TabDPT: Scaling Tabular Foundation Models on Real Data

Tabular data is one of the most ubiquitous sources of information worldwide, spanning a wide variety of domains. This inherent heterogeneity has slowed the development of Tabular Foundation Models (TFMs) capable of fast generalization to…

Machine Learning · Computer Science 2026-01-21 Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh, Alex Labach, Hamidreza Kamkari, Jesse C. Cresswell, Keyvan Golestan, Guangwei Yu, Anthony L. Caterini, Maksims Volkovs