Related papers: TabTransformer: Tabular Data Modeling Using Contex…

Deep Learning with Tabular Data: A Self-supervised Approach

We have described a novel approach for training on tabular data using the TabTransformer model with self-supervised learning. Traditional machine learning models for tabular data, such as GBDT, are widely used, though our paper examines…

Machine Learning · Computer Science 2024-01-30 Tirth Kiranbhai Vyas

TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting

Tabular data is prevalent in many critical domains, yet it is often challenging to acquire in large quantities. This scarcity usually results in poor performance of machine learning models on such data. Data augmentation, a common strategy…

Machine Learning · Computer Science 2024-07-30 Andrei Margeloiu, Adrián Bazaga, Nikola Simidjievski, Pietro Liò, Mateja Jamnik

TabSTAR: A Tabular Foundation Model for Tabular Data with Text Fields

While deep learning has achieved remarkable success across many domains, it has historically underperformed on tabular learning tasks, which remain dominated by gradient boosting decision trees. However, recent advancements are paving the…

Machine Learning · Computer Science 2025-10-31 Alan Arazi, Eilam Shapira, Roi Reichart

TabRet: Pre-training Transformer-based Tabular Models for Unseen Columns

We present TabRet, a pre-trainable Transformer-based model for tabular data. TabRet is designed to work on a downstream task that contains columns not seen in pre-training. Unlike other methods, TabRet has an extra learning step…

Machine Learning · Computer Science 2023-04-18 Soma Onishi, Kenta Oono, Kohei Hayashi

TabNet: Attentive Interpretable Tabular Learning

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and…

Machine Learning · Computer Science 2020-12-10 Sercan O. Arik, Tomas Pfister

TabGLM: Tabular Graph Language Model for Learning Transferable Representations Through Multi-Modal Consistency Minimization

Handling heterogeneous data in tabular datasets poses a significant challenge for deep learning models. While attention-based architectures and self-supervised learning have achieved notable success, their application to tabular data…

Machine Learning · Computer Science 2025-02-27 Anay Majee, Maria Xenochristou, Wei-Peng Chen

The GatedTabTransformer. An enhanced deep learning architecture for tabular modeling

There is an increasing interest in the application of deep learning architectures to tabular data. One of the state-of-the-art solutions is TabTransformer which incorporates an attention mechanism to better track relationships between…

Machine Learning · Computer Science 2022-01-04 Radostin Cholakov, Todor Kolev

Transformers with Stochastic Competition for Tabular Data Modelling

Despite the prevalence and significance of tabular data across numerous industries and fields, it has been relatively underexplored in the realm of deep learning. Even today, neural networks are often overshadowed by techniques such as…

Machine Learning · Computer Science 2024-07-19 Andreas Voskou, Charalambos Christoforou, Sotirios Chatzis

Unlocking the Transferability of Tokens in Deep Models for Tabular Data

Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks. However, such a paradigm becomes particularly challenging with tabular data when there are discrepancies between the feature…

Machine Learning · Computer Science 2023-10-24 Qi-Le Zhou, Han-Jia Ye, Le-Ye Wang, De-Chuan Zhan

TabTreeFormer: Tabular Data Generation Using Hybrid Tree-Transformer

Transformers have shown impressive results in tabular data generation. However, they lack domain-specific inductive biases which are critical for preserving the intrinsic characteristics of tabular data. They also suffer from poor…

Machine Learning · Computer Science 2025-05-19 Jiayu Li, Bingyin Zhao, Zilong Zhao, Uzair Javaid, Kevin Yee, Biplab Sikdar

In-context Learning of Evolving Data Streams with Tabular Foundational Models

State-of-the-art data stream mining has long drawn from ensembles of the Very Fast Decision Tree, a seminal algorithm honored with the 2015 KDD Test-of-Time Award. However, the emergence of large tabular models, i.e., transformers designed…

Machine Learning · Computer Science 2025-12-16 Afonso Lourenço, João Gama, Eric P. Xing, Goreti Marreiros

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

Self-supervised representation learning methods have achieved significant success in computer vision and natural language processing, where data samples exhibit explicit spatial or semantic dependencies. However, applying these methods to…

Machine Learning · Computer Science 2024-01-05 Jing Wu, Suiyao Chen, Qi Zhao, Renat Sergazinov, Chen Li, Shengjie Liu, Chongchao Zhao, Tianpei Xie, Hanqing Guo, Cheng Ji, Daniel Cociorva, Hakan Brunzel

TabText: Language-Based Representations of Tabular Health Data for Predictive Modelling

Tabular medical records remain the most readily available data format for applying machine learning in healthcare. However, traditional data preprocessing ignores valuable contextual information in tables and requires substantial manual…

Machine Learning · Computer Science 2025-09-30 Kimberly Villalobos Carballo, Liangyuan Na, Yu Ma, Léonard Boussioux, Cynthia Zeng, Luis R. Soenksen, Dimitris Bertsimas

Transfer Learning with Deep Tabular Models

Recent work on deep learning for tabular data demonstrates the strong performance of deep tabular models, often bridging the gap between gradient boosted decision trees and neural networks. Accuracy aside, a major advantage of neural models…

Machine Learning · Computer Science 2023-08-08 Roman Levin, Valeriia Cherepanova, Avi Schwarzschild, Arpit Bansal, C. Bayan Bruss, Tom Goldstein, Andrew Gordon Wilson, Micah Goldblum

Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective

Pre-training is prevalent in deep learning for vision and text data, leveraging knowledge from other datasets to enhance downstream tasks. However, for tabular data, the inherent heterogeneity in attribute and label spaces across datasets…

Machine Learning · Computer Science 2025-02-13 Han-Jia Ye, Qi-Le Zhou, Huai-Hong Yin, De-Chuan Zhan, Wei-Lun Chao

Attention Augmented Convolutional Transformer for Tabular Time-series

Time-series classification is one of the most frequently performed tasks in industrial data science, and one of the most widely used data representations in the industrial setting is the tabular representation. In this work, we propose a novel…

Machine Learning · Computer Science 2021-10-06 Sharath M Shankaranarayana, Davor Runje

TransTab: Learning Transferable Tabular Transformers Across Tables

Tabular data (or tables) are the most widely used data format in machine learning (ML). However, ML models often assume the table structure stays fixed between training and testing. Before ML modeling, heavy data cleaning is required to merge…

Machine Learning · Computer Science 2022-09-19 Zifeng Wang, Jimeng Sun

ReConTab: Regularized Contrastive Representation Learning for Tabular Data

Representation learning stands as one of the critical machine learning techniques across various domains. Through the acquisition of high-quality features, pre-trained embeddings significantly reduce input space redundancy, benefiting…

Machine Learning · Computer Science 2023-12-19 Suiyao Chen, Jing Wu, Naira Hovakimyan, Handong Yao

TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling

Deep learning architectures for supervised learning on tabular data range from simple multilayer perceptrons (MLP) to sophisticated Transformers and retrieval-augmented methods. This study highlights a major, yet so far overlooked…

Machine Learning · Computer Science 2025-02-19 Yury Gorishniy, Akim Kotelnikov, Artem Babenko

P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data

Medical tabular data, abundant in Electronic Health Records (EHRs), is a valuable resource for diverse medical tasks such as risk prediction. While deep learning approaches, particularly transformer-based models, have shown remarkable…

Computation and Language · Computer Science 2025-04-11 Yucheng Ruan, Xiang Lan, Daniel J. Tan, Hairil Rizal Abdullah, Mengling Feng