Related papers: Deep Feature Embedding for Tabular Data

LLM Embeddings for Deep Learning on Tabular Data

Tabular deep-learning methods require embedding numerical and categorical input features into high-dimensional spaces before processing them. Existing methods deal with this heterogeneous nature of tabular data by employing separate…

Machine Learning · Computer Science 2025-02-18 Boshko Koloski, Andrei Margeloiu, Xiangjian Jiang, Blaž Škrlj, Nikola Simidjievski, Mateja Jamnik

On Embeddings for Numerical Features in Tabular Deep Learning

Recently, Transformer-like deep architectures have shown strong performance on tabular data problems. Unlike traditional models, e.g., MLP, these architectures map scalar values of numerical features to high-dimensional embeddings before…

Machine Learning · Computer Science 2023-10-27 Yury Gorishniy, Ivan Rubachev, Artem Babenko

Universal Embeddings of Tabular Data

Tabular data in relational databases represents a significant portion of industrial data. Hence, analyzing and interpreting tabular data is of utmost importance. Application tasks on tabular data are manifold and are often not specified…

Machine Learning · Computer Science 2025-07-09 Astrid Franz, Frederik Hoppe, Marianne Michaelis, Udo Göbel

Small Language Models for Tabular Data

Supervised deep learning is most commonly applied to difficult problems defined on large and often extensively curated datasets. Here we demonstrate the ability of deep representation learning to address problems of classification and…

Machine Learning · Computer Science 2022-11-30 Benjamin L. Badger

SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data

Tabular data is the most commonly used form of data in industry. Gradient Boosting Trees, Support Vector Machine, Random Forest, and Logistic Regression are typically used for classification tasks on tabular data. DNN models using…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Baohua Sun, Lin Yang, Wenhan Zhang, Michael Lin, Patrick Dong, Charles Young, Jason Dong

Unlocking the Transferability of Tokens in Deep Models for Tabular Data

Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks. However, such a paradigm becomes particularly challenging with tabular data when there are discrepancies between the feature…

Machine Learning · Computer Science 2023-10-24 Qi-Le Zhou, Han-Jia Ye, Le-Ye Wang, De-Chuan Zhan

Representation Learning for Tabular Data: A Comprehensive Survey

Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Models for learning from tabular data have continuously evolved, with Deep Neural Networks…

Machine Learning · Computer Science 2025-04-24 Jun-Peng Jiang, Si-Yang Liu, Hao-Run Cai, Qile Zhou, Han-Jia Ye

Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the…

Machine Learning · Computer Science 2020-06-30 Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, Jiyan Yang

Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective

Pre-training is prevalent in deep learning for vision and text data, leveraging knowledge from other datasets to enhance downstream tasks. However, for tabular data, the inherent heterogeneity in attribute and label spaces across datasets…

Machine Learning · Computer Science 2025-02-13 Han-Jia Ye, Qi-Le Zhou, Huai-Hong Yin, De-Chuan Zhan, Wei-Lun Chao

Deep Learning within Tabular Data: Foundations, Challenges, Advances and Future Directions

Tabular data remains one of the most prevalent data types across a wide range of real-world applications, yet effective representation learning for this domain poses unique challenges due to its irregular patterns, heterogeneous feature…

Machine Learning · Computer Science 2025-01-08 Weijieying Ren, Tianxiang Zhao, Yuqing Huang, Vasant Honavar

Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval

Tables contain valuable knowledge in a structured form. We employ neural language modeling approaches to embed tabular data into vector spaces. Specifically, we consider different table elements, such as caption, column headings, and cells,…

Information Retrieval · Computer Science 2019-06-04 Li Deng, Shuo Zhang, Krisztian Balog

Flexible Operator Embeddings via Deep Learning

Integrating machine learning into the internals of database management systems requires significant feature engineering, a human effort-intensive process to determine the best way to represent the pieces of information that are relevant to…

Databases · Computer Science 2019-02-04 Ryan Marcus, Olga Papaemmanouil

Graph Neural Network contextual embedding for Deep Learning on Tabular Data

All industries are trying to leverage Artificial Intelligence (AI) based on their existing big data which is available in so-called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns…

Machine Learning · Computer Science 2024-02-29 Mario Villaizán-Vallelado, Matteo Salvatori, Belén Carro Martinez, Antonio Javier Sanchez Esguevillas

Unveiling the Role of Data Uncertainty in Tabular Deep Learning

Recent advancements in tabular deep learning have demonstrated exceptional practical performance, yet the field often lacks a clear understanding of why these techniques actually succeed. To address this gap, our paper highlights the…

Machine Learning · Computer Science 2025-09-05 Nikolay Kartashev, Ivan Rubachev, Artem Babenko

Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting

Embeddings serve as condensed vector representations for real-world entities, finding applications in Natural Language Processing (NLP), Computer Vision, and Data Management across diverse downstream tasks. Here, we introduce novel…

Computation and Language · Computer Science 2025-02-25 Gyanendra Shrestha, Chutain Jiang, Sai Akula, Vivek Yannam, Anna Pyayt, Michael Gubanov

Quadruplet Selection Methods for Deep Embedding Learning

Recognition of objects with subtle differences has been used in many practical applications, such as car model recognition and maritime vessel identification. For discrimination of the objects in fine-grained detail, we focus on deep…

Computer Vision and Pattern Recognition · Computer Science 2019-07-23 Kaan Karaman, Erhan Gundogdu, Aykut Koc, A. Aydin Alatan

Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings

Research on adversarial robustness is primarily focused on image and text data. Yet, many scenarios in which lack of robustness can result in serious risks, such as fraud detection, medical diagnosis, or recommender systems, often do not…

Machine Learning · Computer Science 2023-12-14 Klim Kireev, Maksym Andriushchenko, Carmela Troncoso, Nicolas Flammarion

Tab2Visual: Overcoming Limited Data in Tabular Data Classification Using Deep Learning with Visual Representations

This research addresses the challenge of limited data in tabular data classification, particularly prevalent in domains with constraints like healthcare. We propose Tab2Visual, a novel approach that transforms heterogeneous tabular data…

Machine Learning · Computer Science 2025-02-12 Ahmed Mamdouh, Moumen El-Melegy, Samia Ali, Ron Kikinis

Deep Neural Networks and Tabular Data: A Survey

Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous data sets, deep neural networks have repeatedly shown excellent…

Machine Learning · Computer Science 2023-01-24 Vadim Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, Gjergji Kasneci

Learning to Embed Categorical Features without Embedding Tables for Recommendation

Embedding learning of categorical features (e.g. user/item IDs) is at the core of various recommendation models including matrix factorization and neural collaborative filtering. The standard approach creates an embedding table where each…

Machine Learning · Computer Science 2021-06-08 Wang-Cheng Kang, Derek Zhiyuan Cheng, Tiansheng Yao, Xinyang Yi, Ting Chen, Lichan Hong, Ed H. Chi