Related papers: Tabular Learning: Encoding for Entity and Context …

LLM Embeddings for Deep Learning on Tabular Data

Tabular deep-learning methods require embedding numerical and categorical input features into high-dimensional spaces before processing them. Existing methods deal with this heterogeneous nature of tabular data by employing separate…

Machine Learning · Computer Science 2025-02-18 Boshko Koloski, Andrei Margeloiu, Xiangjian Jiang, Blaž Škrlj, Nikola Simidjievski, Mateja Jamnik

Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders

Named entity recognition and relation extraction are two important fundamental problems. Joint learning algorithms have been proposed to solve both tasks simultaneously, and many of them cast the joint task as a table-filling problem.…

Computation and Language · Computer Science 2020-10-09 Jue Wang, Wei Lu

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

We propose TabTransformer, a novel deep tabular data modeling architecture for supervised and semi-supervised learning. The TabTransformer is built upon self-attention based Transformers. The Transformer layers transform the embeddings of…

Machine Learning · Computer Science 2020-12-15 Xin Huang, Ashish Khetan, Milan Cvitkovic, Zohar Karnin

Learning Causal Orderings for In-Context Tabular Prediction

In-context learning for tabular data sets strong predictive standards in observational settings; it however primarily relies on correlational structure, which becomes unreliable under distribution shift or intervention. While established…

Machine Learning · Computer Science 2026-05-22 Sascha Xu, Sarah Mameche, Jilles Vreeken

Encoders and Ensembles for Task-Free Continual Learning

We present an architecture that is effective for continual learning in an especially demanding setting, where task boundaries do not exist or are unknown, and where classes have to be learned online (with each example presented only once).…

Machine Learning · Computer Science 2021-10-08 Murray Shanahan, Christos Kaplanis, Jovana Mitrović

Universal Sentence Encoder

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the…

Computation and Language · Computer Science 2018-04-13 Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

Effective Use of Transformer Networks for Entity Tracking

Tracking entities in procedural language requires understanding the transformations arising from actions on entities as well as those entities' interactions. While self-attention-based pre-trained language encoders like GPT and BERT have…

Computation and Language · Computer Science 2019-09-09 Aditya Gupta, Greg Durrett

Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval

Tables contain valuable knowledge in a structured form. We employ neural language modeling approaches to embed tabular data into vector spaces. Specifically, we consider different table elements, such as caption, column headings, and cells,…

Information Retrieval · Computer Science 2019-06-04 Li Deng, Shuo Zhang, Krisztian Balog

Learning Multi-Relational Semantics Using Neural-Embedding Models

In this paper we present a unified framework for modeling multi-relational representations, scoring, and learning, and conduct an empirical study of several recent multi-relational embedding models under the framework. We investigate the…

Computation and Language · Computer Science 2014-11-18 Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, Li Deng

TableFormer: Robust Transformer Modeling for Table-Text Encoding

Understanding tables is an important aspect of natural language understanding. Existing models for table understanding require linearization of the table structure, where row or column order is encoded as an unwanted bias. Such spurious…

Computation and Language · Computer Science 2022-05-04 Jingfeng Yang, Aditya Gupta, Shyam Upadhyay, Luheng He, Rahul Goel, Shachi Paul

Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective

Pre-training is prevalent in deep learning for vision and text data, leveraging knowledge from other datasets to enhance downstream tasks. However, for tabular data, the inherent heterogeneity in attribute and label spaces across datasets…

Machine Learning · Computer Science 2025-02-13 Han-Jia Ye, Qi-Le Zhou, Huai-Hong Yin, De-Chuan Zhan, Wei-Lun Chao

Tree-Regularized Tabular Embeddings

Tabular neural networks (NNs) have attracted remarkable attention, and their recent advances have gradually narrowed the performance gap with respect to tree-based models on many public datasets. While the mainstream focuses on calibrating NNs to…

Machine Learning · Computer Science 2024-03-05 Xuan Li, Yun Wang, Bo Li

The Impact of Positional Encodings on Multilingual Compression

In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, by (for instance) adding positional encodings to token embeddings. Several modifications have been…

Computation and Language · Computer Science 2021-09-14 Vinit Ravishankar, Anders Søgaard

CARTE: Pretraining and Transfer for Tabular Learning

Pretrained deep-learning models are the go-to solution for images or text. However, for tabular data the standard is still to train tree-based models. Indeed, transfer learning on tables hits the challenge of data integration: finding…

Machine Learning · Computer Science 2024-06-03 Myung Jun Kim, Léo Grinsztajn, Gaël Varoquaux

Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

The ability of deep networks to learn superior representations hinges on leveraging the proper inductive biases, considering the inherent properties of datasets. In tabular domains, it is critical to effectively handle heterogeneous…

Machine Learning · Computer Science 2024-05-15 Kyungeun Lee, Ye Seul Sim, Hye-Seung Cho, Moonjung Eo, Suhee Yoon, Sanghyu Yoon, Woohyung Lim

An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic

Automated theorem proving in first-order logic is an active research area which is successfully supported by machine learning. While there have been various proposals for encoding logical formulas into numerical vectors -- from simple…

Artificial Intelligence · Computer Science 2020-03-17 Ibrahim Abdelaziz, Veronika Thost, Maxwell Crouse, Achille Fokoue

Early Stopping Tabular In-Context Learning

Tabular foundation models have shown strong performance across various tabular learning tasks via in-context learning, offering robust generalization without any downstream finetuning. However, their inference-time costs remain high,…

Machine Learning · Computer Science 2025-07-01 Jaris Küken, Lennart Purucker, Frank Hutter

Task-Specific Embeddings for Ante-Hoc Explainable Text Classification

Current state-of-the-art approaches to text classification typically leverage BERT-style Transformer models with a softmax classifier, jointly fine-tuned to predict class labels of a target task. In this paper, we instead propose an…

Computation and Language · Computer Science 2022-12-02 Kishaloy Halder, Josip Krapac, Alan Akbik, Anthony Brew, Matti Lyra

TabEmb: Joint Semantic-Structure Embedding for Table Annotation

Table annotation is crucial for making web and enterprise tables usable in downstream NLP applications. Unlike textual data, where learning semantically rich token or sentence embeddings often suffices, tables are structured combinations of…

Machine Learning · Computer Science 2026-04-22 Ehsan Hoseinzade, Ke Wang, Anandharaju Durai Raju

Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models

Multi-encoder models are a broad family of context-aware neural machine translation systems that aim to improve translation quality by encoding document-level contextual information alongside the current sentence. The context encoding is…

Computation and Language · Computer Science 2022-10-25 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier