Related papers: Learning Semantic Annotations for Tabular Data

ColNet: Embedding the Semantics of Web Tables for Column Type Prediction

Automatically annotating column types with knowledge base (KB) concepts is a critical task to gain a basic understanding of web tables. Current methods rely on either table metadata like column name or entity correspondences of cells in the…

Computation and Language · Computer Science 2018-11-15 Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Charles Sutton

Semantic Annotation for Tabular Data

Detecting semantic concept of columns in tabular data is of particular interest to many applications ranging from data integration, cleaning, search to feature engineering and model building in machine learning. Recently, several works have…

Artificial Intelligence · Computer Science 2020-12-17 Udayan Khurana, Sainyam Galhotra

Graph Neural Network Approach to Semantic Type Detection in Tables

This study addresses the challenge of detecting semantic column types in relational tables, a key task in many real-world applications. While language models like BERT have improved prediction accuracy, their token input constraints limit…

Machine Learning · Computer Science 2024-05-02 Ehsan Hoseinzade, Ke Wang

Semantic Labeling Using a Deep Contextualized Language Model

Generating schema labels automatically for column values of data tables has many data science applications such as schema matching, and data discovery and linking. For example, automatically extracted tables with missing headers can be…

Machine Learning · Computer Science 2020-11-02 Mohamed Trabelsi, Jin Cao, Jeff Heflin

Sato: Contextual Semantic Type Detection in Tables

Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search. However, existing…

Databases · Computer Science 2020-06-04 Dan Zhang, Yoshihiko Suhara, Jinfeng Li, Madelon Hulsebos, Çağatay Demiralp, Wang-Chiew Tan

KGLink: A column type annotation method that combines knowledge graph and pre-trained language model

The semantic annotation of tabular data plays a crucial role in various downstream tasks. Previous research has proposed knowledge graph (KG)-based and deep learning-based methods, each with its inherent limitations. KG-based methods…

Machine Learning · Computer Science 2024-06-04 Yubo Wang, Hao Xin, Lei Chen

TabNet: Attentive Interpretable Tabular Learning

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and…

Machine Learning · Computer Science 2020-12-10 Sercan O. Arik, Tomas Pfister

Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks

A character-level convolutional neural network (CNN) motivated by applications in "automated machine learning" (AutoML) is proposed to semantically classify columns in tabular data. Simulated data containing a set of base classes is first…

Computation and Language · Computer Science 2019-01-25 Paul Azunre, Craig Corcoran, Numa Dhamani, Jeffrey Gleason, Garrett Honke, David Sullivan, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

TCN: Table Convolutional Network for Web Table Interpretation

Information extraction from semi-structured webpages provides valuable long-tailed facts for augmenting knowledge graph. Relational Web tables are a critical component containing additional entities and attributes of rich and diverse…

Information Retrieval · Computer Science 2021-02-19 Daheng Wang, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Xin Luna Dong, Meng Jiang

TabEmb: Joint Semantic-Structure Embedding for Table Annotation

Table annotation is crucial for making web and enterprise tables usable in downstream NLP applications. Unlike textual data where learning semantically rich token or sentence embeddings often suffice, tables are structured combinations of…

Machine Learning · Computer Science 2026-04-22 Ehsan Hoseinzade, Ke Wang, Anandharaju Durai Raju

Making Pre-trained Language Models Great on Tabular Prediction

The transferability of deep neural networks (DNNs) has made significant progress in image and language processing. However, due to the heterogeneity among tables, such DNN bonus is still far from being well exploited on tabular data…

Computation and Language · Computer Science 2024-03-13 Jiahuan Yan, Bo Zheng, Hongxia Xu, Yiheng Zhu, Danny Z. Chen, Jimeng Sun, Jian Wu, Jintai Chen

Representation Learning for Tabular Data: A Comprehensive Survey

Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Models for learning from tabular data have continuously evolved, with Deep Neural Networks…

Machine Learning · Computer Science 2025-04-24 Jun-Peng Jiang, Si-Yang Liu, Hao-Run Cai, Qile Zhou, Han-Jia Ye

Towards Interpretable Deep Neural Networks for Tabular Data

Tabular data is the foundation of many applications in fields such as finance and healthcare. Although DNNs tailored for tabular data achieve competitive predictive performance, they are blackboxes with little interpretability. We introduce…

Machine Learning · Computer Science 2026-03-27 Khawla Elhadri, Jörg Schlötterer, Christin Seifert

AdaTyper: Adaptive Semantic Column Type Detection

Understanding the semantics of relational tables is instrumental for automation in data exploration and preparation systems. A key source for understanding a table is the semantics of its columns. With the rise of deep learning, learned…

Databases · Computer Science 2023-11-27 Madelon Hulsebos, Paul Groth, Çağatay Demiralp

Learning New Facts From Knowledge Bases With Neural Tensor Networks and Semantic Word Vectors

Knowledge bases provide applications with the benefit of easily accessible, systematic relational knowledge but often suffer in practice from their incompleteness and lack of knowledge of new entities and relations. Much work has focused on…

Computation and Language · Computer Science 2013-03-19 Danqi Chen, Richard Socher, Christopher D. Manning, Andrew Y. Ng

Making Table Understanding Work in Practice

Understanding the semantics of tables at scale is crucial for tasks like data integration, preparation, and search. Table understanding methods aim at detecting a table's topic, semantic column types, column relations, or entities. With the…

Databases · Computer Science 2021-09-14 Madelon Hulsebos, Sneha Gathani, James Gale, Isil Dillig, Paul Groth, Çağatay Demiralp

Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

This paper describes an abstractive summarization method for tabular data which employs a knowledge base semantic embedding to generate the summary. Assuming the dataset contains descriptive text in headers, columns and/or some augmenting…

Artificial Intelligence · Computer Science 2018-04-06 Paul Azunre, Craig Corcoran, David Sullivan, Garrett Honke, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data

Tabular data are ubiquitous for the widespread applications of tables and hence have attracted the attention of researchers to extract underlying information. One of the critical problems in mining tabular data is how to understand their…

Machine Learning · Computer Science 2021-06-17 Lun Du, Fei Gao, Xu Chen, Ran Jia, Junshan Wang, Jiang Zhang, Shi Han, Dongmei Zhang

Taxonomy Inference for Tabular Data Using Large Language Models

Taxonomy inference for tabular data is a critical task of schema inference, aiming at discovering entity types (i.e., concepts) of the tables and building their hierarchy. It can play an important role in data management, data exploration,…

Databases · Computer Science 2025-03-31 Zhenyu Wu, Jiaoyan Chen, Norman W. Paton

Graph Neural Network contextual embedding for Deep Learning on Tabular Data

All industries are trying to leverage Artificial Intelligence (AI) based on their existing big data which is available in so called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns…

Machine Learning · Computer Science 2024-02-29 Mario Villaizán-Vallelado, Matteo Salvatori, Belén Carro Martinez, Antonio Javier Sanchez Esguevillas