Related papers: HYTREL: Hypergraph-enhanced Tabular Data Represent…

200 papers

Handling heterogeneous data in tabular datasets poses a significant challenge for deep learning models. While attention-based architectures and self-supervised learning have achieved notable success, their application to tabular data…

Machine Learning · Computer Science 2025-02-27 Anay Majee , Maria Xenochristou , Wei-Peng Chen

Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number…

Machine Learning · Computer Science 2025-06-10 Wei Min Loh , Jiaqi Shang , Pascal Poupart

Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, there has been tremendous progress on a variety of tasks in the area of table understanding. However, existing work generally relies on…

Information Retrieval · Computer Science 2020-12-04 Xiang Deng , Huan Sun , Alyssa Lees , You Wu , Cong Yu

Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Models for learning from tabular data have continuously evolved, with Deep Neural Networks…

Machine Learning · Computer Science 2025-04-24 Jun-Peng Jiang , Si-Yang Liu , Hao-Run Cai , Qile Zhou , Han-Jia Ye

Tabular data remains one of the most prevalent data types across a wide range of real-world applications, yet effective representation learning for this domain poses unique challenges due to its irregular patterns, heterogeneous feature…

Machine Learning · Computer Science 2025-01-08 Weijieying Ren , Tianxiang Zhao , Yuqing Huang , Vasant Honavar

Existing work on tabular representation learning jointly models tables and associated text using self-supervised objective functions derived from pretrained language models such as BERT. While this joint pretraining improves tasks involving…

Computation and Language · Computer Science 2021-05-07 Hiroshi Iida , Dung Thai , Varun Manjunatha , Mohit Iyyer

Diffusion models have been the predominant generative model for tabular data generation. However, they face the conundrum of modeling under a separate versus a unified data representation. The former encounters the challenge of jointly…

Machine Learning · Computer Science 2025-12-23 Jacob Si , Zijing Ou , Mike Qu , Zhengrui Xiang , Yingzhen Li

Deep learning has achieved impressive performance in many domains, such as computer vision and natural language processing, but its advantage over classical shallow methods on tabular datasets remains questionable. It is especially…

Machine Learning · Computer Science 2023-08-25 Witold Wydmański , Oleksii Bulenok , Marek Śmieja

Inductive representation learning on temporal heterogeneous graphs is crucial for scalable deep learning on heterogeneous information networks (HINs) which are time-varying, such as citation networks. However, most existing approaches are…

Machine Learning · Computer Science 2024-05-15 Chenglin Li , Yuanzhen Xie , Chenyun Yu , Lei Cheng , Bo Hu , Zang Li , Di Niu

Clustering complex data in the form of attributed graphs has attracted increasing attention, where powerful graph representation is a critical prerequisite. However, the well-known Over-Smoothing (OS) effect makes Graph Convolutional…

Machine Learning · Computer Science 2026-03-17 Junyang Chen , Yang Lu , Mengke Li , Cuie Yang , Yiqun Zhang , Yiu-ming Cheung

Extractive summarization for long documents is challenging due to the extended structured input context. The long-distance sentence dependency hinders cross-sentence relations modeling, the critical step of extractive summarization. This…

Computation and Language · Computer Science 2022-10-11 Haopeng Zhang , Xiao Liu , Jiawei Zhang

Tabular data is prevalent across diverse domains in machine learning. With the rapid progress of deep tabular prediction methods, especially pretrained (foundation) models, there is a growing need to evaluate these methods systematically…

Machine Learning · Computer Science 2025-11-10 Han-Jia Ye , Si-Yang Liu , Hao-Run Cai , Qi-Le Zhou , De-Chuan Zhan

Pretrained deep-learning models are the go-to solution for images or text. However, for tabular data the standard is still to train tree-based models. Indeed, transfer learning on tables hits the challenge of data integration: finding…

Machine Learning · Computer Science 2024-06-03 Myung Jun Kim , Léo Grinsztajn , Gaël Varoquaux

Transformers have shown impressive results in tabular data generation. However, they lack domain-specific inductive biases which are critical for preserving the intrinsic characteristics of tabular data. They also suffer from poor…

Machine Learning · Computer Science 2025-05-19 Jiayu Li , Bingyin Zhao , Zilong Zhao , Uzair Javaid , Kevin Yee , Biplab Sikdar

Recent studies have demonstrated the overwhelming advantage of cross-lingual pre-trained models (PTMs), such as multilingual BERT and XLM, on cross-lingual NLP tasks. However, existing approaches essentially capture the co-occurrence among…

Computation and Language · Computer Science 2021-03-23 Xiangpeng Wei , Rongxiang Weng , Yue Hu , Luxi Xing , Heng Yu , Weihua Luo

Information extraction from semi-structured webpages provides valuable long-tailed facts for augmenting knowledge graph. Relational Web tables are a critical component containing additional entities and attributes of rich and diverse…

Information Retrieval · Computer Science 2021-02-19 Daheng Wang , Prashant Shiralkar , Colin Lockard , Binxuan Huang , Xin Luna Dong , Meng Jiang

Hierarchical neural architectures are often used to capture long-distance dependencies and have been applied to many document-level tasks such as summarization, document segmentation, and sentiment analysis. However, effective usage of such…

Computation and Language · Computer Science 2019-01-29 Ming-Wei Chang , Kristina Toutanova , Kenton Lee , Jacob Devlin

Graph representation learning (GRL) has emerged as an effective technique for modeling graph-structured data. When modeling heterogeneity and dynamics in real-world complex networks, GRL methods designed for complex heterogeneous temporal…

Social and Information Networks · Computer Science 2026-05-19 Huan Liu , Pengfei Jiao , Mengzhou Gao , Chaochao Chen , Di Jin

Tabular data underpins decisions across science, industry, and public services. Despite rapid progress, advances in deep learning have not fully carried over to the tabular domain, where gradient-boosted decision trees (GBDTs) remain a…

Machine Learning · Computer Science 2025-11-21 David Bonet , Marçal Comajoan Cara , Alvaro Calafell , Daniel Mas Montserrat , Alexander G. Ioannidis

There is a recent growing interest in applying Deep Learning techniques to tabular data, in order to replicate the success of other Artificial Intelligence areas in this structured domain. Specifically interesting is the case in which…

Machine Learning · Computer Science 2025-05-06 Simone Luetto , Fabrizio Garuti , Enver Sangineto , Lorenzo Forni , Rita Cucchiara