English
Related papers

Related papers: Tree-Regularized Tabular Embeddings

200 papers

Despite their impressive performance, Deep Neural Networks (DNNs) typically underperform Gradient Boosting Trees (GBTs) on many tabular-dataset learning tasks. We propose that applying a different regularization coefficient to each weight…

Machine Learning · Statistics 2018-10-25 Ira Shavitt , Eran Segal

Unsupervised feature learning often finds low-dimensional embeddings that capture the structure of complex data. For tasks for which prior expert topological knowledge is available, incorporating this into the learned representation may…

Machine Learning · Computer Science 2022-03-08 Robin Vandaele , Bo Kang , Jefrey Lijffijt , Tijl De Bie , Yvan Saeys

Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous data sets, deep neural networks have repeatedly shown excellent…

Machine Learning · Computer Science 2023-01-24 Vadim Borisov , Tobias Leemann , Kathrin Seßler , Johannes Haug , Martin Pawelczyk , Gjergji Kasneci

Foundation models have established unified representations for natural language processing, yet this paradigm remains largely unexplored for tabular data. Existing methods face fundamental limitations: LLM-based approaches lack…

Computation and Language · Computer Science 2026-05-07 Minjie Qiang , Mingming Zhang , Xiaoyi Bao , Xing Fu , Yu Cheng , Weiqiang Wang , Zhongqing Wang , Ningtao Wang

While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear. We contribute extensive benchmarks of standard and novel deep learning methods as well as tree-based models such…

Machine Learning · Computer Science 2022-07-20 Léo Grinsztajn , Edouard Oyallon , Gaël Varoquaux

Deep predictive models of neuronal activity have recently enabled several new discoveries about the selectivity and invariance of neurons in the visual cortex. These models learn a shared set of nonlinear basis functions, which are linearly…

Neurons and Cognition · Quantitative Biology 2024-06-19 Polina Turishcheva , Max Burg , Fabian H. Sinz , Alexander Ecker

Embedding-based neural topic models could explicitly represent words and topics by embedding them to a homogeneous feature space, which shows higher interpretability. However, there are no explicit constraints for the training of…

Computation and Language · Computer Science 2022-06-17 Wei Shao , Lei Huang , Shuqi Liu , Shihua Ma , Linqi Song

Unsupervised representation learning methods are widely used for gaining insight into high-dimensional, unstructured, or structured data. In some cases, users may have prior topological knowledge about the data, such as a known cluster…

Machine Learning · Computer Science 2023-11-08 Edith Heiter , Robin Vandaele , Tijl De Bie , Yvan Saeys , Jefrey Lijffijt

Tabular data is prevalent across diverse domains in machine learning. With the rapid progress of deep tabular prediction methods, especially pretrained (foundation) models, there is a growing need to evaluate these methods systematically…

Machine Learning · Computer Science 2025-11-10 Han-Jia Ye , Si-Yang Liu , Hao-Run Cai , Qi-Le Zhou , De-Chuan Zhan

Merge trees are a valuable tool in the scientific visualization of scalar fields; however, current methods for merge tree comparisons are computationally expensive, primarily due to the exhaustive matching between tree nodes. To address…

Machine Learning · Computer Science 2024-10-07 Yu Qin , Brittany Terese Fasy , Carola Wenk , Brian Summa

Tabular Foundation Models have recently established the state of the art in supervised tabular learning, by leveraging pretraining to learn generalizable representations of numerical and categorical structured data. However, they lack…

Tables contain valuable knowledge in a structured form. We employ neural language modeling approaches to embed tabular data into vector spaces. Specifically, we consider different table elements, such caption, column headings, and cells,…

Information Retrieval · Computer Science 2019-06-04 Li Deng , Shuo Zhang , Krisztian Balog

Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications. Models for learning from tabular data have continuously evolved, with Deep Neural Networks…

Machine Learning · Computer Science 2025-04-24 Jun-Peng Jiang , Si-Yang Liu , Hao-Run Cai , Qile Zhou , Han-Jia Ye

Tabular foundation models aim to learn universal representations of tabular data that transfer across tasks and domains, enabling applications such as table retrieval, semantic search and table-based prediction. Despite the growing number…

Machine Learning · Computer Science 2026-04-24 Liane Vogel , Kavitha Srinivas , Niharika D'Souza , Sola Shirai , Oktie Hassanzadeh , Horst Samulowitz

Tabular data is the most commonly used form of data in industry. Gradient Boosting Trees, Support Vector Machine, Random Forest, and Logistic Regression are typically used for classification tasks on tabular data. DNN models using…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Baohua Sun , Lin Yang , Wenhan Zhang , Michael Lin , Patrick Dong , Charles Young , Jason Dong

Tabular data remain a dominant form of real-world information but pose persistent challenges for deep learning due to heterogeneous feature types, lack of natural structure, and limited label-preserving augmentations. As a result, ensemble…

Machine Learning · Computer Science 2025-09-23 Sivan Sarafian , Yehudit Aperstein

All industries are trying to leverage Artificial Intelligence (AI) based on their existing big data which is available in so called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns…

Neural networks have proved to be very robust at processing unstructured data like images, text, videos, and audio. However, it has been observed that their performance is not up to the mark in tabular data; hence tree-based models are…

Machine Learning · Computer Science 2022-04-25 Tushar Sarkar

Recent advances in machine learning have led to a surge in adoption of neural networks for various tasks, but lack of interpretability remains an issue for many others in which an understanding of the features influencing the prediction is…

Representation learning stands as one of the critical machine learning techniques across various domains. Through the acquisition of high-quality features, pre-trained embeddings significantly reduce input space redundancy, benefiting…

Machine Learning · Computer Science 2023-12-19 Suiyao Chen , Jing Wu , Naira Hovakimyan , Handong Yao
‹ Prev 1 2 3 10 Next ›