Improving Deep Tabular Learning

Machine Learning 2025-09-23 v1

Abstract

Tabular data remain a dominant form of real-world information but pose persistent challenges for deep learning: feature types are heterogeneous, the data lack natural structure, and few augmentations preserve labels. As a result, ensemble models based on decision trees continue to dominate benchmark leaderboards. In this work, we introduce RuleNet, a transformer-based architecture designed specifically for deep tabular learning. RuleNet combines learnable rule embeddings in a decoder, a piecewise linear quantile projection for numerical features, and feature masking ensembles for robustness and uncertainty estimation. Evaluated on eight benchmark datasets, RuleNet matches or surpasses state-of-the-art tree-based methods in most cases while remaining computationally efficient. It therefore offers a practical neural alternative for tabular prediction tasks.
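The abstract names a piecewise linear quantile projection for numerical features but does not give its formulation. As an illustration only, the following is a minimal sketch of one common reading of that idea: bin edges are placed at empirical quantiles of the training values, and each value is encoded by how far it has progressed through each bin. The function names `quantile_bin_edges` and `piecewise_linear_encode`, the bin count, and the encoding itself are assumptions for this sketch, not RuleNet's actual implementation.

```python
import numpy as np


def quantile_bin_edges(x_train, n_bins):
    """Return strictly increasing bin edges at evenly spaced quantiles.

    Hypothetical helper: duplicate edges (from repeated values) are dropped,
    so the result may contain fewer than n_bins + 1 edges.
    """
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    return np.unique(np.quantile(np.asarray(x_train, dtype=float), qs))


def piecewise_linear_encode(x, edges):
    """Encode each value as its fractional progress through every bin.

    For a bin [lo, hi], the component is 0 below lo, 1 above hi,
    and (x - lo) / (hi - lo) inside the bin.
    """
    x = np.asarray(x, dtype=float)[:, None]   # shape (n, 1)
    lo = edges[:-1][None, :]                  # shape (1, k)
    hi = edges[1:][None, :]                   # shape (1, k)
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)


# Usage: four quantile bins fitted on the integers 0..100.
edges = quantile_bin_edges(np.arange(101), n_bins=4)   # [0, 25, 50, 75, 100]
encoded = piecewise_linear_encode([0.0, 37.5, 100.0], edges)
# The value 37.5 fills the first bin and half of the second: [1, 0.5, 0, 0]
```

In a model, an encoding like this would typically feed a learned linear layer, so that each numerical feature receives a projection that is piecewise linear in the raw value; whether RuleNet does exactly this is not stated in the abstract.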

Cite

@article{arxiv.2509.16354,
  title  = {Improving Deep Tabular Learning},
  author = {Sivan Sarafian and Yehudit Aperstein},
  journal = {arXiv preprint arXiv:2509.16354},
  year   = {2025}
}

Comments

18 pages, 4 figures