Automating Date Format Detection for Data Visualization

Computation and Language 2025-01-13 v1

Abstract

Data preparation, and date parsing in particular, is a significant bottleneck in analytic workflows. To address this, we present two algorithms, one based on minimum entropy and the other on natural language modeling, that automatically derive date formats from string data. These algorithms achieve over 90% accuracy on a large corpus of data columns, streamlining data preparation within visualization environments. The minimum entropy approach is particularly fast and can provide interactive feedback. Our methods simplify date format extraction, making them suitable for integration into data visualization tools and databases.
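
To make the task concrete, the following is a minimal sketch of what "deriving a date format from string data" means. It is not the paper's minimum entropy or language modeling algorithm: it swaps in a much simpler brute-force vote over a small, hypothetical list of candidate `strptime` patterns, and the names `CANDIDATE_FORMATS`, `parse_rate`, and `guess_format` are assumptions introduced only for illustration.

```python
from datetime import datetime

# Hypothetical candidate list (an assumption for this sketch);
# a real tool would need to cover far more patterns.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"]


def parse_rate(values, fmt):
    """Return the fraction of values that datetime.strptime accepts under fmt."""
    if not values:
        return 0.0
    ok = 0
    for value in values:
        try:
            datetime.strptime(value.strip(), fmt)
            ok += 1
        except ValueError:
            # The value does not match this candidate format.
            pass
    return ok / len(values)


def guess_format(values, candidates=CANDIDATE_FORMATS):
    """Return (format, rate) for the candidate that parses the most values.

    Ties go to the candidate listed first, because max() keeps the
    first maximal element it encounters.
    """
    scored = [(fmt, parse_rate(values, fmt)) for fmt in candidates]
    return max(scored, key=lambda pair: pair[1])


column = ["31/01/2024", "05/12/2023"]
best_format, rate = guess_format(column)
```

In this example only the day-first pattern parses both values, since `31` cannot be a month. A column such as `["05/12/2023"]` alone would be ambiguous between day-first and month-first, which is the kind of case that motivates a more principled scoring method than simple voting.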

Cite

@article{arxiv.2501.05640,
  title  = {Automating Date Format Detection for Data Visualization},
  author = {Zixuan Liang},
  journal= {arXiv preprint arXiv:2501.05640},
  year   = {2025}
}

Comments

2025 International Conference on Advanced Machine Learning and Data Science (AMLDS 2025)
