Learning Semantic String Transformations from Examples

Databases 2012-04-30 v1

Abstract

We address the problem of performing semantic transformations on strings, which may represent a variety of data types (or combinations of them), such as a column in a relational table, time, date, or currency. Unlike syntactic transformations, which are based on regular expressions and interpret a string as a sequence of characters, semantic transformations additionally require exploiting the semantics of the data type represented by the string, which may be encoded as a database of relational tables. Manually performing such transformations on a large collection of strings is error-prone and cumbersome, while programmatic solutions are beyond the skill set of end users. We present a programming-by-example technology that allows end users to automate such repetitive tasks. We describe an expressive transformation language for semantic manipulation that combines table lookup operations and syntactic manipulations. We then present a synthesis algorithm that can learn all transformations in the language that are consistent with the user-provided set of input-output examples. We have implemented this technology as an add-in for the Microsoft Excel spreadsheet system and have evaluated it successfully on several benchmarks taken from various Excel help forums.
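To make the idea of combining table lookups with syntactic manipulation concrete, the following is a minimal Python sketch. It is not the paper's transformation language or synthesis algorithm: the table `ITEMS`, its contents, and the helpers `select`, `learn_lookups`, and `lookup_then_truncate` are all hypothetical names introduced for illustration, and the "learning" step is a brute-force enumeration over column pairs rather than the algorithm the abstract refers to.

```python
from itertools import permutations

# Hypothetical relational table (illustrative data only): one dict per row.
ITEMS = [
    {"id": "S30", "name": "Stroller", "price": "$145.67"},
    {"id": "B56", "name": "Bib", "price": "$3.56"},
    {"id": "D32", "name": "Diapers", "price": "$21.45"},
]


def select(table, out_col, key_col, key):
    """Return out_col of the first row whose key_col equals key, else None."""
    for row in table:
        if row[key_col] == key:
            return row[out_col]
    return None


def learn_lookups(table, examples):
    """Enumerate every (key_col, out_col) pair consistent with all examples.

    Assumption: programs are single lookups; this is a toy search space.
    """
    columns = list(table[0].keys())
    consistent = []
    for key_col, out_col in permutations(columns, 2):
        if all(select(table, out_col, key_col, inp) == out
               for inp, out in examples):
            consistent.append((key_col, out_col))
    return consistent


def lookup_then_truncate(table, program, inp):
    """Semantic step (table lookup) followed by a syntactic step (substring)."""
    key_col, out_col = program
    value = select(table, out_col, key_col, inp)
    # Syntactic manipulation: keep only the text before the first ".".
    return None if value is None else value.split(".")[0]


# One input-output example is enough to pin down the lookup in this toy table.
programs = learn_lookups(ITEMS, [("Stroller", "$145.67")])
```

Under these assumptions, `programs` contains the single pair `("name", "price")`, and `lookup_then_truncate(ITEMS, programs[0], "Bib")` applies the learned lookup to a new input before trimming the result syntactically.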

Cite

@article{arxiv.1204.6079,
  title  = {Learning Semantic String Transformations from Examples},
  author = {Rishabh Singh and Sumit Gulwani},
  journal= {arXiv preprint arXiv:1204.6079},
  year   = {2012}
}

Comments

VLDB 2012
