Assessing Data Efficiency in Task-Oriented Semantic Parsing

Computation and Language 2021-07-13 v1

Abstract

Data efficiency, though an attractive characteristic, is often challenging to measure and optimize for in task-oriented semantic parsing: unlike exact match, it can require both model- and domain-specific setups, which have historically varied widely across experiments. As a step towards a unified solution to data-efficiency questions, we introduce a four-stage protocol that gives an approximate measure of how much in-domain "target" data a parser requires to achieve a certain quality bar. Specifically, our protocol consists of (1) sampling target subsets of different cardinalities, (2) fine-tuning parsers on each subset, (3) obtaining a smooth curve relating target subset (%) to exact match (%), and (4) referencing the curve to mine ad-hoc (target subset, exact match) points. We apply our protocol in two real-world case studies, model generalizability and intent complexity, illustrating its flexibility and applicability to practitioners in task-oriented semantic parsing.
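
The four stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name is hypothetical, `train_and_eval` is a placeholder for fine-tuning a parser on a subset and returning its exact match (%), and the "smooth curve" is approximated with piecewise-linear interpolation because the abstract does not say which smoother is used.

```python
import random
from bisect import bisect_left
from typing import Callable, List, Optional, Sequence, Tuple


def sample_subsets(target: Sequence, percents: Sequence[float], seed: int = 0):
    """Stage 1: draw one random subset of the target data per percentage."""
    rng = random.Random(seed)
    pool = list(target)
    subsets = []
    for pct in percents:
        k = max(1, round(len(pool) * pct / 100))
        subsets.append((pct, rng.sample(pool, k)))
    return subsets


def measure_points(subsets, train_and_eval: Callable[[list], float]):
    """Stage 2: fine-tune on each subset and record (percent, exact match)."""
    return [(pct, train_and_eval(subset)) for pct, subset in subsets]


def fit_curve(points: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Stage 3: relate target subset (%) to exact match (%).

    Assumption: linear interpolation between measured points, clamped at
    both ends. It assumes each percentage was measured once.
    """
    pts = sorted(points)
    xs = [pct for pct, _ in pts]
    ys = [em for _, em in pts]

    def curve(pct: float) -> float:
        if pct <= xs[0]:
            return ys[0]
        if pct >= xs[-1]:
            return ys[-1]
        i = bisect_left(xs, pct)  # xs[i - 1] < pct <= xs[i]
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (pct - x0) / (x1 - x0)

    return curve


def percent_needed(curve, target_em: float, step: float = 0.5) -> Optional[float]:
    """Stage 4: smallest grid percentage whose predicted exact match meets the bar."""
    pct = step
    while pct <= 100:
        if curve(pct) >= target_em:
            return pct
        pct += step
    return None  # the quality bar is never reached on this curve
```

With a real parser, `train_and_eval` would wrap the fine-tuning and evaluation loop; `percent_needed(curve, 85.0)` would then estimate how much target data is required to reach 85% exact match, under the interpolation assumption above.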

Cite

@article{arxiv.2107.04736,
  title  = {Assessing Data Efficiency in Task-Oriented Semantic Parsing},
  author = {Shrey Desai and Akshat Shrivastava and Justin Rill and Brian Moran and Safiyyah Saleem and Alexander Zotov and Ahmed Aly},
  journal= {arXiv preprint arXiv:2107.04736},
  year   = {2021}
}