A Non-Linear Structural Probe

Jennifer C. White; Tiago Pimentel; Naomi Saphra; Ryan Cotterell

Computation and Language; Machine Learning | 2021-05-24 (v1)

Abstract

Probes are models devised to investigate the encoding of knowledge (e.g., syntactic structure) in contextual representations. Probes are often designed for simplicity, which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information; one such restriction is linearity. We examine the case of a structural probe (Hewitt and Manning, 2019), which aims to investigate the encoding of syntactic structure in contextual representations through learning only linear transformations. By observing that the structural probe learns a metric, we are able to kernelize it and develop a novel non-linear variant with an identical number of parameters. We test on six languages and find that the radial-basis function (RBF) kernel, in conjunction with regularization, achieves a statistically significant improvement over the baseline in all languages, implying that at least part of the syntactic knowledge is encoded non-linearly. We conclude by discussing how the RBF kernel resembles BERT's self-attention layers and speculate that this resemblance leads to the RBF-based probe's stronger performance.
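
The kernelization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the squared-distance form of the structural probe from Hewitt and Manning (2019), d_B(h_i, h_j)^2 = ||B h_i - B h_j||^2, and applies the standard kernel trick by replacing each inner product with a kernel evaluation. The function names, the fixed `gamma` bandwidth, and the random toy vectors are illustrative assumptions, not the authors' implementation; the abstract itself does not give these formulas.

```python
import numpy as np

def linear_probe_sq_dist(B, h_i, h_j):
    """Squared distance under a linear map B: ||B h_i - B h_j||^2."""
    diff = B @ (h_i - h_j)
    return float(diff @ diff)

def linear_kernel(x, y):
    """Plain inner product; recovers the linear probe when plugged in below."""
    return float(x @ y)

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2); gamma is an assumed fixed bandwidth."""
    d = x - y
    return float(np.exp(-gamma * (d @ d)))

def kernel_probe_sq_dist(B, h_i, h_j, kernel):
    """Squared distance in the kernel's feature space:
    k(x, x) - 2 k(x, y) + k(y, y), with x = B h_i and y = B h_j.
    B remains the only learned parameter in this sketch."""
    x, y = B @ h_i, B @ h_j
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

# Toy data standing in for contextual representations (hypothetical sizes).
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 8))
h_i, h_j = rng.normal(size=8), rng.normal(size=8)

d_lin = linear_probe_sq_dist(B, h_i, h_j)
d_lin_k = kernel_probe_sq_dist(B, h_i, h_j, linear_kernel)
d_rbf = kernel_probe_sq_dist(
    B, h_i, h_j, lambda x, y: rbf_kernel(x, y, gamma=0.1)
)
```

With the linear kernel the kernelized distance coincides with the original probe, while the RBF variant reduces to 2 - 2 exp(-gamma ||B(h_i - h_j)||^2), a bounded non-linear function of the same linear distance. In this sketch the parameter count stays the same because only `B` is learned.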

Keywords

parsing; natural language parsing; natural language processing

Cite

@article{arxiv.2105.10185,
  title  = {A Non-Linear Structural Probe},
  author = {Jennifer C. White and Tiago Pimentel and Naomi Saphra and Ryan Cotterell},
  journal= {arXiv preprint arXiv:2105.10185},
  year   = {2021}
}

Comments

Accepted at NAACL 2021
