Finding patterns in Knowledge Attribution for Transformers

Jeevesh Juneja; Ritu Agarwal

Computation and Language, Machine Learning. 2022-05-05, v2

Abstract

We analyze the Knowledge Neurons framework for attributing factual and relational knowledge to particular neurons in a transformer network. We use a 12-layer multilingual BERT model for our experiments. Our study reveals several interesting phenomena. We observe that factual knowledge can mostly be attributed to the middle and higher layers of the network ($\ge 6$). Further analysis reveals that the middle layers ($6$-$9$) are mostly responsible for relational information, which is further refined into actual factual knowledge, or the "correct answer", in the last few layers ($10$-$12$). Our experiments also show that the model handles prompts that express the same fact in different languages similarly, providing further evidence for the effectiveness of multilingual pre-training. Applying the attribution scheme to grammatical knowledge, we find that grammatical knowledge is far more dispersed among the neurons than factual knowledge.
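
As a rough illustration of what attributing knowledge to a neuron can mean, the sketch below computes an integrated-gradients-style score for each neuron of a toy model. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sigmoid readout standing in for the probability of the correct answer, and all numeric values are hypothetical, and a real experiment would operate on the feed-forward activations of the BERT model instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def answer_probability(activations, readout):
    """Toy stand-in (assumption) for the model's probability of the correct answer."""
    z = sum(w * a for w, a in zip(readout, activations))
    return sigmoid(z)


def neuron_attribution(activations, readout, index, steps=50):
    """Riemann-sum approximation of an integrated-gradients score for one neuron.

    The chosen activation is scaled from 0 up to its observed value while
    all other activations stay fixed at their observed values.
    """
    observed = activations[index]
    grad_sum = 0.0
    for k in range(1, steps + 1):
        scaled = list(activations)
        scaled[index] = observed * k / steps
        p = answer_probability(scaled, readout)
        # Analytic gradient of the sigmoid readout w.r.t. this activation.
        grad_sum += p * (1.0 - p) * readout[index]
    return observed * grad_sum / steps


# Hypothetical activations and readout weights, for illustration only.
activations = [0.9, 0.1, 0.5]
readout = [2.0, -1.0, 0.0]
scores = [neuron_attribution(activations, readout, i) for i in range(3)]
```

In this toy setting, a neuron whose readout weight is zero receives a score of zero, and each score approximates the change in the answer probability when that neuron alone is switched from zero to its observed activation. Grouping such scores by layer is one way a layer-wise pattern like the one described above could be examined.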

Keywords

BERT, model, transformation, transformer

Cite

@article{arxiv.2205.01366,
  title  = {Finding patterns in Knowledge Attribution for Transformers},
  author = {Jeevesh Juneja and Ritu Agarwal},
  journal= {arXiv preprint arXiv:2205.01366},
  year   = {2022}
}

Comments

Remove unnecessary files; Correct Typos;
