Partial Tensorized Transformers for Natural Language Processing

Computation and Language 2023-11-01 v1 Machine Learning

Abstract

The transformer architecture has revolutionized Natural Language Processing (NLP) and other machine-learning tasks thanks to its unprecedented accuracy. However, its extensive memory and parameter requirements often hinder practical applications. In this work, we study the use of tensor-train decomposition to compress transformer vision-language neural networks, namely BERT and ViT, and to improve their accuracy. We focus on both embedding-layer compression and partial tensorization of neural networks (PTNN) through an algorithmic approach. Our novel PTNN approach significantly improves the accuracy of existing models by up to 5%, without the need for post-training adjustments, breaking new ground in the field of tensor decomposition.
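
For readers unfamiliar with tensor-train decomposition, the following is a minimal sketch of the standard TT-SVD procedure, not the authors' PTNN implementation. The function names `tt_svd` and `tt_reconstruct`, the `max_rank` parameter, and the example tensor shape are assumptions chosen for illustration. It shows how a dense weight, reshaped into a higher-order tensor, can be stored as a chain of small cores whose total parameter count is controlled by the rank cap.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores via sequential truncated SVDs.

    Each core has shape (rank_in, mode_size, rank_out); ranks are capped
    at `max_rank` (an illustrative parameter, not taken from the paper).
    """
    shape = tensor.shape
    cores = []
    rank = 1
    mat = tensor
    for n in shape[:-1]:
        # Unfold so that rows combine the incoming rank with the current mode.
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        cores.append(u[:, :new_rank].reshape(rank, n, new_rank))
        # Carry the remaining factor (singular values times V^T) forward.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the cores back into a dense tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.reshape([c.shape[1] for c in cores])


# Hypothetical example: a 64 x 64 weight matrix viewed as an 8 x 8 x 8 x 8 tensor.
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 64)).reshape(8, 8, 8, 8)
cores = tt_svd(weight, max_rank=4)
print("dense parameters:", weight.size)
print("TT parameters:   ", sum(core.size for core in cores))
```

A random matrix like the one above has no low-rank structure, so the truncated reconstruction is lossy; the sketch only illustrates the storage trade-off. Tensorized embedding layers are often built with a TT-matrix variant that pairs row and column indices in each core, which this plain TT sketch does not do.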

Cite

@article{arxiv.2310.20077,
  title  = {Partial Tensorized Transformers for Natural Language Processing},
  author = {Subhadra Vadlamannati and Ryan Solgi},
  journal= {arXiv preprint arXiv:2310.20077},
  year   = {2023}
}

Comments

Under review at the 16th International Conference on Agents and Artificial Intelligence