Related papers: Structured Multidimensional Representation Learnin…

Parameter-Efficient Transformer Embeddings

Embedding layers in transformer-based NLP models typically account for the largest share of model parameters, scaling with vocabulary size but not yielding performance gains proportional to scale. We propose an alternative approach in which…

Computation and Language · Computer Science 2025-05-06 Henry Ndubuaku, Mouad Talhi

I3D: Transformer architectures with input-dependent dynamic depth for speech recognition

Transformer-based end-to-end speech recognition has achieved great success. However, the large footprint and computational overhead make it difficult to deploy these models in some real-world applications. Model compression techniques can…

Computation and Language · Computer Science 2023-03-15 Yifan Peng, Jaesong Lee, Shinji Watanabe

Tensorized Embedding Layers for Efficient Model Compression

The embedding layers transforming input words into real vectors are the key components of deep neural networks used in natural language processing. However, when the vocabulary is large, the corresponding weight matrices can be enormous,…

Computation and Language · Computer Science 2020-02-20 Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena Orlova, Ivan Oseledets

TensorGPT: Efficient Compression of Large Language Models based on Tensor-Train Decomposition

High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, this high dimensionality also introduces…

Computation and Language · Computer Science 2024-10-07 Mingxue Xu, Yao Lei Xu, Danilo P. Mandic

Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing

The question of what kinds of linguistic information are encoded in different layers of Transformer-based language models is of considerable interest for the NLP community. Existing work, however, has overwhelmingly focused on word-level…

Computation and Language · Computer Science 2023-10-19 Dmitry Nikolaev, Sebastian Padó

No-Rank Tensor Decomposition Using Metric Learning

Tensor decomposition of high-dimensional data often struggles to capture semantically or physically meaningful structures, particularly when relying on reconstruction objectives and fixed-rank constraints. We introduce a no-rank tensor…

Machine Learning · Computer Science 2026-03-03 Maryam Bagherian

Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency

Transformer-based self-supervised models are trained as feature extractors and have empowered many downstream speech tasks to achieve state-of-the-art performance. However, both the training and inference process of these models may…

Computation and Language · Computer Science 2021-05-04 Jinchuan Tian, Rongzhi Gu, Helin Wang, Yuexian Zou

Low-Rank Plus Sparse Matrix Transfer Learning under Growing Representations and Ambient Dimensions

Learning systems often expand their ambient features or latent representations over time, embedding earlier representations into larger spaces with limited new latent structure. We study transfer learning for structured matrix estimation…

Machine Learning · Computer Science 2026-01-30 Jinhang Chai, Xuyuan Liu, Elynn Chen, Yujun Yan

Leveraging Decoder Architectures for Learned Sparse Retrieval

Learned Sparse Retrieval (LSR) has traditionally focused on small-scale encoder-only transformer architectures. With the advent of large-scale pre-trained language models, their capability to generate sparse representations for retrieval…

Information Retrieval · Computer Science 2025-04-28 Jingfen Qiao, Thong Nguyen, Evangelos Kanoulas, Andrew Yates

Structured Transforms for Small-Footprint Deep Learning

We consider the task of building compact deep learning pipelines suitable for deployment on storage and power constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are…

Machine Learning · Statistics 2015-10-07 Vikas Sindhwani, Tara N. Sainath, Sanjiv Kumar

Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses

How related are the representations learned by neural language models, translation models, and language tagging tasks? We answer this question by adapting an encoder-decoder transfer learning method from computer vision to investigate the…

Computation and Language · Computer Science 2025-12-11 Richard Antonello, Javier Turek, Vy Vo, Alexander Huth

SpecTran: Spectral-Aware Transformer-based Adapter for LLM-Enhanced Sequential Recommendation

Traditional sequential recommendation (SR) models learn low-dimensional item ID embeddings from user-item interactions, often overlooking textual information such as item titles or descriptions. Recent advances in Large Language Models…

Information Retrieval · Computer Science 2026-04-27 Yu Cui, Feng Liu, Zhaoxiang Wang, Changwang Zhang, Jun Wang, Can Wang, Jiawei Chen

SEED: A Structural Encoder for Embedding-Driven Decoding in Time Series Prediction with LLMs

Multivariate time series forecasting requires models to simultaneously capture variable-wise structural dependencies and generalize across diverse tasks. While structural encoders are effective in modeling feature interactions, they lack…

Computation and Language · Computer Science 2025-06-26 Fengze Li, Yue Wang, Yangle Liu, Ming Huang, Dou Hong, Jieming Ma

Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition

Transformer has achieved competitive performance against state-of-the-art end-to-end models in automatic speech recognition (ASR), and requires significantly less training time than RNN-based models. The original Transformer, with…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-14 Wenyong Huang, Wenchao Hu, Yu Ting Yeung, Xiao Chen

A Simplified Fully Quantized Transformer for End-to-end Speech Recognition

While significant improvements have been made in recent years in terms of end-to-end automatic speech recognition (ASR) performance, such improvements were obtained through the use of very large neural networks, unfit for embedded use on…

Computation and Language · Computer Science 2020-03-25 Alex Bie, Bharat Venkitesh, Joao Monteiro, Md. Akmal Haidar, Mehdi Rezagholizadeh

TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices

Small Language Models (SLMs, or on-device LMs) have significantly fewer parameters than Large Language Models (LLMs). They are typically deployed on low-end devices, like mobile phones and single-board computers. Unlike LLMs, which rely on…

Computation and Language · Computer Science 2025-06-17 Mingxue Xu, Yao Lei Xu, Danilo P. Mandic

Condenser: a Pre-training Architecture for Dense Retrieval

Pre-trained Transformer language models (LM) have become go-to text representation encoders. Prior research fine-tunes deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient…

Computation and Language · Computer Science 2021-09-22 Luyu Gao, Jamie Callan

Learning Efficient Tensor Representations with Ring Structure Networks

Tensor train (TT) decomposition is a powerful representation for high-order tensors, which has been successfully applied to various machine learning tasks in recent years. However, since the tensor product is not commutative, permutation of…

Numerical Analysis · Computer Science 2017-05-31 Qibin Zhao, Masashi Sugiyama, Andrzej Cichocki

Structured State Space Decoder for Speech Recognition and Synthesis

Automatic speech recognition (ASR) systems developed in recent years have shown promising results with self-attention models (e.g., Transformer and Conformer), which are replacing conventional recurrent neural networks. Meanwhile, a…

Sound · Computer Science 2022-11-01 Koichi Miyazaki, Masato Murata, Tomoki Koriyama

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition

This paper proposes Transducers with Pronunciation-aware Embeddings (PET). Unlike conventional Transducers where the decoder embeddings for different tokens are trained independently, the PET model's decoder embedding incorporates shared…

Computation and Language · Computer Science 2024-04-09 Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg