Related papers: Efficient Weight factorization for Multilingual Sp…

Towards continually learning new languages

Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training. An ability to add new languages after the prior training sessions can be economically…

Computation and Language · Computer Science · 2024-07-19 · Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

Factorized Neural Transducer for Efficient Language Model Adaptation

In recent years, end-to-end (E2E) based automatic speech recognition (ASR) systems have achieved great success due to their simplicity and promising performance. Neural Transducer based models are increasingly popular in streaming E2E based…

Computation and Language · Computer Science · 2021-10-19 · Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li

Weight Factorization and Centralization for Continual Learning in Speech Recognition

Modern neural network based speech recognition models are required to continually absorb new data without re-training the whole system, especially in downstream applications using foundation models, having no access to the original training…

Computation and Language · Computer Science · 2025-06-23 · Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel

One-To-Many Multilingual End-to-end Speech Translation

Nowadays, training end-to-end neural models for spoken language translation (SLT) still has to confront with extreme data scarcity conditions. The existing SLT parallel corpora are indeed orders of magnitude smaller than those available for…

Computation and Language · Computer Science · 2019-10-09 · Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Fast and accurate factorized neural transducer for text adaption of end-to-end speech recognition models

Neural transducer is now the most popular end-to-end model for speech recognition, due to its naturally streaming ability. However, it is challenging to adapt it with text-only data. Factorized neural transducer (FNT) model was proposed to…

Computation and Language · Computer Science · 2023-02-24 · Rui Zhao, Jian Xue, Partha Parthasarathy, Veljko Miljanic, Jinyu Li

Tensorized Embedding Layers for Efficient Model Compression

The embedding layers transforming input words into real vectors are the key components of deep neural networks used in natural language processing. However, when the vocabulary is large, the corresponding weight matrices can be enormous,…

Computation and Language · Computer Science · 2020-02-20 · Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena Orlova, Ivan Oseledets

Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

Highly performing deep neural networks come at the cost of computational complexity that limits their practicality for deployment on portable devices. We propose the low-rank transformer (LRT), a memory-efficient and fast neural…

Computation and Language · Computer Science · 2020-02-17 · Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Pascale Fung

Random Weight Factorization Improves the Training of Continuous Neural Representations

Continuous neural representations have recently emerged as a powerful and flexible alternative to classical discretized representations of signals. However, training them to capture fine details in multi-scale signals is difficult and…

Machine Learning · Computer Science · 2022-10-06 · Sifan Wang, Hanwen Wang, Jacob H. Seidman, Paris Perdikaris

Balancing Training for Multilingual Neural Machine Translation

When training multilingual machine translation (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others. Standard practice is to up-sample…

Computation and Language · Computer Science · 2020-09-08 · Xinyi Wang, Yulia Tsvetkov, Graham Neubig

Neural Machine Translation by Generating Multiple Linguistic Factors

Factored neural machine translation (FNMT) is founded on the idea of using the morphological and grammatical decomposition of the words (factors) at the output side of the neural network. This architecture addresses two well-known problems…

Computation and Language · Computer Science · 2017-12-07 · Mercedes García-Martínez, Loïc Barrault, Fethi Bougares

A Factorized Recurrent Neural Network based architecture for medium to large vocabulary Language Modelling

Statistical language models are central to many applications that use semantics. Recurrent Neural Networks (RNN) are known to produce state of the art results for language modelling, outperforming their traditional n-gram counterparts in…

Computation and Language · Computer Science · 2016-02-05 · Anantharaman Palacode Narayana Iyer

Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition

State-of-the-art language models (LMs) represented by long-short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming increasingly complex and expensive for practical applications. Low-bit neural network…

Computation and Language · Computer Science · 2021-12-22 · Junhao Xu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into post-training binarization, which is simple but causes severe performance loss, and…

Machine Learning · Statistics · 2026-04-22 · Ba-Hien Tran, Van Minh Nguyen

Towards Language-Universal End-to-End Speech Recognition

Building speech recognizers in multiple languages typically involves replicating a monolingual training recipe for each language, or utilizing a multi-task learning approach where models for different languages have separate output labels…

Computation and Language · Computer Science · 2017-11-08 · Suyoun Kim, Michael L. Seltzer

An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks

Despite their tremendous success and versatility, Deep Neural Networks (DNNs) such as Large Language Models (LLMs) suffer from inference inefficiency and rely on advanced computational infrastructure. To address these challenges and make…

Machine Learning · Computer Science · 2025-05-05 · Mohsen Dehghankar, Mahdi Erfanian, Abolfazl Asudeh

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to…

Sound · Computer Science · 2023-06-30 · Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman

Factored Neural Machine Translation

We present a new approach for neural machine translation (NMT) using the morphological and grammatical decomposition of the words (factors) in the output side of the neural network. This architecture addresses two main problems occurring in…

Computation and Language · Computer Science · 2017-12-07 · Mercedes García-Martínez, Loïc Barrault, Fethi Bougares

Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains

Neural language modeling (LM) has led to significant improvements in several applications, including Automatic Speech Recognition. However, they typically require large amounts of training data, which is not available for many domains and…

Computation and Language · Computer Science · 2019-06-05 · Navid Rekabsaz, Nikolaos Pappas, James Henderson, Banriskhem K. Khonglah, Srikanth Madikeri

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages. However,…

Computation and Language · Computer Science · 2023-02-08 · Simeng Sun, Maha Elbayad, Anna Sun, James Cross

Accelerating Multilingual Language Model for Excessively Tokenized Languages

Recent advancements in large language models (LLMs) have remarkably enhanced performances on a variety of tasks in multiple languages. However, tokenizers in LLMs trained primarily on English-centric corpora often overly fragment a text…

Computation and Language · Computer Science · 2024-08-07 · Jimin Hong, Gibbeum Lee, Jaewoong Cho