Related papers: Real-time Neural-based Input Method

200 papers

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies. Our approach, called adaptive softmax, circumvents the linear dependency on the vocabulary size by exploiting the…

Computation and Language · Computer Science 2017-06-20 Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou

In this paper, we propose a new method for calculating the output layer in neural machine translation systems. The method is based on predicting a binary code for each word and can reduce computation time/memory requirements of the output…

Computation and Language · Computer Science 2017-04-25 Yusuke Oda, Philip Arthur, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura

This study presents a Long Short-Term Memory (LSTM) neural network approach to Japanese word segmentation (JWS). Previous studies on Chinese word segmentation (CWS) succeeded in using recurrent neural networks such as LSTM and gated…

Computation and Language · Computer Science 2018-09-28 Yoshiaki Kitagawa, Mamoru Komachi

Neural language models have been widely used in various NLP tasks, including machine translation, next word prediction and conversational agents. However, it is challenging to deploy these models on mobile devices due to their slow…

Machine Learning · Computer Science 2018-10-31 Patrick H. Chen, Si Si, Sanjiv Kumar, Yang Li, Cho-Jui Hsieh

RNN-Transducer has been one of the promising architectures for end-to-end automatic speech recognition. Although RNN-Transducer has many advantages, including its strong accuracy and streaming-friendly property, its high memory consumption…

Audio and Speech Processing · Electrical Eng. & Systems 2022-04-01 Jaesong Lee, Lukas Lee, Shinji Watanabe

Large language models (LLMs) have brought transformative changes to human society. One of the key computations in LLMs is the softmax unit. This operation is important in LLMs because it allows the model to generate a distribution over possible…

Machine Learning · Computer Science 2023-04-27 Yichuan Deng, Zhihang Li, Zhao Song

Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses. As a case study, a state-of-the-art neural language model usually consists of one or more…

Computation and Language · Computer Science 2018-06-20 Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-Jui Hsieh

Extensive efforts have been made to boost the performance in the domain of language models by introducing various attention-based transformers. However, the inclusion of linear layers with large dimensions contributes to significant…

Machine Learning · Computer Science 2024-11-19 Priyansh Bhatnagar, Linfeng Wen, Mingu Kang

Recent research efforts focus on reducing the computational and memory overheads of Large Language Models (LLMs) to make them feasible on resource-constrained devices. Despite advancements in compression techniques, non-linear operators…

Hardware Architecture · Computer Science 2024-11-28 Mariam Rakka, Jinhao Li, Guohao Dai, Ahmed Eltawil, Mohammed E. Fouda, Fadi Kurdahi

This paper addresses the robust speech recognition problem as an adaptation task. Specifically, we investigate the cumulative application of adaptation methods. A bidirectional Long Short-Term Memory (BLSTM) based neural network, capable of…

Computation and Language · Computer Science 2019-06-17 Markus Kitza, Pavel Golik, Ralf Schlüter, Hermann Ney

We describe a large vocabulary speech recognition system that is accurate, has low latency, and yet has a small enough memory and computational footprint to run faster than real-time on a Nexus 5 Android smartphone. We employ a quantized…

There has been a rapid advance of custom hardware (HW) for accelerating the inference speed of deep neural networks (DNNs). Previously, the softmax layer was not a main concern of DNN accelerating HW, because its portion is relatively small…

Machine Learning · Computer Science 2021-11-23 Ihor Vasyltsov, Wooseok Chang

Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking…

Large language models (LLMs) have numerous real-life applications across various domains, such as natural language translation, sentiment analysis, language modeling, chatbots and conversational agents, creative writing, text…

Machine Learning · Computer Science 2025-02-18 Yeqi Gao, Zhao Song, Junze Yin

Statistical language models are central to many applications that use semantics. Recurrent Neural Networks (RNN) are known to produce state of the art results for language modelling, outperforming their traditional n-gram counterparts in…

Computation and Language · Computer Science 2016-02-05 Anantharaman Palacode Narayana Iyer

Recurrent Neural Networks and in particular Long Short-Term Memory (LSTM) networks have demonstrated state-of-the-art accuracy in several emerging Artificial Intelligence tasks. However, the models are becoming increasingly demanding in…

Computer Vision and Pattern Recognition · Computer Science 2018-01-10 Michalis Rizakis, Stylianos I. Venieris, Alexandros Kouris, Christos-Savvas Bouganis

Replicated Softmax model, a well-known undirected topic model, is powerful in extracting semantic representations of documents. Traditional learning strategies such as Contrastive Divergence are very inefficient. This paper provides a novel…

Machine Learning · Computer Science 2015-06-25 Jiatao Gu, Victor O. K. Li

The Softmax function is used in the final layer of nearly all existing sequence-to-sequence models for language generation. However, it is usually the slowest layer to compute, which limits the vocabulary size to a subset of most frequent…

Computation and Language · Computer Science 2019-03-25 Sachin Kumar, Yulia Tsvetkov

To encourage intra-class compactness and inter-class separability among trainable feature vectors, large-margin softmax methods are developed and widely applied in the face recognition community. The introduction of the large-margin concept…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-22 Jingjing Huo, Yingbo Gao, Weiyue Wang, Ralf Schlüter, Hermann Ney

Recently sequence-to-sequence models have started to achieve state-of-the-art performance on standard speech recognition tasks when processing audio data in batch mode, i.e., the complete audio data is available when starting processing.…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-28 Thai-Son Nguyen, Ngoc-Quan Pham, Sebastian Stueker, Alex Waibel