Related papers: Learning Dynamic Feature Selection for Fast Sequen…

Training for Fast Sequential Prediction Using Dynamic Feature Selection

We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning…

Computation and Language · Computer Science · 2014-12-23 · Emma Strubell, Luke Vilnis, Andrew McCallum

Searching for Discriminative Words in Multidimensional Continuous Feature Space

Word feature vectors have been proven to improve many NLP tasks. With recent advances in unsupervised learning of these feature vectors, it became possible to train them with much more data, which also resulted in better quality of learned…

Computation and Language · Computer Science · 2022-11-29 · Marius Sajgalik, Michal Barla, Maria Bielikova

Using predefined vector systems to speed up neural network multimillion class classification

Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we…

Machine Learning · Computer Science · 2026-04-07 · Nikita Gabdullin, Ilya Androsov

Combining Discrete and Neural Features for Sequence Labeling

Neural network models have recently received heated research attention in the natural language processing community. Compared with traditional models with discrete features, neural models have two main advantages. First, they take…

Computation and Language · Computer Science · 2017-08-25 · Jie Yang, Zhiyang Teng, Meishan Zhang, Yue Zhang

Embedding Lexical Features via Low-Rank Tensors

Modern NLP models rely heavily on engineered features, which often combine word and contextual information into complex lexical features. Such combination results in large numbers of features, which can lead to over-fitting. We present a…

Computation and Language · Computer Science · 2016-04-05 · Mo Yu, Mark Dredze, Raman Arora, Matthew Gormley

Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction

Deep neural networks (DNN) have achieved remarkable success in various fields, including computer vision and natural language processing. However, training an effective DNN model still poses challenges. This paper aims to propose a method…

Machine Learning · Computer Science · 2024-07-03 · Hejie Ying, Mengmeng Song, Yaohong Tang, Shungen Xiao, Zimin Xiao

A Linear Dynamical System Model for Text

Low dimensional representations of words allow accurate NLP models to be trained on limited annotated data. While most representations ignore words' local context, a natural way to induce context-dependent representations is to perform…

Machine Learning · Statistics · 2015-06-02 · David Belanger, Sham Kakade

Subset Sampling For Progressive Neural Network Learning

Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data. While this approach exempts the users from the manual task of designing…

Machine Learning · Computer Science · 2020-05-26 · Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis

Efficient Sequence Packing without Cross-contamination: Accelerating Large Language Models without Impacting Performance

Effective training of today's large language models (LLMs) depends on large batches and long sequences for throughput and accuracy. To handle variable-length sequences on hardware accelerators, it is common practice to introduce padding…

Computation and Language · Computer Science · 2022-10-07 · Mario Michael Krell, Matej Kosec, Sergio P. Perez, Andrew Fitzgibbon

Improving Neural Sequence Labelling using Additional Linguistic Information

Sequence labelling is the task of assigning categorical labels to a data sequence. In Natural Language Processing, sequence labelling can be applied to various fundamental problems, such as Part of Speech (POS) tagging, Named Entity…

Computation and Language · Computer Science · 2018-07-31 · Mahtab Ahmed, Muhammad Rifayat Samee, Robert E. Mercer

Understanding and Mitigating Classification Errors Through Interpretable Token Patterns

State-of-the-art NLP methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors,…

Computation and Language · Computer Science · 2023-11-21 · Michael A. Hedderich, Jonas Fischer, Dietrich Klakow, Jilles Vreeken

Protein sequence classification using natural language processing techniques

Purpose: This study aimed to enhance protein sequence classification using natural language processing (NLP) techniques while addressing the impact of sequence similarity on model performance. We compared various machine learning and deep…

Quantitative Methods · Quantitative Biology · 2025-05-26 · Huma Perveen, Julie Weeds

A Full-Stack Search Technique for Domain Optimized Deep Learning Accelerators

The rapidly-changing deep learning landscape presents a unique opportunity for building inference accelerators optimized for specific datacenter-scale workloads. We propose Full-stack Accelerator Search Technique (FAST), a hardware…

Machine Learning · Computer Science · 2022-02-02 · Dan Zhang, Safeen Huda, Ebrahim Songhori, Kartik Prabhu, Quoc Le, Anna Goldie, Azalia Mirhoseini

Deep Natural Language Feature Learning for Interpretable Prediction

We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task. Our method allows for representing…

Computation and Language · Computer Science · 2024-02-02 · Felipe Urrutia, Cristian Buc, Valentin Barriere

D4: Improving LLM Pretraining via Document De-Duplication and Diversification

Over recent years, an increasing amount of compute and data has been poured into training large language models (LLMs), usually by doing one-pass learning on as many tokens as possible randomly selected from large-scale web corpora. While…

Computation and Language · Computer Science · 2023-08-24 · Kushal Tirumala, Daniel Simig, Armen Aghajanyan, Ari S. Morcos

Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging

Word segmentation plays a pivotal role in improving any Arabic NLP application. Therefore, a lot of research has been spent in improving its accuracy. Off-the-shelf tools, however, are: i) complicated to use and ii) domain/dialect…

Computation and Language · Computer Science · 2017-09-05 · Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Ahmed Abdelali, Yonatan Belinkov, Stephan Vogel

Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest…

Computation and Language · Computer Science · 2020-11-06 · Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung

FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection

Due to the vast testing space, the increasing demand for effective and efficient testing of deep neural networks (DNNs) has led to the development of various DNN test case prioritization techniques. However, the fact that DNNs can deliver…

Software Engineering · Computer Science · 2024-09-17 · Jialuo Chen, Jingyi Wang, Xiyue Zhang, Youcheng Sun, Marta Kwiatkowska, Jiming Chen, Peng Cheng

Embedded Federated Feature Selection with Dynamic Sparse Training: Balancing Accuracy-Cost Tradeoffs

Federated Learning (FL) enables multiple resource-constrained edge devices with varying levels of heterogeneity to collaboratively train a global model. However, devices with limited capacity can create bottlenecks and slow down model…

Machine Learning · Computer Science · 2025-04-08 · Afsaneh Mahanipour, Hana Khamfroush

Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning

We show how to predict the basic word-order facts of a novel language given only a corpus of part-of-speech (POS) sequences. We predict how often direct objects follow their verbs, how often adjectives follow their nouns, and in general the…

Computation and Language · Computer Science · 2017-10-12 · Dingquan Wang, Jason Eisner