Related papers: Error-Correcting Neural Sequence Prediction

Code-switching pre-training for neural machine translation

This paper proposes a new pre-training method, called Code-Switching Pre-training (CSP for short), for Neural Machine Translation (NMT). Unlike traditional pre-training methods, which randomly mask some fragments of the input sentence, the…

Computation and Language · Computer Science 2020-09-18 Zhen Yang, Bojie Hu, Ambyera Han, Shen Huang, Qi Ju

On Error Correction Neural Networks for Economic Forecasting

Recurrent neural networks (RNNs) are more suitable for learning non-linear dependencies in dynamical systems from observed time series data. In practice all the external variables driving such systems are not known a priori, especially in…

Machine Learning · Computer Science 2020-06-02 Mhlasakululeka Mvubu, Emmanuel Kabuga, Christian Plitz, Bubacarr Bah, Ronnie Becker, Hans Georg Zimmermann

Functional Error Correction for Robust Neural Networks

When neural networks (NeuralNets) are implemented in hardware, their weights need to be stored in memory devices. As noise accumulates in the stored weights, the NeuralNet's performance will degrade. This paper studies how to use error…

Information Theory · Computer Science 2020-01-14 Kunping Huang, Paul Siegel, Anxiao Jiang

Ensemble Learning using Error Correcting Output Codes: New Classification Error Bounds

New bounds on classification error rates for the error-correcting output code (ECOC) approach in machine learning are presented. These bounds have exponential decay complexity with respect to codeword length and theoretically validate the…

Machine Learning · Computer Science 2021-09-21 Hieu D. Nguyen, Mohammed Sarosh Khan, Nicholas Kaegi, Shen-Shyang Ho, Jonathan Moore, Logan Borys, Lucas Lavalva

Balancing Stability and Plasticity in Sequentially Trained Early-Exiting Neural Networks

Early-exiting neural networks enable adaptive inference by allowing inputs to exit at intermediate classifiers, reducing computation for easy samples while maintaining high accuracy. In practice, exits can be trained sequentially by…

Machine Learning · Computer Science 2026-05-08 Alaa Zniber, Ouassim Karrakchou, Mounir Ghogho

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

Scheduled sampling is widely used to mitigate the exposure bias problem for neural machine translation. Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus…

Computation and Language · Computer Science 2021-09-01 Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation

This work presents our ongoing research of unsupervised pretraining in neural machine translation (NMT). In our method, we initialize the weights of the encoder and decoder with two language models that are trained with monolingual data and…

Computation and Language · Computer Science 2020-10-20 Dušan Variš, Ondřej Bojar

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the…

Computation and Language · Computer Science 2026-01-15 Giorgio Franceschelli, Mirco Musolesi

$k$-Neighbor Based Curriculum Sampling for Sequence Prediction

Multi-step ahead prediction in language models is challenging due to the discrepancy between training and test time processes. At test time, a sequence predictor is required to make predictions given past predictions as the input, instead…

Computation and Language · Computer Science 2021-01-26 James O'Neill, Danushka Bollegala

An Empirical Investigation of Contextualized Number Prediction

We conduct a large-scale empirical investigation of contextualized number prediction in running text. Specifically, we consider two tasks: (1) masked number prediction: predicting a missing numerical value within a sentence, and (2) numerical…

Computation and Language · Computer Science 2020-11-17 Daniel Spokoyny, Taylor Berg-Kirkpatrick

Variational Classification

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers. We derive a variational objective to train the model, analogous to the evidence lower bound…

Machine Learning · Computer Science 2024-01-10 Shehzaad Dhuliawala, Mrinmaya Sachan, Carl Allen

Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation

Despite strong performance in many sequence-to-sequence tasks, autoregressive models trained with maximum likelihood estimation suffer from exposure bias, i.e. the discrepancy between the ground-truth prefixes used during training and the…

Computation and Language · Computer Science 2023-01-11 Michalis Korakakis, Andreas Vlachos

An Error-Oriented Approach to Word Embedding Pre-Training

We propose a novel word embedding pre-training approach that exploits writing errors in learners' scripts. We compare our method to previous models that tune the embeddings based on script scores and the discrimination between correct and…

Computation and Language · Computer Science 2019-07-05 Youmna Farag, Marek Rei, Ted Briscoe

SelecMix: Debiased Learning by Contradicting-pair Sampling

Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are strongly correlated with undesirable features. To…

Computer Vision and Pattern Recognition · Computer Science 2022-11-07 Inwoo Hwang, Sangjun Lee, Yunhyeok Kwak, Seong Joon Oh, Damien Teney, Jin-Hwa Kim, Byoung-Tak Zhang

Improved Variational Inference in Discrete VAEs using Error Correcting Codes

Despite advances in deep probabilistic models, learning discrete latent representations remains challenging. This work introduces a novel method to improve inference in discrete Variational Autoencoders by reframing the inference problem…

Machine Learning · Computer Science 2025-06-11 María Martínez-García, Grace Villacrés, David Mitchell, Pablo M. Olmos

Neural Machine Translation via Binary Code Prediction

In this paper, we propose a new method for calculating the output layer in neural machine translation systems. The method is based on predicting a binary code for each word and can reduce computation time/memory requirements of the output…

Computation and Language · Computer Science 2017-04-25 Yusuke Oda, Philip Arthur, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura

Integer Programming-based Error-Correcting Output Code Design for Robust Classification

Error-Correcting Output Codes (ECOCs) offer a principled approach for combining simple binary classifiers into multiclass classifiers. In this paper, we investigate the problem of designing optimal ECOCs to achieve both nominal and…

Machine Learning · Computer Science 2020-11-03 Samarth Gupta, Saurabh Amin

Neural Network Training via Stochastic Alternating Minimization with Trainable Step Sizes

The training of deep neural networks is inherently a nonconvex optimization problem, yet standard approaches such as stochastic gradient descent (SGD) require simultaneous updates to all parameters, often leading to unstable convergence and…

Machine Learning · Computer Science 2025-08-07 Chengcheng Yan, Jiawei Xu, Zheng Peng, Qingsong Wang

ECOC-Based Training of Neural Networks for Face Recognition

Error Correcting Output Codes, ECOC, is an output representation method capable of discovering some of the errors produced in classification tasks. This paper describes the application of ECOC to the training of feed forward neural…

Computer Vision and Pattern Recognition · Computer Science 2016-11-18 Nima Hatami, Reza Ebrahimpour, Reza Ghaderi

Edit Probability for Scene Text Recognition

We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximal likelihood loss to optimize the models. When we…

Computer Vision and Pattern Recognition · Computer Science 2018-05-10 Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou