Related papers: Knowledge Transfer Pre-training
When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training…
Recurrent neural networks (RNNs), particularly long short-term memory (LSTM), have gained much attention in automatic speech recognition (ASR). Although some success stories have been reported, training RNNs remains highly challenging,…
We introduce a new method for training deep Boltzmann machines jointly. Prior methods of training DBMs require an initial learning pass that trains the model greedily, one layer at a time, or do not perform well on classification tasks. In…
Currently there are two predominant ways to train deep neural networks. The first uses restricted Boltzmann machines (RBMs) and the second uses autoencoders. RBMs are stacked in layers to form a deep belief network (DBN); the final…
Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source…
We present a general framework for training deep neural networks without backpropagation. This substantially decreases training time and also allows for construction of deep networks with many sorts of learners, including networks whose…
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models, and generally yields improved performance and faster training times. The technique of pre-training on one task and then…
Transfer learning aims to leverage knowledge from pre-trained models to benefit the target task. Prior transfer learning work mainly transfers from a single model. However, with the emergence of deep models pre-trained from different…
Transfer learning is widely used to adapt large pretrained models to new tasks with only a small amount of new data. However, a challenge persists: the features from the original task often do not fully cover what is needed for unseen…
In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural…
Knowledge tracing, where a machine models the knowledge of a student as they interact with coursework, is a well-established problem in computer-supported education. Though effectively modeling student knowledge would have high…
Pre-training has exhibited notable benefits to downstream tasks by boosting accuracy and speeding up convergence, but the exact reasons for these benefits still remain unclear. To this end, we propose to quantitatively and explicitly…
Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages…
While deeper and wider neural networks are actively pushing the performance limits of various computer vision and machine learning tasks, they often require large sets of labeled data for effective training and suffer from extremely high…
Most uses of machine learning today involve training a model from scratch for a particular task, or sometimes starting with a model pretrained on a related task and then fine-tuning on a downstream task. Both approaches offer limited…
We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often…
Though pre-trained language models such as BERT and XLNet have rapidly advanced the state of the art on many NLP tasks, they learn implicit semantics relying only on surface information between words in the corpus. Intuitively, background knowledge…
Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for…
Knowledge Transfer (KT) techniques tackle the problem of transferring the knowledge from a large and complex neural network into a smaller and faster one. However, existing KT methods are tailored towards classification tasks and they…
We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is the pre-training effective in furthering the performance? To…