Related papers: Knowledge Transfer Pre-training

Informed Pre-Training on Prior Knowledge

When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training…

Machine Learning · Computer Science 2022-05-24 Laura von Rueden, Sebastian Houben, Kostadin Cvejoski, Christian Bauckhage, Nico Piatkowski

Recurrent Neural Network Training with Dark Knowledge Transfer

Recurrent neural networks (RNNs), particularly long short-term memory (LSTM), have gained much attention in automatic speech recognition (ASR). Although some successful stories have been reported, training RNNs remains highly challenging,…

Machine Learning · Statistics 2016-09-21 Zhiyuan Tang, Dong Wang, Zhiyong Zhang

Joint Training Deep Boltzmann Machines for Classification

We introduce a new method for training deep Boltzmann machines jointly. Prior methods of training DBMs require an initial learning pass that trains the model greedily, one layer at a time, or do not perform well on classification tasks. In…

Machine Learning · Statistics 2013-05-02 Ian J. Goodfellow, Aaron Courville, Yoshua Bengio

How to Train Your Deep Neural Network with Dictionary Learning

Currently there are two predominant ways to train deep neural networks. The first one uses restricted Boltzmann machines (RBMs) and the second one autoencoders. RBMs are stacked in layers to form a deep belief network (DBN); the final…

Machine Learning · Computer Science 2016-12-23 Vanika Singhal, Shikha Singh, Angshul Majumdar

Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification

Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source…

Machine Learning · Computer Science 2021-04-27 Francisco Utrera, Evan Kravitz, N. Benjamin Erichson, Rajiv Khanna, Michael W. Mahoney

Forward Thinking: Building and Training Neural Networks One Layer at a Time

We present a general framework for training deep neural networks without backpropagation. This substantially decreases training time and also allows for construction of deep networks with many sorts of learners, including networks whose…

Machine Learning · Statistics 2017-06-09 Chris Hettinger, Tanner Christensen, Ben Ehlert, Jeffrey Humpherys, Tyler Jarvis, Sean Wade

Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition

Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models, and generally yields improved performance and faster training times. The technique of pre-training on one task and then…

Machine Learning · Computer Science 2020-01-03 Nishai Kooverjee, Steven James, Terence van Zyl

Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models

Transfer learning aims to leverage knowledge from pre-trained models to benefit the target task. Prior transfer learning work mainly transfers from a single model. However, with the emergence of deep models pre-trained from different…

Machine Learning · Computer Science 2022-11-07 Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long

These Are Not All the Features You Are Looking For: A Fundamental Bottleneck in Supervised Pretraining

Transfer learning is widely used to adapt large pretrained models to new tasks with only a small amount of new data. However, a challenge persists: the features from the original task often do not fully cover what is needed for unseen…

Machine Learning · Computer Science 2026-02-10 Xingyu Alice Yang, Jianyu Zhang, Léon Bottou

Beyond Fine Tuning: A Modular Approach to Learning on Small Data

In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural…

Machine Learning · Computer Science 2016-11-08 Ark Anderson, Kyle Shaffer, Artem Yankov, Court D. Corley, Nathan O. Hodas

Deep Knowledge Tracing

Knowledge tracing, where a machine models the knowledge of a student as they interact with coursework, is a well-established problem in computer-supported education. Though effectively modeling student knowledge would have high…

Artificial Intelligence · Computer Science 2015-06-22 Chris Piech, Jonathan Spencer, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Why pre-training is beneficial for downstream classification tasks?

Pre-training has exhibited notable benefits to downstream tasks by boosting accuracy and speeding up convergence, but the exact reasons for these benefits still remain unclear. To this end, we propose to quantitatively and explicitly…

Machine Learning · Computer Science 2024-10-14 Xin Jiang, Xu Cheng, Zechao Li

A Transfer Learning Evaluation of Deep Neural Networks for Image Classification

Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Nermeen Abou Baker, Nico Zengeler, Uwe Handmann

Knowledge Projection for Deep Neural Networks

While deeper and wider neural networks are actively pushing the performance limits of various computer vision and machine learning tasks, they often require large sets of labeled data for effective training and suffer from extremely high…

Computer Vision and Pattern Recognition · Computer Science 2017-10-27 Zhi Zhang, Guanghan Ning, Zhihai He

muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems

Most uses of machine learning today involve training a model from scratch for a particular task, or sometimes starting with a model pretrained on a related task and then fine-tuning on a downstream task. Both approaches offer limited…

Machine Learning · Computer Science 2022-05-26 Andrea Gesmundo, Jeff Dean

Net2Net: Accelerating Learning via Knowledge Transfer

We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often…

Machine Learning · Computer Science 2016-04-26 Tianqi Chen, Ian Goodfellow, Jonathon Shlens

K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining

Though pre-trained language models such as BERT and XLNet have rapidly advanced the state of the art on many NLP tasks, they capture semantics only implicitly, relying on surface information between words in the corpus. Intuitively, background knowledge…

Computation and Language · Computer Science 2021-06-01 Ruiqing Yan, Lanchang Sun, Fang Wang, Xiaoming Zhang

Rethinking Two Consensuses of the Transferability in Deep Learning

Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Yixiong Chen, Jingxian Li, Chris Ding, Li Liu

Learning Deep Representations with Probabilistic Knowledge Transfer

Knowledge Transfer (KT) techniques tackle the problem of transferring the knowledge from a large and complex neural network into a smaller and faster one. However, existing KT methods are tailored towards classification tasks and they…

Machine Learning · Computer Science 2019-03-21 Nikolaos Passalis, Anastasios Tefas

Is Pretraining Necessary for Hyperspectral Image Classification?

We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is the pre-training effective in furthering the performance? To…

Computer Vision and Pattern Recognition · Computer Science 2019-01-28 Hyungtae Lee, Sungmin Eum, Heesung Kwon