Related papers: Complement Objective Training

Mixing between the Cross Entropy and the Expectation Loss Terms

The cross entropy loss is widely used due to its effectiveness and solid theoretical grounding. However, as training progresses, the loss tends to focus on hard-to-classify samples, which may prevent the network from obtaining gains in…

Machine Learning · Computer Science 2021-09-14 Barak Battash, Lior Wolf, Tamir Hazan

Contrastive Classification and Representation Learning with Probabilistic Interpretation

Cross entropy loss has served as the main objective function for classification-based tasks. Widely deployed for learning neural network classifiers, it shows both effectiveness and a probabilistic interpretation. Recently, after the…

Computer Vision and Pattern Recognition · Computer Science 2022-11-08 Rahaf Aljundi, Yash Patel, Milan Sulc, Daniel Olmeda, Nikolay Chumerin

Training Efficiency and Robustness in Deep Learning

Deep Learning has revolutionized machine learning and artificial intelligence, achieving superhuman performance in several standard benchmarks. It is well-known that deep learning models are inefficient to train; they learn by processing…

Machine Learning · Computer Science 2021-12-03 Fartash Faghri

Imbalanced Image Classification with Complement Cross Entropy

Recently, deep learning models have achieved great success in computer vision applications, relying on large-scale class-balanced datasets. However, imbalanced class distributions still limit the wide applicability of these models due to…

Computer Vision and Pattern Recognition · Computer Science 2021-08-05 Yechan Kim, Younkwan Lee, Moongu Jeon

Two Complementary Perspectives to Continual Learning: Ask Not Only What to Optimize, But Also How

Recent years have seen considerable progress in the continual training of deep neural networks, predominantly thanks to approaches that add replay or regularization terms to the loss function to approximate the joint loss over all tasks so…

Machine Learning · Computer Science 2024-11-01 Timm Hess, Tinne Tuytelaars, Gido M. van de Ven

Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers

While neural network binary classifiers are often evaluated on metrics such as Accuracy and $F_1$-Score, they are commonly trained with a cross-entropy objective. How can this training-evaluation gap be addressed? While specific techniques…

Machine Learning · Computer Science 2022-06-03 Nathan Tsoi, Kate Candon, Deyuan Li, Yofti Milkessa, Marynel Vázquez

Unified Backpropagation for Multi-Objective Deep Learning

A common practice in most of deep convolutional neural architectures is to employ fully-connected layers followed by Softmax activation to minimize cross-entropy loss for the sake of classification. Recent studies show that substitution or…

Machine Learning · Computer Science 2017-10-23 Arash Shahriari

Accelerating Training of Deep Neural Networks with a Standardization Loss

A significant advance in accelerating neural network training has been the development of normalization methods, permitting the training of deep models both faster and with better accuracy. These advances come with practical challenges: for…

Machine Learning · Computer Science 2019-03-05 Jasmine Collins, Johannes Balle, Jonathon Shlens

Balanced softmax cross-entropy for incremental learning with and without memory

When incrementally trained on new classes, deep neural networks are subject to catastrophic forgetting which leads to an extreme deterioration of their performance on the old classes while learning the new ones. Using a small memory…

Machine Learning · Computer Science 2022-11-15 Quentin Jodelet, Xin Liu, Tsuyoshi Murata

Understanding Deep Learning Generalization by Maximum Entropy

Deep learning achieves remarkable generalization capability with overwhelming number of model parameters. Theoretical understanding of deep learning generalization receives recent attention yet remains not fully explored. This paper…

Machine Learning · Computer Science 2017-11-22 Guanhua Zheng, Jitao Sang, Changsheng Xu

Classes for Fast Maximum Entropy Training

Maximum entropy models are considered by many to be one of the most promising avenues of language modeling research. Unfortunately, long training times make maximum entropy research difficult. We present a novel speedup technique: we change…

Computation and Language · Computer Science 2007-05-23 Joshua Goodman

DCN+: Mixed Objective and Deep Residual Coattention for Question Answering

Traditional models for question answering optimize using cross entropy loss, which encourages exact answers at the cost of penalizing nearby or overlapping answers that are sometimes equally accurate. We propose a mixed objective that…

Computation and Language · Computer Science 2017-11-15 Caiming Xiong, Victor Zhong, Richard Socher

Aligning Multiclass Neural Network Classifier Criterion with Task Performance Metrics

Multiclass neural network classifiers are typically trained using cross-entropy loss but evaluated using metrics derived from the confusion matrix, such as Accuracy, $F_\beta$-Score, and Matthews Correlation Coefficient. This mismatch…

Machine Learning · Computer Science 2025-05-27 Deyuan Li, Taesoo Daniel Lee, Marynel Vázquez, Nathan Tsoi

The Coverage Principle: How Pre-Training Enables Post-Training

Language models demonstrate remarkable abilities when pre-trained on large text corpora and fine-tuned for specific tasks, but how and why pre-training shapes the success of the final model remains poorly understood. Notably, although…

Machine Learning · Statistics 2025-10-23 Fan Chen, Audrey Huang, Noah Golowich, Sadhika Malladi, Adam Block, Jordan T. Ash, Akshay Krishnamurthy, Dylan J. Foster

Neural Network Classifier as Mutual Information Evaluator

Cross-entropy loss with softmax output is a standard choice to train neural network classifiers. We give a new view of neural network classifiers with softmax and cross-entropy as mutual information evaluators. We show that when the dataset…

Machine Learning · Computer Science 2021-08-17 Zhenyue Qin, Dongwoo Kim, Tom Gedeon

Joint Training of Deep Ensembles Fails Due to Learner Collusion

Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model. Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of…

Machine Learning · Computer Science 2023-11-01 Alan Jeffares, Tennison Liu, Jonathan Crabbé, Mihaela van der Schaar

Unprejudiced Training Auxiliary Tasks Makes Primary Better: A Multi-Task Learning Perspective

Human beings can leverage knowledge from relative tasks to improve learning on a primary task. Similarly, multi-task learning methods suggest using auxiliary tasks to enhance a neural network's performance on a specific primary task.…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Yuanze Li, Chun-Mei Feng, Qilong Wang, Guanglei Yang, Wangmeng Zuo

Optimizing Information Loss Towards Robust Neural Networks

Neural Networks (NNs) are vulnerable to adversarial examples. Such inputs differ only slightly from their benign counterparts yet provoke misclassifications of the attacked NNs. The required perturbations to craft the examples are often…

Cryptography and Security · Computer Science 2020-09-30 Philip Sperl, Konstantin Böttinger

Curriculum-Based Imitation of Versatile Skills

Learning skills by imitation is a promising concept for the intuitive teaching of robots. A common way to learn such skills is to learn a parametric model by maximizing the likelihood given the demonstrations. Yet, human demonstrations are…

Machine Learning · Computer Science 2023-07-18 Maximilian Xiling Li, Onur Celik, Philipp Becker, Denis Blessing, Rudolf Lioutikov, Gerhard Neumann

An Entropy-Based Model for Hierarchical Learning

Machine learning is the dominant approach to artificial intelligence, through which computers learn from data and experience. In the framework of supervised learning, a necessity for a computer to learn from data accurately and efficiently…

Machine Learning · Statistics 2023-01-25 Amir R. Asadi