Related papers: Maximum mutual information regularized classificat…

The Minimum Information Principle for Discriminative Learning

Exponential models of distributions are widely used in machine learning for classification and modelling. It is well known that they can be interpreted as maximum entropy models under empirical expectation constraints. In this work, we…

Machine Learning · Computer Science 2012-07-19 Amir Globerson, Naftali Tishby

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and a key is the existence of well trained classifiers on benchmark datasets. As a main-stream loss function, the cross entropy can…

Machine Learning · Computer Science 2022-09-22 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems

Deep learning systems have been reported to achieve state-of-the-art performances in many applications, and one of the keys for achieving this is the existence of well trained classifiers on benchmark datasets which can be used as backbone…

Machine Learning · Computer Science 2022-10-04 Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao

Regularized maximum correntropy machine

In this paper we investigate the usage of regularized correntropy framework for learning of classifiers from noisy labels. The class label predictors learned by minimizing transitional loss functions are sensitive to the noisy and outlying…

Machine Learning · Computer Science 2015-01-20 Jim Jing-Yan Wang, Yunji Wang, Bing-Yi Jing, Xin Gao

Regularization via Mass Transportation

The goal of regression and classification methods in supervised learning is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce…

Optimization and Control · Mathematics 2019-07-15 Soroosh Shafieezadeh-Abadeh, Daniel Kuhn, Peyman Mohajerin Esfahani

Multi-class Classification from Multiple Unlabeled Datasets with Partial Risk Regularization

Recent years have witnessed a great success of supervised deep learning, where predictive models were trained from a large amount of fully labeled data. However, in practice, labeling such big data can be very costly and may not even be…

Machine Learning · Computer Science 2022-10-18 Yuting Tang, Nan Lu, Tianyi Zhang, Masashi Sugiyama

Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes…

Machine Learning · Computer Science 2024-04-17 Chris Cundy, Rishi Desai, Stefano Ermon

The Role of Mutual Information in Variational Classifiers

Overfitting data is a well-known phenomenon related with the generation of a model that mimics too closely (or exactly) a particular instance of data, and may therefore fail to predict future observations reliably. In practice, this…

Machine Learning · Statistics 2023-04-14 Matias Vera, Leonardo Rey Vega, Pablo Piantanida

Learning regularization and intensity-gradient-based fidelity for single image super resolution

How to extract more and useful information for single image super resolution is an imperative and difficult problem. Learning-based method is a representative method for such task. However, the results are not so stable as there may exist…

Image and Video Processing · Electrical Eng. & Systems 2020-03-25 Hu Liang, Shengrong Zhao

RLSEP: Learning Label Ranks for Multi-label Classification

Multi-label ranking maps instances to a ranked set of predicted labels from multiple possible classes. The ranking approach for multi-label learning problems received attention for its success in multi-label classification, with one of the…

Computer Vision and Pattern Recognition · Computer Science 2022-12-09 Emine Dari, V. Bugra Yesilkaynak, Alican Mertan, Gozde Unal

MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence…

Machine Learning · Computer Science 2022-06-08 Giulio Isacchini, Natanael Spisak, Armita Nourmohammad, Thierry Mora, Aleksandra M. Walczak

Efficient Multiclass Implementations of L1-Regularized Maximum Entropy

This paper discusses the application of L1-regularized maximum entropy modeling or SL1-Max [9] to multiclass categorization problems. A new modification to the SL1-Max fast sequential learning algorithm is proposed to handle conditional…

Machine Learning · Computer Science 2007-05-23 Patrick Haffner, Steven Phillips, Rob Schapire

Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer

In this work, we develop a novel regularizer to improve the learning of long-range dependency of sequence data. Applied on language modelling, our regularizer expresses the inductive bias that sequence variables should have high mutual…

Machine Learning · Computer Science 2020-02-25 Yanshuai Cao, Peng Xu

An Optimization Framework for Semi-Supervised and Transfer Learning using Multiple Classifiers and Clusterers

Unsupervised models can provide supplementary soft constraints to help classify new, "target" data since similar instances in the target set are more likely to share the same class label. Such models can also help detect possible…

Machine Learning · Computer Science 2012-06-06 Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh, Sreangsu Acharyya

Connecting the Dots Between MLE and RL for Sequence Prediction

Sequence prediction models can be learned from example sequences with a variety of training algorithms. Maximum likelihood learning is simple and efficient, yet can suffer from compounding error at test time. Reinforcement learning such as…

Machine Learning · Computer Science 2019-07-02 Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing

Learning with Privileged Information for Multi-Label Classification

In this paper, we propose a novel approach for learning multi-label classifiers with the help of privileged information. Specifically, we use similarity constraints to capture the relationship between available information and privileged…

Computer Vision and Pattern Recognition · Computer Science 2017-03-30 Shiyu Chen, Shangfei Wang, Tanfang Chen, Xiaoxiao Shi

Learning Curves for Mutual Information Maximization

An unsupervised learning procedure based on maximizing the mutual information between the outputs of two networks receiving different but statistically dependent inputs is analyzed (Becker and Hinton, Nature, 355, 92, 161). For a generic…

Disordered Systems and Neural Networks · Physics 2009-11-10 Robert Urbanczik

Multi-Granularity Regularized Re-Balancing for Class Incremental Learning

Deep learning models suffer from catastrophic forgetting when learning new tasks incrementally. Incremental learning has been proposed to retain the knowledge of old classes while learning to identify new classes. A typical approach is to…

Computer Vision and Pattern Recognition · Computer Science 2022-07-14 Huitong Chen, Yu Wang, Qinghua Hu

Neural Network Classifier as Mutual Information Evaluator

Cross-entropy loss with softmax output is a standard choice to train neural network classifiers. We give a new view of neural network classifiers with softmax and cross-entropy as mutual information evaluators. We show that when the dataset…

Machine Learning · Computer Science 2021-08-17 Zhenyue Qin, Dongwoo Kim, Tom Gedeon

Notes on Generalizing the Maximum Entropy Principle to Uncertain Data

The principle of maximum entropy is a broadly applicable technique for computing a distribution with the least amount of information possible constrained to match empirical data, for instance, feature expectations. We seek to generalize…

Information Theory · Computer Science 2022-05-30 Kenneth Bogert