Related papers: Classes for Fast Maximum Entropy Training

Efficient Multiclass Implementations of L1-Regularized Maximum Entropy

This paper discusses the application of L1-regularized maximum entropy modeling or SL1-Max [9] to multiclass categorization problems. A new modification to the SL1-Max fast sequential learning algorithm is proposed to handle conditional…

Machine Learning · Computer Science 2007-05-23 Patrick Haffner, Steven Phillips, Rob Schapire

Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models

The maximum entropy method has recently been successfully introduced to a variety of natural language applications. In each of these applications, however, the power of the maximum entropy method is achieved at the cost of a considerable…

cmp-lg · Computer Science 2008-02-03 John D. Lafferty, Bernhard Suhm

Maximum Entropy Regularization and Chinese Text Recognition

Chinese text recognition is more challenging than Latin text due to the large amount of fine-grained Chinese characters and the great imbalance over classes, which causes a serious overfitting problem. We propose to apply Maximum Entropy…

Computer Vision and Pattern Recognition · Computer Science 2020-07-10 Changxu Cheng, Wuheng Xu, Xiang Bai, Bin Feng, Wenyu Liu

Maximum Entropy Flow Networks

Maximum entropy modeling is a flexible and popular framework for formulating statistical models given partial knowledge. In this paper, rather than the traditional method of optimizing over the continuous density directly, we learn a smooth…

Methodology · Statistics 2017-05-01 Gabriel Loaiza-Ganem, Yuanjun Gao, John P. Cunningham

Efficient Neural Task Adaptation by Maximum Entropy Initialization

Transferring knowledge from one neural network to another has been shown to be helpful for learning tasks with few training examples. Prevailing fine-tuning methods could potentially contaminate pre-trained features by comparably high…

Machine Learning · Computer Science 2019-07-15 Farshid Varno, Behrouz Haji Soleimani, Marzie Saghayi, Lisa Di Jorio, Stan Matwin

Improving the Speed of Response of Learning Algorithms Using Multiple Models

This is the first of a series of papers that the authors propose to write on the subject of improving the speed of response of learning systems using multiple models. During the past two decades, the first author has worked on numerous…

Machine Learning · Computer Science 2015-11-02 Kumpati S. Narendra, Snehasis Mukhopadhyay, Yu Wang

Learning Maximum Entropy Models from finite size datasets: a fast Data-Driven algorithm allows sampling from the posterior distribution

Maximum entropy models provide the least constrained probability distributions that reproduce statistical properties of experimental datasets. In this work we characterize the learning dynamics that maximizes the log-likelihood in the case…

Disordered Systems and Neural Networks · Physics 2016-09-21 Ulisse Ferrari

Entropy-aware Masking for Masked Language Modeling

Masked language modeling has become a standard pretraining objective for training encoder-based language models. In this approach, certain tokens in the input are masked, and the model learns to predict them using the surrounding context.…

Artificial Intelligence · Computer Science 2026-05-28 Gokul Srinivasagan, Kai Hartung, Munir Georges

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

Although deep learning has made great progress in recent years, the exploding economic and environmental costs of training neural networks are becoming unsustainable. To address this problem, there has been a great deal of research on…

Machine Learning · Computer Science 2023-03-22 Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech. The use of large-scale models trained on vast amounts of data holds immense promise for…

Machine Learning · Computer Science 2023-04-10 Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao

Action Redundancy in Reinforcement Learning

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm which seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple…

Machine Learning · Computer Science 2021-07-27 Nir Baram, Guy Tennenholtz, Shie Mannor

Understanding Deep Learning Generalization by Maximum Entropy

Deep learning achieves remarkable generalization capability with overwhelming number of model parameters. Theoretical understanding of deep learning generalization receives recent attention yet remains not fully explored. This paper…

Machine Learning · Computer Science 2017-11-22 Guanhua Zheng, Jitao Sang, Changsheng Xu

Know Your Limits: Entropy Estimation Modeling for Compression and Generalization

Language prediction is constrained by informational entropy intrinsic to language, such that there exists a limit to how accurate any language model can become and equivalently a lower bound to language compression. The most efficient…

Computation and Language · Computer Science 2025-11-14 Benjamin L. Badger, Matthew Neligeorge

Accelerating Deep Learning with Fixed Time Budget

The success of modern deep learning is attributed to two key elements: huge amounts of training data and large model sizes. Where a vast amount of data allows the model to learn more features, the large model architecture boosts the…

Machine Learning · Computer Science 2024-10-08 Muhammad Asif Khan, Ridha Hamila, Hamid Menouar

MEMe: An Accurate Maximum Entropy Method for Efficient Approximations in Large-Scale Machine Learning

Efficient approximation lies at the heart of large-scale machine learning problems. In this paper, we propose a novel, robust maximum entropy algorithm, which is capable of dealing with hundreds of moments and allows for computationally…

Machine Learning · Statistics 2019-06-05 Diego Granziol, Binxin Ru, Stefan Zohren, Xiaowen Dong, Michael Osborne, Stephen Roberts

Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest…

Computation and Language · Computer Science 2020-11-06 Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung

Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models

Large language models have shown remarkable performance across a wide range of language tasks, owing to their exceptional capabilities in context modeling. The most commonly used method of context modeling is full self-attention, as seen in…

Computation and Language · Computer Science 2025-06-26 Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, Hongming Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu

Random versus maximum entropy models of neural population activity

The principle of maximum entropy provides a useful method for inferring statistical mechanics models from observations in correlated systems, and is widely used in a variety of fields where accurate data are available. While the assumptions…

Neurons and Cognition · Quantitative Biology 2017-06-02 Ulisse Ferrari, Tomoyuki Obuchi, Thierry Mora

Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors

Model usage is the central challenge of model-based reinforcement learning. Although dynamics model based on deep neural networks provide good generalization for single step prediction, such ability is over exploited when it is used to…

Machine Learning · Computer Science 2020-06-30 Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K Prasanna

The Case for Meta-Cognitive Machine Learning: On Model Entropy and Concept Formation in Deep Learning

Machine learning is usually defined in behaviourist terms, where external validation is the primary mechanism of learning. In this paper, I argue for a more holistic interpretation in which finding more probable, efficient and abstract…

Artificial Intelligence · Computer Science 2017-11-07 Johan Loeckx