Related papers: Falsification and future performance
Recent research has demonstrated significant achievable performance gains by exploiting circularity/non-circularity or properness/improperness of complex-valued signals. In this paper, we investigate the influence of these properties on…
We use a formal correspondence between thermodynamics and inference, where the number of samples can be thought of as the inverse temperature, to study a quantity called "learning capacity", which is a measure of the effective…
There are (at least) three approaches to quantifying information. The first, algorithmic information or Kolmogorov complexity, takes events as strings and, given a universal Turing machine, quantifies the information content of a string as…
We study the excess capacity of deep networks in the context of supervised classification. That is, given a capacity measure of the underlying hypothesis class - in our case, empirical Rademacher complexity - to what extent can we (a…
Language models demonstrate remarkable abilities when pre-trained on large text corpora and fine-tuned for specific tasks, but how and why pre-training shapes the success of the final model remains poorly understood. Notably, although…
Machine learning models provide statistically impressive results which might be individually unreliable. To provide reliability, we propose an Epistemic Classifier (EC) that can provide justification of its belief using support from the…
An information-theoretic framework is introduced to analyze last-layer embedding, focusing on learned representations for regression tasks. We define representation-rate and derive limits on the reliability with which input-output…
The performance of an error correcting code is evaluated by its error probability, rate, and en/decoding complexity. The performance of a series of codes is evaluated by, as the block lengths approach infinity, whether their error…
The present paper is concerned with the question of how falsifiable a single proposition is in the short and long run. Formal Learning theorists such as Schulte and Juhl have argued that long-run falsifiability is characterized by the…
We propose a novel combination of optimization tools with learning theory bounds in order to analyze the sample complexity of optimal kernel sum classifiers. This contrasts with the typical learning theoretic results which hold for all…
Conformal Prediction (CP) is a powerful framework for constructing prediction sets with guaranteed coverage. However, recent studies have shown that integrating confidence calibration with CP can lead to a degradation in efficiency. In this…
Recent work has explored sequence-to-sequence latent variable models for expressive speech synthesis (supporting control and transfer of prosody and style), but has not presented a coherent framework for understanding the trade-offs between…
Conformal prediction (CP) provides a comprehensive framework to produce statistically rigorous uncertainty sets for black-box machine learning models. To further improve the efficiency of CP, conformal correction is proposed to fine-tune or…
Considering the difficulty of interpreting generative model output, there is significant current research focused on determining meaningful evaluation metrics. Several recent approaches utilize "precision" and "recall," borrowed from the…
Entropy estimation is an important problem with many applications in cryptography, statistics, and machine learning. Although estimators that are optimal with respect to sample complexity have recently been developed, there are still some…
The major problem in information theoretic analysis of neural responses and other biological data is the reliable estimation of entropy-like quantities from small samples. We apply a recently introduced Bayesian entropy estimator to…
Monotone learning describes learning processes in which expected performance consistently improves as the amount of training data increases. However, recent studies challenge this conventional wisdom, revealing significant gaps in the…
The one-shot classical capacity of a quantum channel quantifies the amount of classical information that can be transmitted through a single use of the channel such that the error probability is below a certain threshold. In this work, we…
Understanding how the test risk scales with model complexity is a central question in machine learning. Classical theory is challenged by the learning curves observed for large over-parametrized deep networks. Capacity measures based on…
With the recent development of quantum information theory, some attempts exist to construct information theory beyond quantum theory. Here we consider hypothesis testing relative entropy and one-shot classical capacity, that is, the optimal…