Related papers: Singular Learning Theory for Factor Analysis
Recent advances have clarified theoretical learning accuracy in Bayesian inference, revealing that the asymptotic behavior of metrics such as generalization loss and free energy, assessing predictive accuracy, is dictated by a rational…
Recent advances have clarified theoretical learning accuracy in Bayesian inference, revealing that the asymptotic behavior of metrics such as generalization loss and free energy, assessing predictive accuracy, is dictated by a rational…
In singular models, the optimal set of parameters forms an analytic set with singularities and classical statistical inference cannot be applied to such models. This is significant for deep learning as neural networks are singular and thus…
Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we…
Three-layer neural networks are known to form singular learning models, and their Bayesian asymptotic behavior is governed by the learning coefficient, or real log canonical threshold. Although this quantity has been clarified for regular…
Deep learning has seen substantial achievements, with numerical and theoretical evidence suggesting that singularities of statistical models are considered a contributing factor to its performance. From this remarkable success of classical…
We introduce a novel Information Criterion (IC), termed Learning under Singularity (LS), designed to enhance the functionality of the Widely Applicable Bayes Information Criterion (WBIC) and the Singular Bayesian Information Criterion…
The marginal likelihood or evidence in Bayesian statistics contains an intrinsic penalty for larger model sizes and is a fundamental quantity in Bayesian model comparison. Over the past two decades, there has been steadily increasing…
The learning coefficient plays a crucial role in analyzing the performance of information criteria, such as the Widely Applicable Information Criterion (WAIC) and the Widely Applicable Bayesian Information Criterion (WBIC), which Sumio…
The accurate asymptotic evaluation of marginal likelihood integrals is a fundamental problem in Bayesian statistics. Following the approach introduced by Watanabe, we translate this into a problem of computational algebraic geometry,…
Singular statistical models-including mixtures, matrix factorization, and neural networks-violate regular asymptotics due to parameter non-identifiability and degenerate Fisher geometry. Although singular learning theory characterizes…
The Computational Algebraic Geometry applied in Algebraic Statistics; are beginning to exploring new branches and applications; in artificial intelligence and others areas. Currently, the development of the mathematics is very extensive and…
We develop a correspondence between the structure of Turing machines and the structure of singularities of real analytic functions, based on connecting the Ehrhard-Regnier derivative from linear logic with the role of geometry in Watanabe's…
Learning machines which have hierarchical structures or hidden variables are singular statistical models because they are nonidentifiable and their Fisher information matrices are singular. In singular statistical models, neither the Bayes…
In statistical learning, models are classified as regular or singular depending on whether the mapping from parameters to probability distributions is injective. Most models with hierarchical structures or latent variables are singular, for…
Bayes statistics and statistical physics have the common mathematical structure, where the log likelihood function corresponds to the random Hamiltonian. Recently, it was discovered that the asymptotic learning curves in Bayes estimation…
In this work, we advocate for the importance of singular learning theory (SLT) as it pertains to the theory and practice of variational inference in Bayesian neural networks (BNNs). To begin, using SLT, we lay to rest some of the confusion…
Latent variable models are popularly used to measure latent factors (e.g., abilities and personalities) from large-scale assessment data. Beyond understanding these latent factors, the covariate effect on responses controlling for latent…
Many learning machines such as normal mixtures and layered neural networks are not regular but singular statistical models, because the map from a parameter to a probability distribution is not one-to-one. The conventional statistical…
Hierarchical learning models, such as mixture models and Bayesian networks, are widely employed for unsupervised learning tasks, such as clustering analysis. They consist of observable and hidden variables, which represent the given data…