Pascal Germain
Generalization bounds for deep learning models are typically vacuous, not computable or restricted to specific model classes. In this paper, we tackle these issues by providing new disagreement-based certificates for the gap between the…
Continual learning algorithms aim to learn from a sequence of tasks. In order to avoid catastrophic forgetting, most existing approaches rely on heuristics and do not provide computable learning guarantees. In this paper, we introduce…
Classical PAC generalization bounds on the prediction risk of a classifier are insufficient to provide theoretical guarantees on fairness when the goal is to learn models balancing predictive risk and fairness constraints. We propose a…
PAC-Bayes is a popular and efficient framework for obtaining generalization guarantees in situations involving uncountable hypothesis spaces. Unfortunately, in its classical formulation, it only provides guarantees on the expected risk of a…
Both PAC-Bayesian and Sample Compress learning frameworks are instrumental for deriving tight (non-vacuous) generalization bounds for neural networks. We leverage these results in a meta-learning scheme, relying on a hypernetwork that…
Rule-based models, such as decision trees, appeal to practitioners due to their interpretable nature. However, the learning algorithms that produce such models are often vulnerable to spurious associations and thus, they are not guaranteed…
The sample compression theory provides generalization guarantees for predictors that can be fully defined using a subset of the training dataset and a (short) message string, generally defined as a binary sequence. Previous works provided…
In this paper, we propose Phoneme Discretized Saliency Maps (PDSM), a discretization algorithm for saliency maps that takes advantage of phoneme boundaries for explainable detection of AI-generated voice. We experimentally show with two…
We study the use of binary activated neural networks as interpretable and explainable predictors in the context of regression tasks on tabular data; more specifically, we provide guarantees on their expressiveness, present an approach based…
Interpretability and explainability have gained more and more attention in the field of machine learning as they are crucial when it comes to high-stakes decisions and troubleshooting. Since both provide information about predictors and…
Since their inception, Variational Autoencoders (VAEs) have become central in machine learning. Despite their widespread use, numerous questions regarding their theoretical properties remain open. Using PAC-Bayesian theory, this work…
We extend PAC-Bayesian theory to generative models and develop generalization bounds for models based on the Wasserstein distance and the total variation distance. Our first result on the Wasserstein distance assumes the instance space is…
PAC-Bayesian bounds are known to be tight and informative when studying the generalization ability of randomized classifiers. However, they require a loose and costly derandomization step when applied to some families of deterministic…
It is crucial to detect when an instance lies downright too far from the training samples for the machine learning model to be trusted, a challenge known as out-of-distribution (OOD) detection. For neural networks, one approach to this task…
Considering a probability distribution over parameters is known as an efficient strategy to learn a neural network with non-differentiable activation functions. We study the expectation of a probabilistic neural network as a predictor by…
We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions:…
In the PAC-Bayesian literature, the C-Bound refers to an insightful relation between the risk of a majority vote classifier (under the zero-one loss) and the first two moments of its margin (i.e., the expected margin and the voters'…
Contrastive unsupervised representation learning (CURL) is the state-of-the-art technique to learn representations (as a set of features) from unlabelled data. While CURL has collected several empirical successes recently, theoretical…
We present a comprehensive study of multilayer neural networks with binary activation, relying on the PAC-Bayesian theory. Our contributions are twofold: (i) we develop an end-to-end framework to train a binary activated deep neural…
In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. [10]. The improvements are twofold. First, the proposed error bound is tighter, and converges to the generalization loss with a…