Related papers: Learnability, Sample Complexity, and Hypothesis Cl…
Proper learning refers to the setting in which learners must emit predictors in the underlying hypothesis class $H$, and often leads to learners with simple algorithmic forms (e.g. empirical risk minimization (ERM), structural risk…
Precision and Recall are fundamental metrics in machine learning tasks where both accurate predictions and comprehensive coverage are essential, such as in multi-label learning, language generation, medical studies, and recommender systems.…
We introduce a novel technique for verification and model synthesis of sequential programs. Our technique is based on learning a regular model of the set of feasible paths in a program, and testing whether this model contains an incorrect…
The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing…
In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the…
As learning solutions reach critical applications in social, industrial, and medical domains, the need to curtail their behavior has become paramount. There is now ample evidence that without explicit tailoring, learning can lead to biased,…
In most real-world applications of artificial intelligence, the distributions of the data and the goals of the learners tend to change over time. The Probably Approximately Correct (PAC) learning framework, which underpins most machine…
This paper proposes a data-driven systematic, consistent and non-exhaustive approach to Model Selection, that is an extension of the classical agnostic PAC learning model. In this approach, learning problems are modeled not only by a…
Probably Approximately Correct (i.e., PAC) learning is a core concept of sample complexity theory, and efficient PAC learnability is often seen as a natural counterpart to the class P in classical computational complexity. But while the…
A supervised learning algorithm has access to a distribution of labeled examples, and needs to return a function (hypothesis) that correctly labels the examples. The hypothesis of the learner is taken from some fixed class of functions…
Most models of machine teaching and learning assume the learner makes no errors in its internal deductive inference. However, humans and large language models in few-shot learning regimes are two important examples of learners where this…
In reinforcement learning, the classic objectives of maximizing discounted and finite-horizon cumulative rewards are PAC-learnable: There are algorithms that learn a near-optimal policy with high probability using a finite amount of samples…
In the problem of learning with label proportions, which we call LLP learning, the training data is unlabeled, and only the proportions of examples receiving each label are given. The goal is to learn a hypothesis that predicts the…
An active learner is given a class of models, a large set of unlabeled examples, and the ability to interactively query labels of a subset of these examples; the goal of the learner is to learn a model in the class that fits the data well.…
Strategic classification studies learning settings in which individuals can modify their features, at a cost, in order to influence the classifier's decision. A central question is how the sample complexity of the induced (strategic)…
Continual learning, or lifelong learning, is a formidable current challenge to machine learning. It requires the learner to solve a sequence of $k$ different learning tasks, one after the other, while retaining its aptitude for earlier…
We extend the theory of PAC learning in a way which allows to model a rich variety of learning tasks where the data satisfy special properties that ease the learning process. For example, tasks where the distance of the data from the…
We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples to the training set to fool the algorithm into making mistakes on specific test…
How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of…
Monotone learning describes learning processes in which expected performance consistently improves as the amount of training data increases. However, recent studies challenge this conventional wisdom, revealing significant gaps in the…