Related papers: Sample Elicitation
Prediction in a small-sized sample with a large number of covariates, the "small n, large p" problem, is challenging. This setting is encountered in multiple applications, such as precision medicine, where obtaining additional samples can…
Federated learning provides a promising paradigm for collecting machine learning models from distributed data sources without compromising users' data privacy. The success of a credible federated learning system builds on the assumption…
This article introduces a new method for eliciting prior distributions from experts. The method models an expert decision-making process to infer a prior probability distribution for a rare event $A$. More specifically, assuming there…
We study the problem of eliciting and aggregating probabilistic information from multiple agents. In order to successfully aggregate the predictions of agents, the principal needs to elicit some notion of confidence from agents, capturing…
Incorporation of expert information in inference or decision settings is often important, especially in cases where data are unavailable, costly or unreliable. One approach is to elicit prior quantiles from an expert and then to fit these…
Recent work [14] has introduced a method for prior elicitation that utilizes records of expert decisions to infer a prior distribution. While this method provides a promising approach to eliciting expert uncertainty, it has only been…
Distributed estimation that recruits potentially large groups of humans to collect data about a phenomenon of interest has emerged as a paradigm applicable to a broad range of detection and estimation tasks. However, it also presents a…
Sampling from multivariate normal distributions, subjected to a variety of restrictions, is a problem that is recurrent in statistics and computing. In the present work, we demonstrate a general framework to efficiently sample a…
A central characteristic of Bayesian statistics is the ability to consistently incorporate prior knowledge into various modeling processes. In this paper, we focus on translating domain expert knowledge into corresponding prior…
Given a learning problem with real-world tradeoffs, which cost function should the model be trained to optimize? This is the metric selection problem in machine learning. Despite its practical interest, there is limited formal guidance on…
When facing uncertainty, decision-makers want predictions they can trust. A machine learning provider can convey confidence to decision-makers by guaranteeing their predictions are distribution calibrated -- amongst the inputs that receive…
We introduce the study of sequential information elicitation in strategic multi-agent systems. In an information elicitation setup, a center attempts to compute the value of a function based on private information (a.k.a. secrets) accessible…
Eliciting information to reduce uncertainty about a latent entity is a critical task in many application domains, e.g., assessing individual student learning outcomes, diagnosing underlying diseases, or learning user preferences. Though…
The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning. Most works study this problem under very weak assumptions, in which case it is provably…
We consider the problem of imitation learning from a finite set of expert trajectories, without access to reinforcement signals. The classical approach of extracting the expert's reward function via inverse reinforcement learning, followed…
An analyst is tasked with producing a statistical study. The analyst is not monitored and is able to manipulate the study. He can receive payments contingent on his report and trusted data collected from an independent source, modeled as a…
Knowledge distillation is an effective technique that transfers knowledge from a large teacher model to a shallow student. However, just like massive classification, large scale knowledge distillation also imposes heavy computational costs…
We study the problem of efficiently estimating counts for queries involving complex filters, such as user-defined functions, or predicates involving self-joins and correlated subqueries. For such queries, traditional sampling techniques may…
Scoring rules evaluate probabilistic forecasts of an unknown state against the realized state and are a fundamental building block in the incentivized elicitation of information. This paper develops mechanisms for scoring elicited text…
The boom in DL technology has led to massive DL models being built and shared, which facilitates the acquisition and reuse of DL models. For a given task, we encounter multiple DL models available with the same functionality, which are considered…