Test Set Selection using Active Information Acquisition for Predictive Models
Abstract
In this paper, we consider active information acquisition when the prediction model is meant to be applied to a targeted subset of the population. The goal is to label a pre-specified fraction of customers in the target (test) set by iteratively querying for information from the non-target (training) set. The number of queries is limited by an overall budget. Motivated by two rather disparate applications, banking and medical diagnosis, we pose active information acquisition as a constrained optimization problem and propose two greedy iterative algorithms for solving it. We conduct experiments with synthetic data and compare our proposed algorithms with a few baseline approaches. The experimental results show that our proposed approaches perform better than the baseline schemes.
Cite
@article{arxiv.1312.0790,
title = {Test Set Selection using Active Information Acquisition for Predictive Models},
author = {Sneha Chaudhari and Pankaj Dayama and Vinayaka Pandit and Indrajit Bhattacharya},
journal = {arXiv preprint arXiv:1312.0790},
year = {2014}
}
Comments
The paper has been withdrawn by the authors. The current version is incomplete and the work is still ongoing. The algorithm gives poor results for a particular setting, and we are working on it. However, we are not planning to submit a revision of the paper. This work is going to take some time, and we want to withdraw the current version since it is not in good shape and needs a lot more work to be in publishable condition.