Related papers: Fair and Diverse DPP-based Data Summarization
Determinantal Point Processes (DPPs) are probabilistic models that arise in quantum physics and random matrix theory and have recently found numerous applications in computer science. DPPs define distributions over subsets of a given ground…
Subsampling from a large data set is useful in many supervised learning contexts to provide a global view of the data based on only a fraction of the observations. Diverse (or space-filling) subsampling is an appealing subsampling approach…
Fairness in multi-document summarization of user-generated content remains a critical challenge in natural language processing (NLP). Existing summarization methods often fail to ensure equitable representation across different social…
Due to the recent cases of algorithmic bias in data-driven decision-making, machine learning methods are being put under the microscope in order to understand the root cause of these biases and how to correct them. Here, we consider a basic…
Submodular maximization has become established as the method of choice for the task of selecting representative and diverse summaries of data. However, if datapoints have sensitive attributes such as gender or age, such machine learning…
We propose a novel diverse feature selection method based on determinantal point processes (DPPs). Our model enables one to flexibly define diversity based on the covariance of features (similar to orthogonal matching pursuit) or…
The most important obstacles facing multi-document summarization include excessive redundancy in source descriptions and the looming shortage of training data. These obstacles prevent encoder-decoder models from being used directly, but…
Determinantal Point Processes (DPPs) are elegant probabilistic models of repulsion and diversity over discrete sets of items. But their applicability to large sets is hindered by expensive cubic-complexity matrix operations for basic tasks…
In some practical learning tasks, such as traffic video analysis, the number of available training samples is restricted by different factors, such as limited communication bandwidth and computation power. Determinantal Point Process (DPP)…
Determinantal point processes (DPPs), which arise in random matrix theory and quantum physics, are natural models for subset selection problems where diversity is preferred. Among many remarkable properties, DPPs offer tractable algorithms…
Determinantal point processes (DPPs) are well known models for diverse subset selection problems, including recommendation tasks, document summarization and image search. In this paper, we discuss a greedy deterministic adaptation of k-DPP.…
Dynamic data selection accelerates training by sampling a changing subset of the dataset while preserving accuracy. We rethink two core notions underlying sample evaluation: representativeness and diversity. Instead of local geometric…
Subset selection problems ask for a small, diverse yet representative subset of the given data. When pairwise similarities are captured by a kernel, the determinants of submatrices provide a measure of diversity or independence of items…
People from different social and demographic groups express diverse perspectives and conflicting opinions on a broad set of topics such as product reviews, healthcare, law, and politics. A fair summary should provide a comprehensive…
It is now much easier than ever before to produce videos. While the ubiquitous video data is a great source for information discovery and extraction, the computational challenges are unparalleled. Automatically summarizing the videos has…
The goal of diversity sampling is to select a representative subset of data in a way that maximizes information contained in the subset while keeping its cardinality small. We introduce the ordered diverse sampling problem based on a new…
Online feature selection has been an active research area in recent years. We propose a novel diverse online feature selection method based on Determinantal Point Processes (DPP). Our model aims to provide diverse features which can be…
Determinantal point processes (DPPs) are an important concept in random matrix theory and combinatorics. They have also recently attracted interest in the study of numerical methods for machine learning, as they offer an elegant "missing…
We study core-set construction algorithms for the task of Diversity Maximization under fairness/partition constraint. Given a set of points $P$ in a metric space partitioned into $m$ groups, and given $k_1,\ldots,k_m$, the goal of this…
Condensing large datasets into smaller synthetic counterparts has demonstrated its promise for image classification. However, previous research has overlooked a crucial concern in image recognition: ensuring that models trained on condensed…