Related papers: Fair and Representative Subset Selection from Data…

200 papers

Diversity maximization is a fundamental problem with wide applications in data summarization, web search, and recommender systems. Given a set $X$ of $n$ elements, it asks to select a subset $S$ of $k \ll n$ elements with maximum…

Data Structures and Algorithms · Computer Science 2023-04-27 Yanhao Wang, Francesco Fabbri, Michael Mathioudakis

Submodular maximization has become established as the method of choice for the task of selecting representative and diverse summaries of data. However, if datapoints have sensitive attributes such as gender or age, such machine learning…

Machine Learning · Computer Science 2020-10-20 Marwa El Halabi, Slobodan Mitrović, Ashkan Norouzi-Fard, Jakab Tardos, Jakub Tarnawski

Streaming submodular maximization is a natural model for the task of selecting a representative subset from a large-scale dataset. If datapoints have sensitive attributes such as gender or race, it becomes important to enforce fairness to…

Machine Learning · Computer Science 2025-11-25 Marwa El Halabi, Federico Fusco, Ashkan Norouzi-Fard, Jakab Tardos, Jakub Tarnawski

We study the classic NP-hard problem of finding the maximum $k$-set coverage in the data stream model: given a set system of $m$ sets that are subsets of a universe $\{1,\ldots,n\}$, find the $k$ sets that cover the largest number of distinct…

Data Structures and Algorithms · Computer Science 2018-05-11 Andrew McGregor, Hoa T. Vu

In this paper, we develop the first one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once. By carefully subsampling each element of data stream, our algorithm enjoys the tightest…

Machine Learning · Computer Science 2018-02-21 Moran Feldman, Amin Karbasi, Ehsan Kazemi

In this paper, we propose a novel framework that converts streaming algorithms for monotone submodular maximization into streaming algorithms for non-monotone submodular maximization. This reduction readily leads to the currently tightest…

Data Structures and Algorithms · Computer Science 2020-02-11 Ran Haba, Ehsan Kazemi, Moran Feldman, Amin Karbasi

We consider the problem of monotone, submodular maximization over a ground set of size $n$ subject to cardinality constraint $k$. For this problem, we introduce the first deterministic algorithms with linear time complexity; these…

Data Structures and Algorithms · Computer Science 2021-03-09 Alan Kuhnle

In this paper we study the extraction of representative elements in the data stream model in the form of submodular maximization. Different from the previous work on streaming submodular maximization, we are interested only in the recent…

Data Structures and Algorithms · Computer Science 2016-11-02 Jiecao Chen, Huy L. Nguyen, Qin Zhang

Diversity maximization aims to select a diverse and representative subset of items from a large dataset. It is a fundamental optimization task that finds applications in data summarization, feature selection, web search, recommender…

Data Structures and Algorithms · Computer Science 2023-04-27 Yanhao Wang, Michael Mathioudakis, Jia Li, Francesco Fabbri

Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a…

Machine Learning · Computer Science 2019-05-14 Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

Maximizing a submodular function has a wide range of applications in machine learning and data mining. One such application is data summarization whose goal is to select a small set of representative and diverse data items from a large…

Machine Learning · Computer Science 2023-03-10 Jing Yuan, Shaojie Tang

In recent years, the problem of computing the frequencies of the induced $k$-vertex subgraphs of a graph, or \emph{$k$-graphlets}, has become central. One approach for this problem is to sample $k$-graphlets randomly. Classic algorithms for…

Data Structures and Algorithms · Computer Science 2026-04-29 Marco Bressan, T-H. Hubert Chan, Qipeng Kuang, Mauro Sozio

Many real-world applications pose challenges in incorporating fairness constraints into the $k$-center clustering problem, where the dataset consists of $m$ demographic groups, each with a specified upper bound on the number of centers to…

Data Structures and Algorithms · Computer Science 2026-01-19 Longkun Guo, Zeyu Lin, Chaoqi Jia, Chao Chen

Stimulated by practical applications arising from viral marketing, this paper investigates a novel Budgeted $k$-Submodular Maximization problem defined as follows: Given a finite set $V$, a budget $B$ and a $k$-submodular function $f:…

Data Structures and Algorithms · Computer Science 2021-10-25 Canh V. Pham, Quang C. Vu, Dung K. T. Ha, Tai T. Nguyen

We revisit the MaxSAT problem in the data stream model. In this problem, the stream consists of $m$ clauses that are disjunctions of literals drawn from $n$ Boolean variables. The objective is to find an assignment to the variables that…

Data Structures and Algorithms · Computer Science 2022-08-22 Hoa T. Vu

Cardinality constrained submodular function maximization, which aims to select a subset of size at most $k$ to maximize a monotone submodular utility function, is the key in many data mining and machine learning applications such as data…

Data Structures and Algorithms · Computer Science 2018-11-15 Junzhou Zhao, Shuo Shang, Pinghui Wang, John C. S. Lui, Xiangliang Zhang

Diversity maximization is a well-studied problem where the goal is to find $k$ diverse items. Fair diversity maximization aims to select a diverse subset of $k$ items from a large dataset, while requiring that each group of items be…

Data Structures and Algorithms · Computer Science 2025-06-11 Florian Adriaens, Nikolaj Tatti

We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / \varepsilon^2)$…

Data Structures and Algorithms · Computer Science 2020-08-11 Naor Alaluf, Alina Ene, Moran Feldman, Huy L. Nguyen, Andrew Suh

Constrained $k$-submodular maximization is a general framework that captures many discrete optimization problems such as ad allocation, influence maximization, personalized recommendation, and many others. In many of these applications,…

Data Structures and Algorithms · Computer Science 2023-05-26 Fabian Spaeh, Alina Ene, Huy L. Nguyen

The task of extracting a diverse subset from a dataset, often referred to as maximum diversification, plays a pivotal role in various real-world applications that have far-reaching consequences. In this work, we delve into the realm of…

Databases · Computer Science 2025-06-16 Yash Kurkure, Miles Shamo, Joseph Wiseman, Sainyam Galhotra, Stavros Sintos