Related papers: Fast Greedy Subset Selection from Large Candidate …
Subset selection is a popular topic in recent years and a number of subset selection methods have been proposed. Among those methods, hypervolume subset selection is widely used. Greedy hypervolume subset selection algorithms can achieve…
Subset selection, which aims to select a subset from a ground set to maximize some objective function, arises in various applications such as influence maximization and sensor placement. In real-world scenarios, however, one often needs to…
Greedy algorithms are widely used for problems in machine learning such as feature selection and set function optimization. Unfortunately, for large datasets, the running time of even greedy algorithms can be quite high. This is because for…
Subset selection is an important component in evolutionary multiobjective optimization (EMO) algorithms. Clustering, as a classic method to group similar data points together, has been used for subset selection in some fields. However,…
The problem of column subset selection has recently attracted a large body of research, with feature selection serving as one obvious and important application. Among the techniques that have been applied to solve this problem, the greedy…
We study the problem of selecting a subset of vectors from a large set, to obtain the best signal representation over a family of functions. Although greedy methods have been widely used for tackling this problem and many of those have been…
In this paper, we propose a new framework for designing fast parallel algorithms for fundamental statistical subset selection tasks that include feature selection and experimental design. Such tasks are known to be weakly submodular and are…
We study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. This problem can be viewed in the context of both feature selection and sparse…
In general, a multi-objective optimization problem does not have a single optimal solution but a set of Pareto optimal solutions, which forms the Pareto front in the objective space. Various evolutionary algorithms have been proposed to…
There are many problems in machine learning and data mining which are equivalent to selecting a non-redundant, high "quality" set of objects. Recommender systems, feature selection, and data summarization are among many applications of…
To cope with the high level of ambiguity faced in domains such as Computer Vision or Natural Language processing, robust prediction methods often search for a diverse set of high-quality candidate solutions or proposals. In structured…
The submodular maximization problem is widely applicable in many engineering problems where objectives exhibit diminishing returns. While this problem is known to be NP-hard for certain subclasses of objective functions, there is a greedy…
The hypervolume indicator is one of the most used set-quality indicators for the assessment of stochastic multiobjective optimizers, as well as for selection in evolutionary multiobjective optimization algorithms. Its theoretical properties…
Active learning is increasingly adopted for expensive multi-objective combinatorial optimization problems, but it involves a challenging subset selection problem, optimizing the batch acquisition score that quantifies the goodness of a…
Multimodal learning considers learning from multi-modality data, aiming to fuse heterogeneous sources of information. However, it is not always feasible to leverage all available modalities due to memory constraints. Further, training on…
Automatic prompt optimization reduces manual prompt engineering, but relies on task performance measured on a small, often randomly sampled evaluation subset as its main source of feedback signal. Despite this, how to select that evaluation…
Modern datasets span billions of samples, making training on all available data infeasible. Selecting a high quality subset helps in reducing training costs and enhancing model quality. Submodularity, a discrete analogue of convexity, is…
Simultaneous operation of all sensors in a large-scale sensor network is power-consuming and computationally expensive. Hence, it is desirable to select fewer sensors. A greedy algorithm is widely used for sensor selection in homogeneous…
The Column Subset Selection Problem provides a natural framework for unsupervised feature selection. Despite being a hard combinatorial optimization problem, there exist efficient algorithms that provide good approximations. The drawback of…
In today's information systems, the availability of massive amounts of data necessitates the development of fast and accurate algorithms to summarize these data and represent them in a succinct format. One crucial problem in big data…