Related papers: Dualheap Selection Algorithm: Efficient, Inherentl…
An algorithm is presented that efficiently solves the selection problem: finding the k-th smallest member of a set. Relevant to a divide-and-conquer strategy, the algorithm also partitions a set into small and large valued subsets. Applied…
A generalization of the heapsort algorithm is proposed. At the expense of about 50% more comparison and move operations for typical cases, the dualheap sort algorithm offers several advantages over heapsort: improved cache performance,…
Randomized parallel algorithms for many fundamental problems achieve optimal linear work in expectation, but upgrading this guarantee to hold with high probability (whp) remains a recurring theoretical challenge. In this paper, we address…
Sorting is one of the most basic algorithms, and developing highly parallel sorting programs is becoming increasingly important in high-performance computing because the number of CPU cores per node in modern supercomputers tends to…
The goal of coreset selection in supervised learning is to produce a weighted subset of data, so that training only on the subset achieves similar performance as training on the entire dataset. Existing methods achieved promising results in…
Feature selection with large-scale high-dimensional data is important yet very challenging in machine learning and data mining. Online feature selection is a promising new paradigm that is more efficient and scalable than batch feature…
Motivated by the development of computer theory, new sorting algorithms keep emerging in an endless stream. Inspired by the decrease-and-conquer method, we propose a brand new sorting algorithm, Ultimately Heapsort. The algorithm consists of two…
We present a new connection between self-adjusting binary search trees (BSTs) and heaps, two fundamental, extensively studied, and practically relevant families of data structures. Roughly speaking, we map an arbitrary heap algorithm within…
We consider the Top-$K$ selection problem, which aims to identify the largest $K$ elements in an array. Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for…
We investigate distributed memory parallel sorting algorithms that scale to the largest available machines and are robust with respect to input size and distribution of the input elements. The main outcome is that four sorting algorithms…
Feature subset selection, as a special case of the general subset selection problem, has been the topic of a considerable number of studies due to the growing importance of data-mining applications. In the feature subset selection problem…
Partitioning a graph into blocks of "roughly equal" weight while cutting only few edges is a fundamental problem in computer science with a wide range of applications. In particular, the problem is a building block in applications that…
Sorting is one of the most fundamental problems in the field of computer science. With the rapid development of manycore processors, it is increasingly important to design efficient parallel sorting algorithms for manycore architectures. This paper…
Many of the artificial intelligence techniques developed to date rely on heuristic search through large spaces. Unfortunately, the size of these spaces and the corresponding computational effort reduce the applicability of otherwise novel…
Nucleus decompositions have been shown to be a useful tool for finding dense subgraphs. The coreness value of a clique represents its density based on the number of other cliques it is adjacent to. One useful output of nucleus decomposition…
This paper studies the nucleus decomposition problem, which has been shown to be useful in finding dense substructures in graphs. We present a novel parallel algorithm that is efficient both in theory and in practice. Our algorithm achieves…
Feature importance ranking has become a powerful tool for explainable AI. However, its nature of combinatorial optimization poses a great challenge for deep learning. In this paper, we propose a novel dual-net architecture consisting of…
The Smart Sort algorithm is a "smart" fusion of the heap construction procedures of the Heap sort algorithm into the conventional "Partition" function of the Quick sort algorithm, resulting in a robust version of Quick sort. We have also…
The problem of finding dense components of a graph is a widely explored area in data analysis, with diverse applications in fields and branches of study including community mining, spam detection, computer security and bioinformatics. This…
We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We…