Related papers: Low-Depth Parallel Algorithms for the Binary-Forki…

200 papers

In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference.…

Data Structures and Algorithms · Computer Science 2020-06-26 Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, Yihan Sun

We present a model of multithreaded computation, combining fork-join and single-instruction-multiple-data parallelisms, with an emphasis on estimating parallelism overheads of programs written for modern many-core architectures. We…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-02-04 Sardar Anisul Haque, Marc Moreno Maza, Ning Xie

Balanced search trees are widely used in computer science to efficiently maintain dynamic ordered data. To support efficient set operations (e.g., union, intersection, difference) using trees, the join-based framework is widely studied.…

Data Structures and Algorithms · Computer Science 2025-10-24 Michael Goodrich, Yan Gu, Ryuto Kitagawa, Yihan Sun

As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-01 Vijaya Ramachandran, Elaine Shi

In this paper we analyze, evaluate, and improve the performance of training generalized linear models on modern CPUs. We start with a state-of-the-art asynchronous parallel training algorithm, identify system-level performance bottlenecks,…

Machine Learning · Computer Science 2018-12-20 Nikolas Ioannou, Celestine Dünner, Kornilios Kourtis, Thomas Parnell

Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account…

Data Structures and Algorithms · Computer Science 2019-02-21 Max Bannach, Malte Skambath, Till Tantau

The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel…

Data Structures and Algorithms · Computer Science 2014-12-09 Arash Farzan, Alejandro López-Ortiz, Patrick K. Nicholson, Alejandro Salinger

As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies-data, model, sequence, and…

Machine Learning · Computer Science 2025-03-13 Ruifeng She, Bowen Pang, Kai Li, Zehua Liu, Tao Zhong

Minimum Spanning Tree (MST) is an important graph algorithm that has wide ranging applications in the areas of computer networks, VLSI routing, wireless communications among others. Today virtually every computer is built out of multi-core…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-15 Suryanarayana Murthy Durbhakula

When training large machine learning models with many variables or parameters, a single machine is often inadequate since the model may be too large to fit in memory, while training can take a long time even with stochastic updates. A…

Machine Learning · Statistics 2014-06-19 Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A. Gibson, Eric P. Xing

Arrival of multicore systems has enforced a new scenario in computing, the parallel and distributed algorithms are fast replacing the older sequential algorithms, with many challenges of these techniques. The distributed algorithms provide…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-13 Rajendra Purohit, K R Chowdhary, S D Purohit

Distributed model fitting refers to the process of fitting a mathematical or statistical model to the data using distributed computing resources, such that computing tasks are divided among multiple interconnected computers or nodes, often…

Computation · Statistics 2024-06-04 Xiaofei Wu, Rongmei Liang, Fabio Roli, Marcello Pelillo, Jing Yuan

We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to growing amounts of data as well as growing needs for accurate and confident predictions in critical applications. In contrast to other…

Machine Learning · Computer Science 2018-10-09 Michael Kamp, Mario Boley, Olana Missura, Thomas Gärtner

We present a work-efficient parallel level-synchronous Breadth First Search (BFS) algorithm for shared-memory architectures which achieves the theoretical lower bound on parallel running time. The optimality holds regardless of the shape of…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-20 Jesmin Jahan Tithi, Yonatan Fogel, Rezaul Chowdhury

We present efficient and scalable parallel algorithms for performing mathematical operations for low-rank tensors represented in the tensor train (TT) format. We consider algorithms for addition, elementwise multiplication, computing norms…

Numerical Analysis · Mathematics 2021-09-08 Hussam Al Daas, Grey Ballard, Peter Benner

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these…

Mathematical Software · Computer Science 2008-06-12 Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack Dongarra

We propose a parallel algorithm for local, on the fly, model checking of a fragment of CTL that is well-suited for modern, multi-core architectures. This model-checking algorithm takes benefit from a parallel state space construction…

Logic in Computer Science · Computer Science 2013-02-01 Rodrigo Tacla Saad, Silvano Dal Zilio, Bernard Berthomieu

In order to fully utilize "big data", it is often required to use "big models". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions upfront on the nature of the…

Machine Learning · Statistics 2015-04-17 Vikas Sindhwani, Haim Avron

Shared memory programming models usually provide worksharing and task constructs. The former relies on the efficient fork-join execution model to exploit structured parallelism; while the latter relies on fine-grained synchronization among…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-08 M. Maronas, K. Sala, S. Mateo, E. Ayguadé, V. Beltran (Barcelona Supercomputing Center)

Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories -- based on the data structures which…

Data Structures and Algorithms · Computer Science 2010-03-10 Vamsi Kundeti, Sanguthevar Rajasekaran, Hieu Dinh