English
Related papers

Related papers: Adaptive Work-Efficient Connected Components on th…

200 papers

We present a new adaptive parallel algorithm for the challenging problem of multi-dimensional numerical integration on massively parallel architectures. Adaptive algorithms have demonstrated the best performance, but efficient many-core…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-24 Ioannis Sakiotis , Kamesh Arumugam , Marc Paterno , Desh Ranjan , Balša Terzić , Mohammad Zubair

Connected components and spanning forest are fundamental graph algorithms due to their use in many important applications, such as graph clustering and image segmentation. GPUs are an ideal platform for graph algorithms due to their high…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-28 Changwan Hong , Laxman Dhulipala , Julian Shun

We discuss an implementation of adaptive fast multipole methods targeting hybrid multicore CPU- and GPU-systems. From previous experiences with the computational profile of our version of the fast multipole algorithm, suitable parts are…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-09-03 Marcus Holm , Stefan Engblom , Anders Goude , Sverker Holmgren

Adaptive finite elements combined with geometric multigrid solvers are one of the most efficient numerical methods for problems such as the instationary Navier-Stokes equations. Yet despite their efficiency, computations remain expensive…

Numerical Analysis · Mathematics 2025-12-23 Manuel Liebchen , Robert Jendersie , Utku Kaya , Christian Lessig , Thomas Richter

We propose a GPU-accelerated distributed optimization algorithm for controlling multi-phase optimal power flow in active distribution systems with dynamically changing topologies. To handle varying network configurations and enable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-15 Minseok Ryu , Geunyeong Byeon , Kibaek Kim

A technique used to accelerate an adaptive optics simulation platform using reconfigurable logic is described. The performance of parts of this simulation have been improved by up to 600 times (reducing computation times by this factor) by…

Astrophysics · Physics 2009-11-11 Alastair Basden

In this paper, we report an optimized union-find (UF) algorithm that can label the connected components on a 2D image efficiently by employing the GPU architecture. The proposed method contains three phases: UF-based local merge, boundary…

Computer Vision and Pattern Recognition · Computer Science 2017-08-30 Jun Chen , Qiang Yao , Houari Sabirin , Keisuke Nonaka , Hiroshi Sankoh , Sei Naito

Asynchronous tasks, when created with over-decomposition, enable automatic computation-communication overlap which can substantially improve performance and scalability. This is not only applicable to traditional CPU-based systems, but also…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-23 Jaemin Choi , David F. Richards , Laxmikant V. Kale

On an evolving graph that is continuously updated by a high-velocity stream of edges, how can one efficiently maintain if two vertices are connected? This is the connectivity problem, a fundamental and widely studied problem on graphs. We…

Data Structures and Algorithms · Computer Science 2016-02-18 Natcha Simsiri , Kanat Tangwongsan , Srikanta Tirthapura , Kun-Lung Wu

5G wireless technology can deliver higher data speeds, ultra low latency, more reliability, massive network capacity, increased availability, and a more uniform user experience to users. It brings additional power to help address the…

Systems and Control · Electrical Eng. & Systems 2022-11-02 Yousu Chen , Liwei Wang , Xiaoyuan Fan , Dexin Wang , James Ogle

Evolutionary computing (EC) has proven to be effective in solving complex optimization and robotics problems. Unfortunately, typical Evolutionary Algorithms (EAs) are constrained by the computational capacity available to researchers. More…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-19 Rustam Eynaliyev , Houcen Liu

The performance of graph programs depends highly on the algorithm, the size and structure of the input graphs, as well as the features of the underlying hardware. No single set of optimizations or one hardware platform works well across all…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-11 Ajay Brahmakshatriya , Yunming Zhang , Changwan Hong , Shoaib Kamil , Julian Shun , Saman Amarasinghe

Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to dramatic increase in their power consumption. This paper surveys research works on analyzing and…

Hardware Architecture · Computer Science 2014-04-21 Sparsh Mittal , Jeffrey S. Vetter

Maintaining computational load balance is important to the performant behavior of codes which operate under a distributed computing model. This is especially true for GPU architectures, which can suffer from memory oversubscription if…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-05 Michael E. Rowan , Axel Huebl , Kevin N. Gott , Jack Deslippe , Maxence Thévenet , Remi Lehe , Jean-Luc Vay

Adaptive Computation (AC) has been shown to be effective in improving the efficiency of Open-Domain Question Answering (ODQA) systems. However, current AC approaches require tuning of all model parameters, and training state-of-the-art ODQA…

Computation and Language · Computer Science 2021-07-06 Yuxiang Wu , Pasquale Minervini , Pontus Stenetorp , Sebastian Riedel

The edge computing paradigm has emerged to handle cloud computing issues such as scalability, security and low response time among others. This new computing trend heavily relies on ubiquitous embedded systems on the edge. Performance and…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-28 Mohammad Hosseinabady , Mohd Amiruddin Bin Zainol , Jose Nunez-Yanez

Improving Transformer efficiency has become increasingly attractive recently. A wide range of methods has been proposed, e.g., pruning, quantization, new architectures and etc. But these methods are either sophisticated in implementation or…

Machine Learning · Computer Science 2021-09-10 Ye Lin , Yanyang Li , Tong Xiao , Jingbo Zhu

The Preconditioned Conjugate Gradient (PCG) method is widely used for solving linear systems of equations with sparse matrices. A recent version of PCG, Pipelined PCG, eliminates the dependencies in the computations of the PCG algorithm so…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-14 Manasi Tiwari , Sathish Vadhiyar

We carry out a comparative performance study of multi-core CPUs, GPUs and Intel Xeon Phi (Many Integrated Core - MIC) with a microscopy image analysis application. We experimentally evaluate the performance of computing devices on core…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-05-15 George Teodoro , Tahsin Kurc , Guilherme Andrade , Jun Kong , Renato Ferreira , Joel Saltz

Parallel combinations of adaptive filters have been effectively used to improve the performance of adaptive algorithms and address well-known trade-offs, such as convergence rate vs. steady-state error. Nevertheless, typical combinations…

Optimization and Control · Mathematics 2017-11-21 Luiz F. O. Chamon , Cassio G. Lopes
‹ Prev 1 2 3 10 Next ›