Related papers: Accelerating Domain Propagation: an Efficient GPU-…

200 papers

Eulerian nonlinear uncertainty propagation methods often suffer from finite domain limitations and computational inefficiencies. A recent approach to this class of algorithm, Grid-based Bayesian Estimation Exploiting Sparsity, addresses the…

Chaotic Dynamics · Physics 2025-08-20 Benjamin L. Hanson, Carlos Rubio, Adrián García-Gutiérrez, Thomas Bewley

Several methods for density matrix propagation in distributed computing environments, such as clusters and graphics processing units, are proposed and evaluated. It is demonstrated that the large communication overhead associated with each…

Chemical Physics · Physics 2014-07-16 Luke J. Edwards, Ilya Kuprov

Generalized sparse matrix-matrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-09 Aydın Buluç, John R. Gilbert

This work proposes a new GPU thread map for $m$-simplex domains, that scales its speedup with dimension and is energy efficient compared to other state of the art approaches. The main contributions of this work are i) the formulation of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-13 Cristóbal A. Navarro, Felipe A. Quezada, Benjamin Bustos, Nancy Hitschfeld, Rolando Kindelan

GPUs have significantly accelerated first-order methods for large-scale optimization, especially in continuous optimization. However, this success has not transferred cleanly to problems with discrete variables, combinatorial structure, and…

Machine Learning · Computer Science 2026-05-22 Jiachang Liu, Andrea Lodi

We present distributed algorithms for training dynamic Graph Neural Networks (GNN) on large scale graphs spanning multi-node, multi-GPU systems. To the best of our knowledge, this is the first scaling study on dynamic GNN. We devise…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-17 Venkatesan T. Chakaravarthy, Shivmaran S. Pandian, Saurabh Raje, Yogish Sabharwal, Toyotaro Suzumura, Shashanka Ubaru

Large scale-free graphs are famously difficult to process efficiently: the skewed vertex degree distribution makes it difficult to obtain balanced partitioning. Our research instead aims to turn this into an advantage by partitioning the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-05 Scott Sallinen, Abdullah Gharaibeh, Matei Ripeanu

Constraint Programming developed within Logic Programming in the Eighties; nowadays all Prolog systems encompass modules capable of handling constraint programming on finite domains demanding their solution to a constraint solver. This work…

Artificial Intelligence · Computer Science 2026-01-14 Enrico Santi, Fabio Tardivo, Agostino Dovier, Andrea Formisano

Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-04 Lingda Li, Ari B. Hayes, Stephen A. Hackler, Eddy Z. Zhang, Mario Szegedy, Shuaiwen Leon Song

The focus of my PhD thesis is on exploring parallel approaches to efficiently solve problems modeled by constraints and presenting a new proposal. Current solvers are very advanced; they are carefully designed to effectively manage the…

Artificial Intelligence · Computer Science 2019-09-23 Fabio Tardivo

This work focuses on accelerating the multiplication of a dense random matrix with a (fixed) sparse matrix, which is frequently used in sketching algorithms. We develop a novel scheme that takes advantage of blocking and recomputation…

Computational Engineering, Finance, and Science · Computer Science 2024-05-14 Tianyu Liang, Riley Murray, Aydın Buluç, James Demmel

In this paper, we use graphics processing units (GPU) to accelerate sparse and arbitrary structured neural networks. Sparse networks have nodes in the network that are not fully connected with nodes in preceding and following layers, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-12 Aavaas Gajurel, Sushil J. Louis, Frederick C Harris

Diffusion models have achieved great success in synthesizing high-quality images. However, generating high-resolution images with diffusion models is still challenging due to the enormous computational costs, resulting in a prohibitive…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Muyang Li, Tianle Cai, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, Song Han

This paper presents a novel selective constraint propagation method for constrained image segmentation. In the literature, many pairwise constraint propagation methods have been developed to exploit pairwise constraints for cluster…

Computer Vision and Pattern Recognition · Computer Science 2015-02-06 Peng Han

We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion.…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-13 Carl Yang, Aydin Buluc, John D. Owens

Real-time trajectory optimization for nonlinear constrained autonomous systems is critical and typically performed by CPU-based sequential solvers. Specifically, reliance on global sparse linear algebra or the serial nature of dynamic…

Robotics · Computer Science 2026-03-13 Yilin Zou, Zhong Zhang, Maxime Robic, Fanghua Jiang

We propose a GPU-accelerated distributed optimization algorithm for controlling multi-phase optimal power flow in active distribution systems with dynamically changing topologies. To handle varying network configurations and enable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-15 Minseok Ryu, Geunyeong Byeon, Kibaek Kim

High level programming languages and GPU accelerators are powerful enablers for a wide range of applications. Achieving scalable vertical (within a compute node), horizontal (across compute nodes), and temporal (over different generations…

There is a stage in the GPU computing pipeline where a grid of thread-blocks, in parallel space, is mapped onto the problem domain, in data space. Since the parallel space is restricted to a box type geometry, the mapping…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-09-07 Cristóbal A. Navarro, Benjamín Bustos, Nancy Hitschfeld

There is a stage in the GPU computing pipeline where a grid of thread-blocks is mapped to the problem domain. Normally, this grid is a k-dimensional bounding box that covers a k-dimensional problem no matter its shape. Threads that fall…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-08-27 Cristobal A. Navarro, Nancy Hitschfeld