Related papers: Software Alchemy: Turning Complex Statistical Comp…
This thesis introduces PEMS2, an improvement to PEMS (Parallel External Memory System). PEMS executes Bulk-Synchronous Parallel (BSP) algorithms in an External Memory (EM) context, enabling computation with very large data sets which exceed…
Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors with potentially thousands of processors per chip. Despite this, many…
We consider the problem of selecting the best variable-value strategy for solving a given problem in constraint programming. We show that the recent Embarrassingly Parallel Search method (EPS) can be used for this purpose. EPS proposes to…
In this paper, we explore the limits of graphics processors (GPUs) for general purpose parallel computing by studying problems that require highly irregular data access patterns: parallel graph algorithms for list ranking and connected…
Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…
In this paper, we consider an approach to the parallelizing of the algorithms realizing the modified probability changigng method with adaptation and partial rollback procedure for constrained pseudo-Boolean optimization problems. Existing…
Evolutionary algorithms (EAs) are increasingly implemented on graphics processing units (GPUs) to leverage parallel processing capabilities for enhanced efficiency. However, existing studies largely emphasize the raw speedup obtained by…
To harness the full benefit of new computing platforms, it is necessary to develop software with parallel computing capabilities. This is no less true for statisticians than for astrophysicists. The R programming language, which is perhaps…
This paper discusses the potential of graphics processing units (GPUs) in high-dimensional optimization problems. A single GPU card with hundreds of arithmetic cores can be inserted in a personal computer and dramatically accelerates many…
Bayesian optimization has emerged as a strong candidate tool for global optimization of functions with expensive evaluation costs. However, due to the dynamic nature of research in Bayesian approaches, and the evolution of computing…
Cycles are one of the fundamental subgraph patterns and being able to enumerate them in graphs enables important applications in a wide variety of fields, including finance, biology, chemistry, and network science. However, to enable cycle…
While deep learning excels in natural image and language processing, its application to high-dimensional data faces computational challenges due to the dimensionality curse. Current large-scale data tools focus on business-oriented…
Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…
Computation of a signal's estimated covariance matrix is an important building block in signal processing, e.g., for spectral estimation. Each matrix element is a sum of products of elements in the input matrix taken over a sliding window.…
Hardware double precision is often insufficient to solve large scientific problems accurately. Computing in higher precision defined by software causes significant computational overhead. The application of parallel algorithms compensates…
More and more large data collections are gathered worldwide in various IT systems. Many of them possess the networked nature and need to be processed and analysed as graph structures. Due to their size they require very often usage of…
With the growing complexity and capability of contemporary robotic systems, the necessity of sophisticated computing solutions to efficiently handle tasks such as real-time processing, sensor integration, decision-making, and control…
Graphics processing units (GPU) had evolved from a specialized hardware capable to render high quality graphics in games to a commodity hardware for effective processing blocks of data in a parallel schema. This evolution is particularly…
Training Gaussian process-based models typically involves an $ O(N^3)$ computational bottleneck due to inverting the covariance matrix. Popular methods for overcoming this matrix inversion problem cannot adequately model all types of latent…
Parallel batched data structures are designed to process synchronized batches of operations in a parallel computing model. In this paper, we propose parallel combining, a technique that implements a concurrent data structure from a parallel…