Related papers: cuPC: CUDA-based Parallel PC Algorithm for Causal …
The PC algorithm is the state-of-the-art algorithm for causal structure discovery on observational data. It can be computationally expensive in the worst case due to the conditional independence tests are performed in an…
Discovering causal relationships from observational data is a crucial problem and it has applications in many research areas. The PC algorithm is the state-of-the-art constraint based method for causal discovery. However, runtime of the PC…
Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from…
Principal component analysis (PCA) is a statistical technique commonly used in multivariate data analysis. However, PCA can be difficult to interpret and explain since the principal components (PCs) are linear combinations of the original…
The main objective of this work consists in analyzing sub-structuring method for the parallel solution of sparse linear systems with matrices arising from the discretization of partial differential equations such as finite element, finite…
Discovering causal relationships from data is the ultimate goal of many research areas. Constraint based causal exploration algorithms, such as PC, FCI, RFCI, PC-simple, IDA and Joint-IDA have achieved significant progress and have many…
Analysis of processing time and similarity of images generated between CPU and GPU architectures and sequential and parallel programming. For image processing a computer with AMD FX-8350 processor and an Nvidia GTX 960 Maxwell GPU was used,…
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core…
Large-scale GPU traces play a critical role in identifying performance bottlenecks within heterogeneous High-Performance Computing (HPC) architectures. However, the sheer volume and complexity of a single trace of data make performance…
Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…
We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for…
Predictive Coding (PC) offers a brain-inspired alternative to backpropagation for neural network training, described as a physical system minimizing its internal energy. However, in practice, PC is predominantly digitally simulated,…
The Graphics Processing Unit (GPU) is a powerful tool for parallel computing. In the past years the performance and capabilities of GPUs have increased, and the Compute Unified Device Architecture (CUDA) - a parallel computing architecture…
The advent of high performance computing (HPC) and graphics processing units (GPU), present an enormous computation resource for Large data transactions (big data) that require parallel processing for robust and prompt data analysis. While…
To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. There are multiple techniques applied to accelerate the solving process in the…
The PC algorithm uses conditional independence tests for model selection in graphical modeling with acyclic directed graphs. In Gaussian models, tests of conditional independence are typically based on Pearson correlations, and…
Parallel algorithms on CPU and GPU are implemented for the Unified Gas-Kinetic Scheme and their performances are investigated and compared by a two dimensional channel flow case. The parallel CPU algorithm has a one dimensional block…
With the advent of high-performance computing techniques, the data for analysis has grown significantly. Here, graphic processing unit (GPU) based program kernels are discussed to exploit parallelism in the analysis codes specific to…
The Preconditioned Conjugate Gradient (PCG) method is widely used for solving linear systems of equations with sparse matrices. A recent version of PCG, Pipelined PCG, eliminates the dependencies in the computations of the PCG algorithm so…
The development of multicore architectures supporting parallel data processing has led to a paradigm shift, which affects communication systems significantly. This article provides a scalable parallel approach of an iterative LDPC decoder,…