Related papers: Equal bi-Vectorized (EBV) method to high performan…

200 papers

Matrix decompositions are ubiquitous in machine learning, including applications in dimensionality reduction, data compression and deep learning algorithms. Typical solutions for matrix decompositions have polynomial complexity which…

Machine Learning · Computer Science 2024-03-13 Łukasz Struski, Paweł Morkisz, Przemysław Spurek, Samuel Rodriguez Bernabeu, Tomasz Trzciński

Nowadays, several industrial applications are being ported to parallel architectures. These platforms deliver higher performance for system modelling and simulation. In the electric machines area, there are many problems which…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-10-25 Antonio Wendell De Oliveira Rodrigues, Frédéric Guyomarch, Yvonnick Le Menach, Jean-Luc Dekeyser

A fast algorithm for the approximation of a low rank LU decomposition is presented. In order to achieve a low complexity, the algorithm uses sparse random projections combined with FFT-based random projections. The asymptotic approximation…

Numerical Analysis · Mathematics 2016-01-19 Yariv Aizenbud, Gil Shabat, Amir Averbuch

We propose a GPU-accelerated distributed optimization algorithm for controlling multi-phase optimal power flow in active distribution systems with dynamically changing topologies. To handle varying network configurations and enable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-15 Minseok Ryu, Geunyeong Byeon, Kibaek Kim

This paper describes a parallel implementation of the Viterbi decoding algorithm. The Viterbi decoder is widely used in many state-of-the-art wireless systems. The proposed solution optimizes both throughput and memory usage by applying…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-19 Alireza Mohammadidoost, Matin Hashemi

This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of Domain Decomposition (DD) for general sparse linear systems. We focus on distributed memory parallel architectures,…

Numerical Analysis · Mathematics 2023-03-17 Tianshi Xu, Ruipeng Li, Daniel Osei-Kuffuor

A novel and scalable geometric multi-level algorithm is presented for the numerical solution of elliptic partial differential equations, specially designed to run with high occupancy of streaming processors inside Graphics Processing…

Mathematical Software · Computer Science 2017-03-22 J. T. Becerra-Sagredo, F. Mandujano, C. Malaga

We propose a GPU-based distributed optimization algorithm, aimed at controlling optimal power flow in multi-phase and unbalanced distribution systems. Typically, conventional distributed optimization algorithms employed in such scenarios…

Optimization and Control · Mathematics 2023-10-17 Minseok Ryu, Geunyeong Byeon, Kibaek Kim

We discuss an approach for solving sparse or dense banded linear systems ${\bf A} {\bf x} = {\bf b}$ on a Graphics Processing Unit (GPU) card. The matrix ${\bf A} \in {\mathbb{R}}^{N \times N}$ is possibly nonsymmetric and moderately large;…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-09-29 Ang Li, Radu Serban, Dan Negrut

Bilevel optimization has been widely used in decision-making processes. However, an efficient algorithm for determining an optimal solution of a bilevel optimization problem is still lacking, especially for large problems. To bridge the…

Optimization and Control · Mathematics 2016-05-18 Xuan Liu, Zuyi Li

The singular value decomposition (SVD) is a powerful tool in modern numerical linear algebra, which underpins computational methods such as principal component analysis (PCA), low-rank approximations, and randomized algorithms. Many…

Mathematical Software · Computer Science 2026-04-10 Ahmad Abdelfattah, Massimiliano Fasi

We present a fast randomized algorithm that computes a low rank LU decomposition. Our algorithm uses random-projection techniques to efficiently compute a low rank approximation of large matrices. The randomized LU algorithm can be…

Numerical Analysis · Mathematics 2016-02-02 Gil Shabat, Yaniv Shmueli, Yariv Aizenbud, Amir Averbuch

LU factorization for sparse matrices is the most important computing step for many engineering and scientific computing problems such as circuit simulation. But parallelizing LU factorization with the Graphic Processing Units (GPU) still…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-14 Shaoyi Peng, Sheldon X.-D. Tan

Many research works have implemented the Viterbi decoding algorithm on GPU instead of FPGA because this platform provides considerable flexibility in addition to great performance. Recently, the recently-introduced…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-30 Alireza Mohammadidoost, Matin Hashemi

In this paper, we investigate GPU based parallel triangular solvers systematically. The parallel triangular solvers are fundamental to incomplete LU factorization family preconditioners and algebraic multigrid solvers. We develop a new…

Mathematical Software · Computer Science 2016-06-03 Zhangxin Chen, Hui Liu, Bo Yang

The prediction of a dielectric breakdown in a high-voltage device is based on criteria that evaluate the electric field along field lines. Therefore it is necessary to efficiently compute the electric field at arbitrary points in space. A…

Numerical Analysis · Mathematics 2020-11-03 Cedric Münger, Steffen Börm, Jörg Ostrowski

Hierarchical low-rank approximation of dense matrices can reduce the complexity of their factorization from $O(N^3)$ to $O(N)$. However, the complex structure of such hierarchical matrices makes them difficult to parallelize. The block size and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-05 Qianxiang Ma, Rio Yokota

We present a recursive way to partition hypergraphs which creates and exploits hypergraph geometry and is suitable for many-core parallel architectures. Such partitionings are then used to bring sparse matrices in a recursive Bordered Block…

Data Structures and Algorithms · Computer Science 2011-05-24 B. O. Fagginger Auer, R. H. Bisseling

In this paper, we propose an efficient parallelization strategy for boundary element method (BEM) solvers that perform the electromagnetic analysis of structures with lossy conductors. The proposed solver is accelerated with the adaptive…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-30 Damian Marek, Shashwat Sharma, Piero Triverio

Singular Value Decomposition (SVD) is a fundamental matrix factorization technique in linear algebra, widely applied in numerous matrix-related problems. However, traditional SVD approaches are hindered by slow panel factorization and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-18 Shifang Liu, Huiyuan Li, Hongjiao Sheng, Haoyuan Gui, Xiaoyu Zhang