Related papers: HONEI: A collection of libraries for numerical com…

200 papers

In this era of diverse and heterogeneous computer architectures, the programmability issues, such as productivity and portable efficiency, are crucial to software development and algorithm design. One way to approach the problem is to step…

Mathematical Software · Computer Science 2012-07-10 Mauro Bianco, Ugo Varetto

In this work, we propose an open-source, first-of-its-kind, arithmetic hardware library with a focus on accelerating the arithmetic operations involved in Ring Learning with Error (RLWE)-based somewhat homomorphic encryption (SHE). We…

Cryptography and Security · Computer Science 2020-07-06 Rashmi Agrawal, Lake Bu, Alan Ehret, Michel A. Kinsy

This work deals with the CPU-GPU heterogeneous code acceleration of a finite-volume CFD solver utilizing multiple CPUs and GPUs at the same time. First, a high-level description of the CFD solver called SENSEI, the discretization of SENSEI,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-30 Weicheng Xue, Hongyu Wang, Christopher J. Roy

Developing efficient hardware accelerators for mathematical kernels used in scientific applications and machine learning has traditionally been a labor-intensive task. These accelerators typically require low-level programming in Verilog or…

Hardware Architecture · Computer Science 2025-09-15 Doru Thom Popovici, Mario Vega, Angelos Ioannou, Fabien Chaix, Dania Mosuli, Blair Reasoner, Tan Nguyen, Xiaokun Yang, John Shalf

votess is a library for computing parallel 3D Voronoi tessellations on heterogeneous platforms, from CPUs and GPUs, to future accelerator architectures. To do so, it leverages the SYCL abstraction layer to achieve portability and…

Instrumentation and Methods for Astrophysics · Physics 2024-12-13 Samridh Dev Singh, Chris Byrohl, Dylan Nelson

In this paper we focus on the integration of high-performance numerical libraries in ab initio codes and the portability of performance and scalability. The target of our work is FLEUR, a software for electronic structure calculations…

Computational Engineering, Finance, and Science · Computer Science 2016-11-03 Diego Fabregat-Traver, Davor Davidović, Markus Höhnerbach, Edoardo Di Napoli

Existing GPU libraries often struggle to fully exploit the parallel resources and on-chip memory (SRAM) of GPUs when chaining multiple GPU functions as individual kernels. While Kernel Fusion (KF) techniques like Horizontal Fusion (HF) and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-09 Oscar Amoros, Albert Andaluz, Johnny Nunez, Antonio J. Pena

The introduction of Intel(R) Xeon Phi(TM) coprocessors opened up new possibilities in development of highly parallel applications. The familiarity and flexibility of the architecture together with compiler support integrated into the Intel…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-11-26 Jiri Dokulil, Enes Bajrovic, Siegfried Benkner, Sabri Pllana, Martin Sandrieser, Beverly Bachmayer

Binarized Neural Networks (BNNs) significantly reduce the computation and memory demands with binarized weights and activations compared to full-precision NNs. Executing a layer in a BNN on different devices of a heterogeneous…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-13 Leonard David Bereholschi, Ching-Chi Lin, Mikail Yayla, Jian-Jia Chen

It has been demonstrated that specialised architectures, such as FPGAs and AMD's AI Engines (AIEs), have the potential to deliver energy and performance advantages for scientific computing. Given the integration of AIEs into AMD's CPUs,…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-06 Nick Brown, Gabriel Rodriguez-Canal

The translation of linear algebra computations into efficient sequences of library calls is a non-trivial task that requires expertise in both linear algebra and high-performance computing. Almost all high-level languages and libraries for…

Mathematical Software · Computer Science 2020-01-01 Henrik Barthels, Christos Psarras, Paolo Bientinesi

High-performance computing systems are more and more often based on accelerators. Computing applications targeting those systems often follow a host-driven approach in which hosts offload almost all compute-intensive sections of the code…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-15 E. Calore, A. Gabbana, S. F. Schifano, R. Tripiccione

The recent influx of open scientific data has contributed to the transitioning of scientific computing from compute intensive to data intensive. Whereas many Big Data frameworks exist that minimize the cost of data transfers, few scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-06 Valérie Hayot-Sasson, Mathieu Dugré, Tristan Glatard

GPUs are now used for a wide range of problems within HPC. However, making efficient use of the computational power available with multiple GPUs is challenging. The main challenges in achieving good performance are memory layout, affecting…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-20 Robert Clucas, Philip Blakely, Nikolaos Nikiforakis

Today's world of scientific software for High Energy Physics (HEP) is powered by x86 code, while the future will be much more reliant on accelerators like GPUs and FPGAs. The portable parallelization strategies (PPS) project of the High…

This article presents an automatic approach to quickly derive a good solution for hardware resource partition and task granularity for task-based parallel applications on heterogeneous many-core architectures. Our approach employs a…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-10 Peng Zhang, Jianbin Fang, Canqun Yang, Chun Huang, Tao Tang, Zheng Wang

We introduce PennyLane's Lightning suite, a collection of high-performance state-vector simulators targeting CPU, GPU, and HPC-native architectures and workloads. Quantum applications such as QAOA, VQE, and synthetic workloads are…

Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes that allow computations to be applied directly on encrypted data without requiring a secret key. This enables novel application scenarios where a client can safely…

Machine Learning · Computer Science 2018-10-02 Roshan Dathathri, Olli Saarikivi, Hao Chen, Kim Laine, Kristin Lauter, Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz

We present Hawkeye, a system for analyzing and reproducing GPU-level arithmetic operations. Using our framework, anyone can re-execute on a CPU the exact matrix multiplication operations underlying a machine learning model training or…

Cryptography and Security · Computer Science 2026-05-19 Erez Badash, Dan Boneh, Ilan Komargodski, Megha Srivastava

We introduce HONEY, a new specialized programming language designed to facilitate the processing of multivariate, asynchronous and non-uniformly sampled symbolic and scalar time sequences. When compiled, a Honey program is transformed into…

Programming Languages · Computer Science 2016-09-13 Mathieu Guillame-Bert