Related papers: Getting More From Your Multicore: Exploiting OpenM…

200 papers

Motivated by the emergence of multicore architectures, and the reality that parallelism is rarely used for analysis in observational astronomy, we demonstrate how general users may employ tightly-coupled multiprocessors in scriptable…

Astrophysics · Physics 2007-10-23 Michael S. Noble

In advancing parallel programming, particularly with OpenMP, the shift towards NLP-based methods marks a significant innovation beyond traditional source-to-source (S2S) tools like Autopar and Cetus. These NLP approaches train on extensive datasets of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-07 Weidong Wang, Haoran Zhu

With the slowing of Moore's Law, heterogeneous computing platforms such as Field Programmable Gate Arrays (FPGAs) have gained increasing interest for accelerating HPC workloads. In this work we present, to the best of our knowledge, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-13 Gabriel Rodriguez-Canal, David Katz, Nick Brown

Python demonstrates lower performance in comparison to traditional high performance computing (HPC) languages such as C, C++, and Fortran. This performance gap is largely due to Python's interpreted nature and the Global Interpreter Lock…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-16 César Piñeiro, Juan C. Pichel

In this paper, we present OMP2MPI, a tool that automatically generates MPI source code from OpenMP. With this transformation, the original program can be adapted to exploit a larger number of processors by surpassing the limits of…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-12 Albert Saa-Garriga, David Castells-Rufas, Jordi Carrabina

RISC-V allows for building general-purpose computing platforms with programmable accelerators around a single open-source ISA. However, leveraging heterogeneous SoCs within high-level applications is a tedious task. In this preliminary…

Hardware Architecture · Computer Science 2025-04-08 Cyril Koenig, Enrico Zelioli, Frank K. Gürkaynak, Luca Benini

OpenMP is a cross-platform API that extends C, C++ and Fortran and provides a shared-memory parallelism platform for those languages. The use of many cores and HPC technologies for scientific computing has spread since the 1990s, and now…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-25 Gal Oren, Yehuda Ganan, Guy Malamud

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming…

In past years, the world has switched to many-core and multi-core shared memory architectures. As a result, there is a growing need to utilize these architectures by introducing shared memory parallelization schemes to software…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-15 Re'em Harel, Yuval Pinter, Gal Oren

Graphs model several real-world phenomena. With the growth of unstructured and semi-structured data, parallelization of graph algorithms is inevitable. Unfortunately, due to inherent irregularity of computation, memory access, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-08 Nibedita Behera, Ashwina Kumar, Ebenezer Rajadurai T, Sai Nitish, Rajesh Pandian M, Rupesh Nasre

Recent research in information extraction (IE) focuses on utilizing code-style inputs to enhance structured output generation. The intuition behind this is that the programming languages (PLs) inherently exhibit greater structural…

Computation and Language · Computer Science 2025-05-23 Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang

With multi-core processors a ubiquitous building block of modern supercomputers, it is now past time to enable applications to embrace these developments in processor design. To achieve exascale performance, applications will need ways of…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-08-13 Michele Weiland, Lawrence Mitchell, Gerard Gorman, Stephan Kramer, Mark Parsons, James Southern

We discuss the use of both MPI and OpenMP in the teaching of senior undergraduate and junior graduate classes in parallel programming. We briefly introduce the OpenMP standard and discuss why we have chosen to use it in parallel programming…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Yi Pan

Spoken Language Understanding infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. However, publicly available SLU resources are limited. In this…

Computation and Language · Computer Science 2020-11-30 Emanuele Bastianelli, Andrea Vanzo, Pawel Swietojanski, Verena Rieser

OpenMP is the de facto API for parallel programming in HPC applications. These programs are often computed in data centers, where energy consumption is a major issue. Whereas previous work has focused almost entirely on performance, we here…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-12 Henrik Valter, Axel Karlsson, Miquel Pericàs

A trend in high performance computers that is becoming increasingly popular is the use of symmetric multiprocessing (SMP) rather than the older paradigm of MPP. MPI codes that ran and scaled well on MPP machines can often be run on an SMP…

High Energy Physics - Lattice · Physics 2009-10-31 Steven Gottlieb, Sonali Tamhankar

Large language models (LLMs) such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama,…

Software Engineering · Computer Science 2024-11-08 Le Chen, Arijit Bhattacharjee, Nesreen Ahmed, Niranjan Hasabnis, Gal Oren, Vy Vo, Ali Jannesari

Large Language Models (LLMs) show strong abilities in code generation, but their skill in creating efficient parallel programs is less studied. This paper explores how LLMs generate task-based parallel code from three kinds of input prompts:…

Programming Languages · Computer Science 2026-02-27 Linus Bantel, Moritz Strack, Alexander Strack, Dirk Pflüger

Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib…

VSIPL and OpenMP are two open standards for portable high performance computing. VSIPL delivers optimized single processor performance while OpenMP provides a low overhead mechanism for executing thread based parallelism on shared memory…

Astrophysics · Physics 2015-05-26 Jeremy Kepner