Related papers: Efficient Methods and Parallel Execution for Algor…
Efficient execution of parameter sensitivity analysis (SA) is critical to allow for its routinely use. The pathology image processing application investigated in this work processes high-resolution whole-slide cancer tissue images from…
With the increasingly availability of digital microscopy imagery equipments there is a demand for efficient execution of whole slide tissue image applications. Through the process of sensitivity analysis it is possible to improve the output…
We investigate and characterize the performance of an important class of operations on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high…
We propose, implement, and experimentally evaluate a runtime middleware to support high-throughput execution on hybrid cluster machines of large-scale analysis applications. A hybrid cluster machine consists of computation nodes which have…
Parallel computing is very important to accelerate the performance of software systems. Additionally, considering that a recurring challenge is to process high data volumes continuously, stream processing emerged as a paradigm and software…
Despite the possibility to quickly compute reachable sets of large-scale linear systems, current methods are not yet widely applied by practitioners. The main reason for this is probably that current approaches are not push-button-capable…
Optimizing the performance of computational fluid dynamics (CFD) applications accelerated by graphics processing units (GPUs) is crucial for efficient simulations. In this study, we employed a machine learning-based autotuning technique to…
Virtual screening applications are highly parameterized to optimize the balance between quality and execution performance. While output quality is critical, the entire screening process must be completed within a reasonable time. In fact, a…
We carry out a comparative performance study of multi-core CPUs, GPUs and Intel Xeon Phi (Many Integrated Core - MIC) with a microscopy image analysis application. We experimentally evaluate the performance of computing devices on core…
This article presents an automatic approach to quickly derive a good solution for hardware resource partition and task granularity for task-based parallel applications on heterogeneous many-core architectures. Our approach employs a…
The sensitivity of parameters in computational science problems is difficult to assess, especially for algorithms with multiple input parameters and diverse outputs. This work seeks to explore sensitivity analysis in the visualization…
The rise of deep learning has marked significant progress in fields such as computer vision, natural language processing, and medical imaging, primarily through the adaptation of pre-trained models for specific tasks. Traditional…
We propose a software platform that integrates methods and tools for multi-objective parameter auto- tuning in tissue image segmentation workflows. The goal of our work is to provide an approach for improving the accuracy of nucleus/cell…
Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often…
Evaluating LLMs and text-to-image models is a computationally intensive task often overlooked. Efficient evaluation is crucial for understanding the diverse capabilities of these models and enabling comparisons across a growing number of…
Supervised fine-tuning (SFT) is a pivotal approach to adapting large language models (LLMs) for downstream tasks; however, performance often suffers from the ``seesaw phenomenon'', where indiscriminate parameter updates yield progress on…
Many-core accelerators, as represented by the XeonPhi coprocessors and GPGPUs, allow software to exploit spatial and temporal sharing of computing resources to improve the overall system performance. To unlock this performance potential…
Performance tools for emerging heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of large-scale executions may record mountains of performance data. Second,…
Autotuning of performance-relevant source-code parameters allows to automatically tune applications without hard coding optimizations and thus helps with keeping the performance portable. In this paper, we introduce a benchmark set of ten…
While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models.…