Related papers: A Roofline Visualization Framework

200 papers

This paper presents a practical methodology for collecting performance data necessary to conduct hierarchical Roofline analysis on NVIDIA GPUs. It discusses the extension of the Empirical Roofline Toolkit for broader support of a range of…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-26 Charlene Yang, Yunsong Wang, Steven Farrell, Thorsten Kurth, Samuel Williams

This paper surveys a range of methods to collect necessary performance data on Intel CPUs and NVIDIA GPUs for hierarchical Roofline analysis. As of mid-2020, two vendor performance tools, Intel Advisor and NVIDIA Nsight Compute, have…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-06 Charlene Yang

In this short paper, we introduce the Ridgeline model, an extension of the Roofline model [4] for distributed systems. The Roofline model targets shared memory systems, bounding the performance of a kernel based on its operational…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-18 Fabio Checconi, Jesmin Jahan Tithi, Fabrizio Petrini

Deep learning applications are usually very compute-intensive and require a long run time for training and inference. This has been tackled by researchers from both hardware and software sides, and in this paper, we propose a Roofline-based…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-24 Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams

The rapidly growing importance of Machine Learning (ML) applications, coupled with their ever-increasing model size and inference energy footprint, has created a strong need for specialized ML hardware architectures. Numerous ML…

Hardware Architecture · Computer Science 2025-05-26 Marian Verhelst, Luca Benini, Naveen Verma

We present a bottleneck analysis tool for designing compute systems for autonomous Unmanned Aerial Vehicles (UAV). The tool provides insights by exploiting the fundamental relationships between various components in the autonomous UAV such…

Robotics · Computer Science 2022-06-16 Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Aleksandra Faust, Vijay Janapa Reddi

Energy consumption has been a matter of great concern in recent years, and developers need to take energy efficiency into account when they design algorithms. Their design needs to be energy-efficient and low-power while it tries to achieve…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-09-26 Millad Ghane, Jeff Larkin, Larry Shi, Sunita Chandrasekaran, Margaret S. Cheung

Software visualization seeks to represent software artifacts graphically in two or three dimensions, with the goal of enhancing comprehension, analysis, maintenance, and evolution of the source code. In this context, visualizations…

Software Engineering · Computer Science 2025-09-30 Anthony Savidis, Christos Vasilopoulos

As RISC-V architectures proliferate across embedded and high-performance domains, developers face persistent challenges in performance optimization due to fragmented tooling, immature hardware features, and platform-specific defects. This…

Performance · Computer Science 2025-07-31 Alexander Batashev

We introduce an early-phase bottleneck analysis and characterization model called the F-1 for designing computing systems that target autonomous Unmanned Aerial Vehicles (UAVs). The model provides insights by exploiting the fundamental…

A robust evaluation toolset has been designed for Naval Research Laboratory's Real-Time Ocean Forecasting System RELO with the purpose of facilitating an adaptive sampling strategy and providing more educated guidance for routing underwater…

Robotics · Computer Science 2024-09-24 Edward Holmberg

Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD architectures (CPU-GPU), which means moving away from the traditional CPU and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-11 Matthew Leinhauser, René Widera, Sergei Bastrakov, Alexander Debus, Michael Bussmann, Sunita Chandrasekaran

Peak performance metrics published by vendors often do not correspond to what can be achieved in practice. It is therefore of great interest to do extensive benchmarking on core applications and library routines. Since DGEMM is one of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-19 Jacob Odgård Tørring, Jan Christian Meyer, Anne C. Elster

This paper proposes a framework for developing forecasting models by streamlining the connections between core components of the developmental process. The proposed framework enables swift and robust integration of new datasets,…

Machine Learning · Computer Science 2023-04-14 Jonathan Hans Soeseno, Sergio González, Trista Pei-Chun Chen

Interactive data visualization is a major part of modern exploratory data analysis, with web-based technologies enabling a rich ecosystem of both specialized and general tools. However, current visualization tools often lack support for…

Human-Computer Interaction · Computer Science 2025-08-14 Jan Simson

We propose a novel and flexible roof modeling approach that can be used for constructing planar 3D polygon roof meshes. Our method uses a graph structure to encode roof topology and enforces the roof validity by optimizing a simple but…

Graphics · Computer Science 2021-09-17 Jing Ren, Biao Zhang, Bojian Wu, Jianqiang Huang, Lubin Fan, Maks Ovsjanikov, Peter Wonka

Processing large amounts of data to extract useful information is an essential task within companies. To help in this task, visualization techniques have been commonly used due to their capacity to present data in synthesized views, easier…

Other Computer Science · Computer Science 2011-02-24 Vincent Mahe, Salvador Martinez Perez, Guillaume Doux, Hugo Brunelière, Jordi Cabot

There are many science applications that require scalable task-level parallelism and support for flexible execution and coupling of ensembles of simulations. Most high-performance system software and middleware, however, are designed to…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-29 Vivekanandan Balasubramanian, Antons Treikalis, Ole Weidner, Shantenu Jha

The aim of this paper is to develop an approach to visualizations that benefits from distributed computing. Three schemes of process distribution are considered: parallel, pipeline, and expanding pipeline computations. Expanding pipeline…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Mark Burgin, Walter Karplus, Damon Liu

Any data analysis, especially of data sets that may change often or in real time, consists of at least three important synchronized components: i) figuring out what to infer (objectives), ii) analysis or computation of objectives, and…

Human-Computer Interaction · Computer Science 2021-06-11 Abhishek Santra, Kunal Samant, Endrit Memeti, Enamul Karim, Sharma Chakravarthy