Related papers: Vector-Processing for Mobile Devices: Benchmark an…

200 papers

The need to train DNN models on end-user devices (e.g., smartphones) is increasing with the need to improve data privacy and reduce communication overheads. Unlike datacenter servers with powerful CPUs and GPUs, modern smartphones consist…

Machine Learning · Computer Science 2022-06-13 Sanjay Sri Vallabh Singapuram, Fan Lai, Chuheng Hu, Mosharaf Chowdhury

Measurements of absolute runtime are useful as a summary of performance when studying parallel visualization and analysis methods on computational platforms of increasing concurrency and complexity. We can obtain even more insights by…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-07 E. Wes Bethel, David Camp, Talita Perciano, Colleen Heinemann

Recent trends in the HPC field have introduced new CPU architectures with improved vectorization capabilities that require optimization to achieve peak performance and thus pose challenges for performance portability. The deployment of…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-17 Gianmarco Accordi, Jens Domke, Theresa Pollinger, Davide Gadioli, Gianluca Palermo

Vector processor architectures offer an efficient solution for accelerating data-parallel workloads (e.g., ML, AI), reducing instruction count, and enhancing processing efficiency. This is evidenced by the increasing adoption of vector…

Hardware Architecture · Computer Science 2025-04-15 Matteo Perotti, Vincenzo Maisto, Moritz Imfeld, Nils Wistoff, Alessandro Cilardo, Luca Benini

Vectorization is a compiler optimization that replaces multiple operations on scalar values with a single operation on vector values. Although common in traditional compilers such as rustc, clang, and gcc, vectorization is not common in the…

We present a graph processing benchmark suite with the goal of helping to standardize graph processing evaluations. Fewer differences between graph processing evaluations will make it easier to compare different research efforts and…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-18 Scott Beamer, Krste Asanović, David Patterson

We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN). While the success mainly stems from the large volume of training data and the deep…

Computer Vision and Pattern Recognition · Computer Science 2015-01-30 Jimmy SJ. Ren, Li Xu

As the size of artificial intelligence and machine learning (AI/ML) models and datasets grows, the memory bandwidth becomes a critical bottleneck. The paper presents a novel extended memory hierarchy that addresses some major memory…

Hardware Architecture · Computer Science 2025-05-20 Jordi Altayo, Paul Delestrac, David Novo, Simey Yang, Debjyoti Bhattacharjee, Francky Catthoor

The increasing use of heterogeneous embedded systems with multi-core CPUs and Graphics Processing Units (GPUs) presents important challenges in effectively exploiting pipeline, task and data-level parallelism to meet throughput requirements…

Signal Processing · Electrical Eng. & Systems 2017-12-01 Shuoxin Lin, Jiahao Wu, Shuvra S. Bhattacharyya

A Web browser utilizes a device's CPU to parse HTML, build a Document Object Model, a Cascading Style Sheets Object Model, and render trees, and parse, compile, and execute computationally-heavy JavaScript. A powerful CPU is required to…

Performance · Computer Science 2020-03-17 Utkarsh Goel, Stephen Ludin, Moritz Steiner

The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-16 Pratyush Agnihotri, Boris Koldehofe, Roman Heinrich, Carsten Binnig, Manisha Luthra

Mozilla Research is developing Servo, a parallel web browser engine, to exploit the benefits of parallelism and concurrency in the web rendering pipeline. Parallelization results in improved performance for pinterest.com but not for…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-11 Rohit Zambre, Lars Bergstrom, Laleh Aghababaie Beni, Aparna Chandramowliswharan

Vector search (VS) is now available in most database engines. However, while vector search is a common feature in AI/ML/LLMs where the dominant computing platforms are GPUs, existing database engines operate on CPUs even when implementing…

Databases · Computer Science 2026-05-18 Vasilis Mageirakos, Joel André, Marko Kabić, Bowen Wu, Yannis Chronis, Gustavo Alonso

Modern processor architectures, in addition to having still more cores, also require still more consideration to memory-layout in order to run at full capacity. The usefulness of most languages is deprecating as their abstractions,…

Programming Languages · Computer Science 2013-03-26 Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, Brian Vinter

Last several years, GPUs are used to accelerate computations in many computer science domains. We focused on GPU accelerated Support Vector Machines (SVM) training with non-linear kernel functions. We had searched for all available GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-21 Jan Vanek, Josef Michalek, Josef Psutka

Accumulation of corporate data in the cloud has attracted more enterprise applications to the cloud creating data gravity. As a consequence, network traffic has become more cloud centric. This increase in cloud centric traffic poses new…

Machine Learning · Computer Science 2022-10-05 Mujahid Sultan

In this work, we present a new benchmarking suite with new real-life inspired skewed workloads to test the performance of concurrent index data structures. We started this project to prepare workloads specifically for self-adjusting data…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-19 Vitaly Aksenov, Dmitry Ivanov, Ravil Galiev

Graph-structured data is prevalent in domains such as social networks, financial transactions, brain networks, and protein interactions. As a result, the research community has produced new databases and analytics engines to process such…

Databases · Computer Science 2024-04-02 Puneet Mehrotra, Vaastav Anand, Daniel Margo, Milad Rezaei Hajidehi, Margo Seltzer

This paper is focused on evaluating the effect of some different techniques in machine learning speed-up, including vector caches, parallel execution, and so on. The following content will include some review of the previous approaches and…

Machine Learning · Computer Science 2021-01-12 Zeyu Ning, Hugues Nelson Iradukunda, Qingquan Zhang, Ting Zhu

The use of neural networks in edge devices is increasing, which introduces new security challenges related to the neural networks' confidentiality. As edge devices often offer physical access, attacks targeting the hardware, such as…

Cryptography and Security · Computer Science 2026-02-06 Manuel Brosch, Matthias Probst, Stefan Kögler, Georg Sigl