Related papers: Benchmarking Processor Performance by Multi-Thread…

Performance Comparison for Scientific Computations on the Edge via Relative Performance

In a typical Internet-of-Things setting that involves scientific applications, a target computation can be evaluated in many different ways depending on the split of computations among various devices. On the one hand, different…

Performance · Computer Science 2022-08-09 Aravind Sankaran, Paolo Bientinesi

A Comparative Study of CPU Scheduling Algorithms

Developing CPU scheduling algorithms and understanding their impact in practice can be difficult and time consuming due to the need to modify and test operating system kernel code and measure the resulting performance on a consistent…

Operating Systems · Computer Science 2013-07-17 Neetu Goel, R. B. Garg

Benchmarking Machine Learning: How Fast Can Your Algorithms Go?

This paper is focused on evaluating the effect of some different techniques in machine learning speed-up, including vector caches, parallel execution, and so on. The following content will include some review of the previous approaches and…

Machine Learning · Computer Science 2021-01-12 Zeyu Ning, Hugues Nelson Iradukunda, Qingquan Zhang, Ting Zhu

Data Classification With Multiprocessing

Classification is one of the most important tasks in Machine Learning (ML) and with recent advancements in artificial intelligence (AI) it is important to find efficient ways to implement it. Generally, the choice of classification…

Machine Learning · Computer Science 2023-12-27 Anuja Dixit, Shreya Byreddy, Guanqun Song, Ting Zhu

A Survey From Distributed Machine Learning to Distributed Deep Learning

Artificial intelligence has made remarkable progress in handling complex tasks, thanks to advances in hardware acceleration and machine learning algorithms. However, to acquire more accurate outcomes and solve more complex issues,…

Machine Learning · Computer Science 2023-09-12 Mohammad Dehghani, Zahra Yazdanparast

Estimate The Efficiency Of Multiprocessor's Cash Memory Work Algorithms

Many computer systems for calculating the proper organization of memory are among the most critical issues. Using a tier cache memory (along with branching prediction) is an effective means of increasing modern multi-core processors'…

Networking and Internet Architecture · Computer Science 2021-05-21 Mohamed A. Hamada, Abdelrahman Abdallah

Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training

Deep learning has become widely used in complex AI applications. Yet, training a deep neural network (DNNs) model requires a considerable amount of calculations, long running time, and much energy. Nowadays, many-core AI accelerators (e.g.,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-12 Yuxin Wang, Qiang Wang, Shaohuai Shi, Xin He, Zhenheng Tang, Kaiyong Zhao, Xiaowen Chu

Evaluating the Performance of Machine Learning Algorithms in Financial Market Forecasting: A Comprehensive Survey

With increasing competition and pace in the financial markets, robust forecasting methods are becoming more and more valuable to investors. While machine learning algorithms offer a proven way of modeling non-linearities in time series,…

Computational Finance · Quantitative Finance 2019-07-09 Lukas Ryll, Sebastian Seidens

Machine Learning and CPU (Central Processing Unit) Scheduling Co-Optimization over a Network of Computing Centers

In the rapidly evolving research on artificial intelligence (AI) the demand for fast, computationally efficient, and scalable solutions has increased in recent years. The problem of optimizing the computing resources for distributed machine…

Machine Learning · Computer Science 2025-10-30 Mohammadreza Doostmohammadian, Zulfiya R. Gabidullina, Hamid R. Rabiee

Distributed Training and Optimization Of Neural Networks

Deep learning models are yielding increasingly better performances thanks to multiple factors. To be successful, model may have large number of parameters or complex architectures and be trained on large dataset. This leads to large…

Machine Learning · Computer Science 2022-12-20 Jean-Roch Vlimant, Junqi Yin

Performance Evaluation and Comparison of a New Regression Algorithm

In recent years, Machine Learning algorithms, in particular supervised learning techniques, have been shown to be very effective in solving regression problems. We compare the performance of a newly proposed regression algorithm against…

Machine Learning · Computer Science 2023-06-16 Sabina Gooljar, Kris Manohar, Patrick Hosein

A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks

The field of deep learning has witnessed a remarkable shift towards extremely compute- and memory-intensive neural networks. These newer larger models have enabled researchers to advance state-of-the-art tools across a variety of fields.…

Machine Learning · Computer Science 2022-07-04 Daniel Nichols, Siddharth Singh, Shu-Huai Lin, Abhinav Bhatele

Predicting the Performance of a Computing System with Deep Networks

Predicting the performance and energy consumption of computing hardware is critical for many modern applications. This will inform procurement decisions, deployment decisions, and autonomic scaling. Existing approaches to understanding the…

Machine Learning · Computer Science 2023-02-28 Mehmet Cengiz, Matthew Forshaw, Amir Atapour-Abarghouei, Andrew Stephen McGough

Discriminating Equivalent Algorithms via Relative Performance

In scientific computing, it is common that a mathematical expression can be computed by many different algorithms (sometimes over hundreds), each identifying a specific sequence of library calls. Although mathematically equivalent, those…

Performance · Computer Science 2021-09-15 Aravind Sankaran, Paolo Bientinesi

Learning Machines Implemented on Non-Deterministic Hardware

This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay -- for the most…

Machine Learning · Computer Science 2014-09-10 Suyog Gupta, Vikas Sindhwani, Kailash Gopalakrishnan

A Learned Performance Model for Tensor Processing Units

Accurate hardware performance models are critical to efficient code generation. They can be used by compilers to make heuristic decisions, by superoptimizers as a minimization objective, or by autotuners to find an optimal configuration for…

Performance · Computer Science 2021-03-19 Samuel J. Kaufman, Phitchaya Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis, Sudip Roy, Amit Sabne, Mike Burrows

Optimising Resource Management for Embedded Machine Learning

Machine learning inference is increasingly being executed locally on mobile and embedded platforms, due to the clear advantages in latency, privacy and connectivity. In this paper, we present approaches for online resource management in…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Lei Xun, Long Tran-Thanh, Bashir M. Al-Hashimi, Geoff V. Merrett

Exploiting Parallelism Opportunities with Deep Learning Frameworks

State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers. Identifying and using…

Machine Learning · Computer Science 2020-07-01 Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks

Algorithms with Prediction Portfolios

The research area of algorithms with predictions has seen recent success showing how to incorporate machine learning into algorithm design to improve performance when the predictions are correct, while retaining worst-case guarantees when…

Machine Learning · Computer Science 2022-12-06 Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii

Performance Evaluation of Parallel Algorithms

Evaluating how well a whole system or set of subsystems performs is one of the primary objectives of performance testing. We can tell via performance assessment if the architecture implementation meets the design objectives. Performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-15 Donald Ene, Vincent Ike Anireh