Related papers: Multi-Amdahl: Optimal Resource Sharing with Multip…
Future multiprocessor chips will integrate many different units, each tailored to a specific computation. When designing such a system, the chip architect must decide how to distribute limited system resources such as area, power, and…
The problem of learning parallel computer performance is investigated in the context of multicore processors. Given a fixed workload, the effect of varying system configuration on performance is sought. Conventionally, the performance…
This article presents an automatic approach to quickly derive a good solution for hardware resource partition and task granularity for task-based parallel applications on heterogeneous many-core architectures. Our approach employs a…
A modification of Amdahl's law for the case of an increasing number of processor elements in a computer system is considered. The coefficient $k$ linking the speedups of parallel and parallel specialized computer systems is determined. The limiting…
This work analyses the effects of sequential-to-parallel synchronization and inter-core communication on multicore performance, speedup and scaling. A modification of Amdahl's law is formulated, to reflect the finding that parallel speedup is…
Classical Amdahl's Law conceptualized the limit of speedup for an era of fixed serial-parallel decomposition and homogeneous replication. Modern heterogeneous systems need a different conceptual framework: constrained resources must be…
Scheduling the power exchange between a population of heterogeneous distributed energy resources and the corresponding upper-level system is an important control problem in power systems. A key challenge is the large number of (partially…
Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedded systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high performance, such potential…
We study exploration in stochastic multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls. We focus in particular on the allocation of distributed computing resources, where we…
To deliver high performance in power-limited systems, architects have turned to using heterogeneous systems, either CPU+GPU or mixed CPU-hardware systems. However, in systems with different processor types and task affinities, scheduling…
Hardware accelerators, such as those based on GPUs and FPGAs, offer an excellent opportunity to efficiently parallelize functionalities. Recently, modern embedded platforms started being equipped with such accelerators, resulting in a…
This paper proposes an accelerated consensus-based distributed iterative algorithm for resource allocation and scheduling. The proposed gradient-tracking algorithm introduces an auxiliary variable to add momentum towards the optimal state.…
In a federated setting, agents coordinate with a central agent or a server to solve an optimization problem in which agents do not share their information with each other. Wirth and his co-authors, in a recent paper, describe how the basic…
In the present study, the ability to detect program execution phases is investigated in order to improve performance and reduce the power dissipated in heterogeneous multicore processors. The programs execution…
With the spread of multi- and many-core processors, an increasingly typical task is to re-implement source code originally written for a single processor so that it runs on more than one core. Since it is a serious investment, it is important to…
This paper studies optimal scheduling and resource allocation under allowable over-scheduling. Formulating an optimisation problem where over-scheduling is embedded, we derive an optimal solution that can be implemented by means of a new…
In this paper, we study the resource allocation algorithm design for multiuser orthogonal frequency division multiplexing (OFDM) downlink systems with simultaneous wireless information and power transfer. The algorithm design is formulated…
Memory bandwidth regulation and cache partitioning are widely used techniques for achieving predictable timing in real-time computing systems. Combined with partitioned scheduling, these methods require careful co-allocation of tasks and…
In this paper, a unifying framework for orthogonal frequency division multiplexing (OFDM) multiuser resource allocation is presented. The seemingly isolated problems of maximizing a weighted sum of rates for a given power budget $\bar{P}$ and…
Motivated by broad applications in various fields of engineering, we study a network resource allocation problem where the goal is to optimally allocate a fixed quantity of resources over a network of nodes. We consider large scale networks…