Related papers: Supporting Dynamic Control-Flow Execution for Runt…
Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent…
Rigid body dynamics is a key technology in the robotics field. In trajectory optimization and model predictive control algorithms, there are usually a large number of rigid body dynamics computing tasks. Using CPUs to process these tasks…
Lockstep processing is a recognized technique for helping to secure functional-safety relevant processing against, for instance, single upset errors that might cause faulty execution of code. Lockstepping processors does however bind…
Rapid progress in electronic chip technology provides a variety of new implementation options for system engineers. The choice varies between the flexible programs running on a general-purpose processor (GPP) and the…
Many dedicated embedded processors do not have memory or computational resources to coexist with traditional (host-based) security solutions. As a result, there is interest in using out-of-band analog side-channel measurements and their…
Based on the two observations that diverse applications perform better on different multicore architectures, and that different phases of an application may have vastly different resource requirements, Pal et al. proposed a novel…
Intra-device parallelism addresses resource under-utilization in ML inference and training by overlapping the execution of operators with different resource usage. However, its wide adoption is hindered by a fundamental conflict with the…
A stream workflow application, such as online anomaly detection or online traffic monitoring, integrates multiple streaming big data applications into a data analysis pipeline. Such an application can be highly dynamic in nature, where the data…
A new approach to designing processor accelerators is presented. A new computing model and a special kind of accelerator with a dynamic (end-user programmable) architecture are suggested. The new model considers a processor, in which a newly…
We discuss computational superstructures that, using repeated, appropriately initialized short calls, enable temporal process simulators to perform alternative tasks such as fixed point computation, stability analysis and projective…
Autotuning of performance-relevant source-code parameters allows applications to be tuned automatically without hard-coding optimizations and thus helps keep performance portable. In this paper, we introduce a benchmark set of ten…
Hybrid workflows combining traditional HPC and novel ML methodologies are transforming scientific computing. This paper presents the architecture and implementation of a scalable runtime system that extends RADICAL-Pilot with service-based…
FastFlow is a programming environment specifically targeting cache-coherent shared-memory multi-cores. FastFlow is implemented as a stack of C++ template libraries built on top of lock-free (fence-free) synchronization mechanisms. In this…
This work details a hardware-assisted approach for information flow tracking implemented on reconfigurable chips. Current solutions are either time-consuming or hardly portable (requiring modifications of both software and hardware layers). This work…
Shared-memory multiprocessors have returned to popularity thanks to the rapid spread of commodity multi-core architectures. As ever, shared-memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…
Microprocessor roadmaps clearly show a trend towards multi-core CPUs. Modern operating systems already make use of these CPU architectures by distributing tasks between processing cores, thereby increasing system performance. This review…
Software-configurable processors find their best use in embedded systems. These processors have on-chip logic like an FPGA (Field Programmable Gate Array) and thus can be configured to implement custom hardware functionality. The digital…
Information Fusion Systems are now widely used in different fusion contexts, like scientific processing, sensor networks, video and image processing. One of the current trends in this area is to cope with distributed systems. In this…
Elasticity is highly desirable for stream processing systems to guarantee low latency against workload dynamics, such as surges in data arrival rate and fluctuations in data distribution. Existing systems achieve elasticity following a…
Modern embedded and cyber-physical systems require ever more performance, power efficiency and flexibility to execute several profiles and functionalities, targeting ever-growing adaptivity needs and preserving execution…