Related papers: C for a tiny system
We empirically evaluated thousands of different C calling conventions for irregular microcontroller architectures, and found potential for improvement over the calling conventions previously used in the Small Device C Compiler (SDCC). The…
We describe a modified SIMD architecture suitable for single-chip integration of a large number of processing elements, such as 1,000 or more. Important differences from traditional SIMD designs are: a) The size of the memory per processing…
Many virtual machines exist for sensor nodes with only a few KB RAM and tens to a few hundred KB flash memory. They pack an impressive set of features, but suffer from a slowdown of one to two orders of magnitude compared to optimised…
This paper presents a microkernel architecture for constraint programming organized around a small number of core functionalities and minimal interfaces. The architecture contrasts with the monolithic nature of many…
The paper introduces the development of a modular compiler for a subset of a C-like language, which addresses the challenges in constructing a compiler for high-level languages. This modular approach will allow developers to modify a…
Hyperdimensional Computing (HDC) is a bio-inspired computing framework that has gained increasing attention, especially as a more efficient approach to machine learning (ML). This work introduces the \name{} compiler, the first open-source…
Several high-performance lab instruments suitable for manual assembly have been developed using low-pin-count 32-bit microcontrollers that communicate with an Android tablet via a USB interface. A single Android tablet app accommodates…
Examples of embedded intelligence include a wide variety of tiny neural networks used on-board wireless sensors and actuators, which are expected to continuously perform inference on time-series of the data they sense. In order to fit…
We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a…
Printed electronics have gained significant traction in recent years, presenting a viable path to integrating computing into everyday items, from disposable products to low-cost healthcare. However, the adoption of computing in these…
The introduction of complex SoCs with multiple processor cores presents new development challenges, such that development support is now a decisive factor when choosing a System-on-Chip (SoC). The presented development support strategy…
Memory latency, bandwidth, capacity, and energy increasingly limit performance. In this paper, we reconsider proposed system architectures that consist of huge (many-terabyte to petabyte scale) memories shared among large numbers of CPUs.…
Substantial efforts are invested in improving network security, but the threat landscape is rapidly evolving, particularly with the recent interest in programmable network hardware. We explore a new security threat, from an attacker who has…
In recent years the computational capacity of single Field Programmable Gate Array (FPGA) devices, as well as their versatility, has increased significantly. Adding to that the High Level Synthesis frameworks allowing to program such…
On-device training enables the model to adapt to new data collected from the sensors by fine-tuning a pre-trained model. Users can benefit from customized AI models without having to transfer the data to the cloud, protecting their privacy.…
Telecommunications is among the most computationally demanding areas, and video applications are now included in several telecommunication devices such as set-top boxes and mobile phones. Embedded video applications in new…
Efficient low-complexity error correcting code (ECC) is considered an effective technique for mitigation of multi-bit upset (MBU) in the configuration memory (CM) of static random access memory (SRAM) based Field Programmable Gate Array…
By mimicking brain-like cognition and exploiting parallelism, hyperdimensional computing (HDC) classifiers have been emerging as a lightweight framework to achieve efficient on-device inference. Nonetheless, they have two fundamental…
C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as…
The 64-bit architectures that have become standard today offer unprecedented low-level programming possibilities. For the first time in the history of computing, the size of address registers far exceeded the physical capacity of their…