Related papers: SERAD: Soft Error Resilient Asynchronous Design us…
We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time,…
We present CLEAR (Cross-Layer Exploration for Architecting Resilience), a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience…
Handling faults is a growing concern in HPC. In future exascale systems, it is projected that silent undetected errors will occur several times a day, increasing the occurrence of corrupted results. In this article, we propose SEDAR, which…
To cope with the soft errors and make full use of the multi-core system, this paper gives an efficient fault-tolerant hardware and software co-designed architecture for multi-core systems. And with a not large number of test patterns, it…
Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for…
Time-series information needs to be incorporated into energy system optimization to account for the uncertainty of renewable energy sources. Typically, time-series aggregation methods are used to reduce historical data to a few…
Reliability is an important requirement for both communication and storage systems. Due to continuous scale down of technology multiple adjacent bits error probability increases. The data may be corrupted due soft errors. Error correction…
As we stride toward the exascale era, due to increasing complexity of supercomputers, hard and soft errors are causing more and more problems in high-performance scientific and engineering computation. In order to improve reliability…
This paper presents a detailed evaluation of the efficiency of software-only techniques to mitigate SEU and SET in microprocessors. A set of well-known rules is presented and implemented automatically to transform an unprotected program…
Effects of radiation on electronic circuits used in extra-terrestrial applications and radiation prone environments need to be corrected. Since FPGAs offer flexibility, the effects of radiation on them need to be studied and robust methods…
Acoustic-sensor-based soft error resilience is particularly promising, since it can verify the absence of soft errors and eliminate silent data corruptions at a low hardware cost. However, the state-of-the-art work incurs a significant…
Very deep submicron and nanometer technologies have increased notably integrated circuit (IC) sensitiveness to radiation. Soft errors are currently appearing into ICs working at earth surface. Hardened circuits are currently required in…
High-speed serial links implemented in SRAM-based FPGAs have been extensively used in the trigger and data acquisition systems of High Energy Physics experiments. Usually, their application has been restricted to off-detector, mostly due…
Heterogeneous multicore architectures are becoming increasingly popular due to their potential of achieving high performance and energy efficiency compared to the homogeneous multicore architectures. In such systems, the real-time…
Most of today's high-speed switches and routers adopt an input-queued crossbar switch architecture. Such a switch needs to compute a matching (crossbar schedule) between the input ports and output ports during each switching cycle (time…
With the shrinking of technology nodes and the use of parallel processor clusters in hostile and critical environments, such as space, run-time faults caused by radiation are a serious cross-cutting concern, also impacting architectural…
Modern computer scaling trends in pursuit of larger component counts and power efficiency have, unfortunately, lead to less reliable hardware and consequently soft errors escaping into application data ("silent data corruptions").…
Compared with the fixed-run designs, the sequential adaptive designs (SAD) are thought to be more efficient and effective. Efficient global optimization (EGO) is one of the most popular SAD methods for expensive black-box optimization…
Dataset deduplication is widely recognized as a crucial preprocessing step that enhances data quality and improves the performance of large language models. A commonly used method for this process is the MinHash Locality-Sensitive Hashing…
Control Co-Design (CCD) considers the coupled effects of both the plant and control parameters to optimize a system's closed-loop transient performance during the design stage. This paper presents a new method for CCD with guarantees on…