Related papers: FPDetect: Efficient Reasoning About Stencil Progra…
The increase in HPC system size and complexity, together with increasing on-chip transistor density, power limitations, and number of components, renders modern HPC systems subject to soft errors. Silent data corruptions (SDCs) are…
The study addresses the problem of precision in floating-point (FP) computations. A method for estimating the errors which affect intermediate and final results is proposed and a summary of many software simulations is discussed. The basic…
Floating-point program errors can lead to severe consequences, particularly in critical domains such as military applications. Only a small subset of inputs may induce substantial floating-point errors, prompting researchers to develop…
We introduce a framework for fault-tolerant post-selection (FTPS) of fault-tolerant codes and channels -- such as those based on surface-codes -- using soft-information metrics based on visible syndrome and erasure information. We introduce…
Instruction-level error injection analyses aim to find instructions where errors often lead to unacceptable outcomes like Silent Data Corruptions (SDCs). These analyses require significant time, which is especially problematic if developers…
Soft errors, namely silent corruptions of signals or data in a computer system, cannot be cavalierly ignored as compute and communication density grow exponentially. Soft error detection has been studied in the context of enterprise…
Improving the efficiency of edge detection in embedded applications, such as UAV control, is critical for reducing system cost and power dissipation. Field programmable gate arrays (FPGA) are a good platform for making improvements because…
Floating-point programs form the foundation of modern science and engineering, providing the essential computational framework for a wide range of applications, such as safety-critical systems, aerospace engineering, and financial analysis.…
Many aerospace and automotive applications use FPGAs in their designs due to their low power and reconfigurability requirements. Meanwhile, such applications also pose a high standard on system reliability, which makes the early-stage…
Whether stemming from malicious intent or natural occurrences, faults and errors can significantly undermine the reliability of any architecture. In response to this challenge, fault detection assumes a pivotal role in ensuring the secure…
In this paper, we use reduced precision checking (RPC) to detect errors in floating point arithmetic. Prior work explored RPC for addition and multiplication. In this work, we extend RPC to a complete floating point unit (FPU), including…
Particle filtering methods can be applied to estimation problems in discrete spaces on bounded domains, to sample from and marginalise over unknown hidden states. As in continuous settings, problems such as particle degradation can arise:…
Motivation: In microarray analysis, special consideration must be given to the issues of multiple statistical tests and typically p-values are adjusted to control family-wise error rate (FWER) or false discovery rate (FDR). FDR metrics have…
This paper describes a new approach for using changepoint detection (CPD) to estimate the starting and stopping times of a forced oscillation (FO) in measured power system data. As with a previous application of CPD to this problem, the…
We provide tools to help automate the error analysis of algorithms that evaluate simple functions over the floating-point numbers. The aim is to obtain tight relative error bounds for these algorithms, expressed as a function of the unit…
In this work, we provide energy-efficient architectural support for floating point accuracy. Our goal is to provide accuracy that is far greater than that provided by the processor's hardware floating point unit (FPU). Specifically, for…
Fault-tolerant logical entangling gates are essential for scalable quantum computing, but are limited by the error rates and overheads of physical two-qubit gates and measurements. To address this limitation, we introduce phantom…
Significant inaccuracy often arises in mathematical calculations because of the limited number of digits in floating-point representation, which may lead to catastrophic loss. Normally, people believe that adjustment of floating-point precision is…
Resilient algorithms in high-performance computing are subject to rigorous non-functional constraints. Resiliency must not increase the runtime, memory footprint or I/O demands too significantly. We propose a task-based soft error detection…
Specialized hardware accelerators have been designed and employed to maximize the performance efficiency of Spiking Neural Networks (SNNs). However, such accelerators are vulnerable to transient faults (i.e., soft errors), which occur due…