Related papers: Understanding Soft Errors in Uncore Components

200 papers

To protect multicores from soft-error perturbations, resiliency schemes have been developed with high coverage but high power and performance overheads. Emerging safety-critical machine learning applications are increasingly being deployed…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-11 Qingchuan Shi, Hamza Omar, Omer Khan

We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time,…

We present CLEAR (Cross-Layer Exploration for Architecting Resilience), a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience…

Smaller feature size, higher clock frequency and lower power consumption are of core concerns of today's nano-technology, which has been resulted by continuous downscaling of CMOS technologies. The resultant 'device shrinking' reduces the…

Other Computer Science · Computer Science 2011-10-19 Muhammad Sheikh Sadi, Md. Mizanur Rahman Khan, Md. Nazim Uddin, Jan Jürjens

Acoustic-sensor-based soft error resilience is particularly promising, since it can verify the absence of soft errors and eliminate silent data corruptions at a low hardware cost. However, the state-of-the-art work incurs a significant…

Hardware Architecture · Computer Science 2022-02-22 Jianping Zeng, Hongjune Kim, Jaejin Lee, Changhee Jung

Rapid CMOS device size reduction resulted in billions of transistors on a chip have led to integration of many cores leading to many challenges such as increased power dissipation, thermal dissipation, occurrence of transient faults and…

Hardware Architecture · Computer Science 2023-04-12 Shashikiran Venkatesha, Ranjani Parthasarathi

Graphics processing units (GPUs) are gaining widespread use in computational chemistry and other scientific simulation contexts because of their huge performance advantages relative to conventional CPUs. However, the reliability of GPUs in…

Hardware Architecture · Computer Science 2009-11-14 Imran S. Haque, Vijay S. Pande

The memory consistency model is a fundamental system property characterizing a multiprocessor. The relative merits of strict versus relaxed memory models have been widely debated in terms of their impact on performance, hardware complexity…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-04-07 Alexander Jaffe, Thomas Moscibroda, Laura Effinger-Dean, Luis Ceze, Karin Strauss

Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-06-22 Erlin Yao, Mingyu Chen, Rui Wang, Wenli Zhang, Guangming Tan

Handling faults is a growing concern in HPC. In future exascale systems, it is projected that silent undetected errors will occur several times a day, increasing the occurrence of corrupted results. In this article, we propose SEDAR, which…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-29 Diego Montezanti, Enzo Rucci, Armando De Giusti, Marcelo Naiouf, Dolores Rexachs, Emilio Luque

Soft errors in large VLSI circuits pose dramatic influence on computing- and memory-intensive neural network (NN) processing. Understanding the influence of soft errors on NNs is critical to protect against soft errors for reliable NN…

Machine Learning · Computer Science 2022-10-13 Haitong Huang, Xinghua Xue, Cheng Liu, Ying Wang, Tao Luo, Long Cheng, Huawei Li, Xiaowei Li

Advancements in multi-core have created interest among many research groups in finding out ways to harness the true power of processor cores. Recent research suggests that on-board component such as cache memory plays a crucial role in…

Hardware Architecture · Computer Science 2011-11-15 N. Ramasubramanian, Srinivas V. V., N. Ammasai Gounden

AIoT processors fabricated with newer technology nodes suffer rising soft errors due to the shrinking transistor sizes and lower power supply. Soft errors on the AIoT processors particularly the deep learning accelerators (DLAs) with…

Hardware Architecture · Computer Science 2021-07-08 Dawen Xu, Meng He, Cheng Liu, Ying Wang, Long Cheng, Huawei Li, Xiaowei Li, Kwang-Ting Cheng

In stochastic circuits, major sources of error are correlation errors, soft errors and random fluctuation errors that affect the accuracy and reliability of the circuit. The soft error has the effect of changing the correlation status and…

Emerging Technologies · Computer Science 2021-07-30 Shyamali Mitra, Sayantan Banerjee, Mrinal Kanti Naskar

The increasing number of threads inside the cores of a multicore processor, and competitive access to the shared cache memory, become the main reasons for an increased number of competitive cache misses and performance decline. Inevitably,…

Hardware Architecture · Computer Science 2017-01-09 Milcho Prisagjanec, Pece Mitrevski

High-performance and safety-critical system architects must accurately evaluate the application-level silent data corruption (SDC) rates of processors to soft errors. Such an evaluation requires error propagation all the way from particle…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-05 Siva Kumar Sastry Hari, Paolo Rech, Timothy Tsai, Mark Stephenson, Arslan Zulfiqar, Michael Sullivan, Philip Shirvani, Paul Racunas, Joel Emer, Stephen W. Keckler

In contemporary times, the increasing complexity of the system poses significant challenges to the reliability, trustworthiness, and security of the SACRES. Key issues include the susceptibility to phenomena such as instantaneous voltage…

Hardware Architecture · Computer Science 2024-12-23 Enrico Magliano, Alessio Carpegna, Alessandro Savino, Stefano Di Carlo

Memory allocation, though constituting only a small portion of the executed code, can have a "butterfly effect" on overall program performance, leading to significant and far-reaching impacts. Despite accounting for just approximately 5% of…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-29 Ruihao Li, Qinzhe Wu, Krishna Kavi, Gayatri Mehta, Jonathan C. Beard, Neeraja J. Yadwadkar, Lizy K. John

In recent years, high availability and reliability of Data Storage Systems (DSS) have been significantly threatened by soft errors occurring in storage controllers. Due to their specific functionality and hardware-software stack, error…

Performance · Computer Science 2021-12-24 Mostafa Kishani, Mehdi Tahoori, Hossein Asadi

The ever growing demands of embedded systems to satisfy high computing performance and cost efficiency lead to the trend of using commercial off-the-shelf hardware. However, due to their highly integrated design they are becoming…

Software Engineering · Computer Science 2015-11-24 Andrea Höller, Tobias Rauter, Johannes Iber, Georg Macher, Christian Kreiner