Related papers: ALBERTA: ALgorithm-Based Error Resilience in Trans…
Transformer models rely on High-Performance Computing (HPC) resources for inference, where soft errors are inevitable in large-scale systems, making the reliability of the model particularly critical. Existing fault tolerance frameworks for…
Transformers and large language models (LLMs), powered by the attention mechanism, have transformed numerous AI applications, driving the need for specialized hardware accelerators. A major challenge in these accelerators is efficiently…
Vision Transformers (ViTs) that leverage self-attention mechanism have shown superior performance on many classical vision tasks compared to convolutional neural networks (CNNs) and gain increasing popularity recently. Existing ViTs works…
The ability of Convolutional Neural Networks (CNNs) to accurately process real-time telemetry has boosted their use in safety-critical and high-performance computing systems. As such systems require high levels of resilience to errors, CNNs…
Whether stemming from malicious intent or natural occurrences, faults and errors can significantly undermine the reliability of any architecture. In response to this challenge, fault detection assumes a pivotal role in ensuring the secure…
Detecting software vulnerabilities is critical to ensuring the security and reliability of modern computer systems. Deep neural networks have shown promising results on vulnerability detection, but they lack the capability to capture global…
Security vulnerabilities present in a code that has been written in diverse programming languages are among the most critical yet complicated aspects of source code to detect. Static analysis tools based on rule-based patterns usually do…
Transformer-based text classifiers such as BERT, RoBERTa, T5, and GPT have shown strong performance in natural language processing tasks but remain vulnerable to adversarial examples. These vulnerabilities raise significant security…
We present JavelinGuard, a suite of low-cost, high-performance model architectures designed for detecting malicious intent in Large Language Model (LLM) interactions, optimized specifically for production deployment. Recent advances in…
Fault-tolerant deep learning accelerator is the basis for highly reliable deep learning processing and critical to deploy deep learning in safety-critical applications such as avionics and robotics. Since deep learning is known to be…
Transformers are a popular choice for classification tasks and as backbones for object detection tasks. However, their high latency brings challenges in their adaptation to lightweight object detection systems. We present an approximation…
The demand for efficient large language model (LLM) inference has propelled the development of dedicated accelerators. As accelerators are vulnerable to hardware faults due to aging, variation, etc, existing accelerator designs often…
Ransomware and other forms of malware cause significant financial and operational damage to organizations by exploiting long-standing and often difficult-to-detect software vulnerabilities. To detect vulnerabilities such as buffer overflows…
Vision transformers (ViTs) have become essential backbones in advanced computer vision applications and multi-modal foundation models. Despite their strengths, ViTs remain vulnerable to adversarial perturbations, comparable to or even…
We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time,…
Recently, deep learning techniques have garnered substantial attention for their ability to identify vulnerable code patterns accurately. However, current state-of-the-art deep learning models, such as Convolutional Neural Networks (CNN),…
Cyber attacks are often identified using system and network logs. There have been significant prior works that utilize provenance graphs and ML techniques to detect attacks, specifically advanced persistent threats, which are very difficult…
Operating LLMs as coordinated multi-agent research systems over multi-hour runs surfaces failure modes that single-shot evaluation cannot: upstream providers throttle without warning, sub-agents drift the task to fit accessible tools,…
Deepfake detection has become increasingly important due to the rise of synthetic media, which poses significant risks to digital identity and cyber presence for security and trust. While multiple approaches have improved detection…
Transformer-based malware detection systems operating on graph modalities such as control flow graphs (CFGs) achieve strong performance by modeling structural relationships in program behavior. However, their robustness to adversarial…