Hardware Architecture

CHIA: An open-source framework for principled, agentic AI-driven hardware/software co-design research

Agentic artificial intelligence shows great promise for radically improving the pace of innovation in hardware/software co-design research across computer architecture, systems, compilers, and VLSI. Thus far, however, applications of AI in…

Hardware Architecture · Computer Science 2026-06-25 Angela Cui, Ferran Hermida-Rivera, Jack Toubes, Raghav Gupta, Jim Fang, Chengyi Lux Zhang, Ella Schwarz, Junha Kim, Yakun Sophia Shao, Borivoje Nikolic, Christopher W. Fletcher, Sagar Karandikar

Evaluating Architectural Trade-offs in CGRAs: The Impact of Scratchpad Memory and Heterogeneity on Compute-Intensive Kernels

Modern edge computing applications, particularly high-throughput stream processing like Vision Transformers (ViTs), demand massive spatial parallelism and efficient data movement under tight power and area constraints. Coarse-Grained…

Hardware Architecture · Computer Science 2026-06-25 María José Belda, Lara Orlandic, Fernando Castro, Miguel Peón-Quirós, Katzalin Olcoz, David Atienza

Residual GPU Cache State on Apple M4 Pro

Apple silicon exposes unified CPU-GPU memory, but the cache state left after a completed GPU command is not documented. This paper characterizes that phase boundary on a 14-core Apple M4 Pro. We validate the measurement pipeline against…

Hardware Architecture · Computer Science 2026-06-25 Faruk Alpay, Baris Basaran

SegFold: Accelerating Sparse GEMM with a Fine-Grained Dynamic Dataflow

Generalized sparse matrix-matrix multiplication (SpGEMM) is critical in many domains. Existing CPUs and GPUs, as well as specialized accelerators, rely on static dataflows (e.g., inner product, outer product, Gustavson, etc.). Each static…

Hardware Architecture · Computer Science 2026-06-25 Xinrui Wu, Hanyu Wang, Jason Cong, Tony Nowatzki

GRAINS: Storage-Aware Algorithm-Architecture Co-Design Enabling High-Performance and Low-Cost Graph-Based Genome Analysis

Graph-based representations of genome sequences have emerged as a powerful approach for representing massive genomic databases in an expressive and efficient way. Despite their benefits, analysis on large-scale genome graphs incurs…

Hardware Architecture · Computer Science 2026-06-25 Nika Mansouri Ghiasi, Harun Mustafa, Talu Güloglu, Rakesh Nadig, Konstantina Koliogeorgi, Susana Rebolledo Ruiz, Marc Rautmann, Furkan Eris, Mohammad Sadrosadati, Jisung Park, Onur Mutlu

Nanoelectromechanical Systems (NEMS) for Hardware Security in Advanced Packaging

As hardware security threats escalate across semiconductor manufacturing and advanced packaging, there is a growing need for novel physical mechanisms to counter sophisticated attacks such as tampering, counterfeiting, and supply chain…

Hardware Architecture · Computer Science 2026-06-24 Himanandhan Reddy Kottur, Pavanbabu Arjunamahanthi, M. Shafkat M. Khan, Liton Kumar Biswas, Nitin Varshney, Navid Asadizanjani

CVA6-RT: an Open-Source Time-Predictable RV64 Processor for Mixed-Criticality Systems

This work presents CVA6-RT, a real-time micro-architectural extension of the CVA6 core that bounds worst-case latency and reduces the variability of task execution time. CVA6-RT implements the rv64gch ISA and features advanced support for…

Hardware Architecture · Computer Science 2026-06-24 Enrico Zelioli, Christopher Reinwardt, Nils Wistoff, Robert Balas, Alessandro Ottaviano, Luca Benini, Angelo Garofalo

Toward Mitigating Process-Induced Performance Degradation in 3.5D Heterogeneous Packages via Pre-Silicon Firmware Co-Optimization

This paper presents a pre-silicon analysis of XRM-SSD V24/V7.0, a physics-aware predictive firmware scheduling layer for Intel's 3.5D heterogeneous integrated packages (Foveros Direct 3D + PowerVia + EMIB-T + UCIe + HBM5). Using detailed…

Hardware Architecture · Computer Science 2026-06-24 Chi Fei Chung, Nikolai Nedovodin

The Kernel's Write: Application Read-Only Memory

Alongside power, DRAM has become a major limiting factor in datacenter growth. As DRAM's cost-per-bit has plateaued over the past decade, a class of emerging memory technologies, called Long-term RAM (LtRAM), offers a path to denser and…

Hardware Architecture · Computer Science 2026-06-18 Hui Sub Shim, Katherine Mohr, Philip Levis

Mitigating High-Frequency Geometric Noise in Non-Parametric 1-Bit Sparse

Energy-efficient neuromorphic computing requires alternative data-encoding paradigms that bypass power-hungry floating-point operations. This paper evaluates a deterministic, non-parametric dual-manifold execution framework that maps dense…

Hardware Architecture · Computer Science 2026-06-18 Lars Kopp

elasticAI.explorer: Towards a Unified End-to-End Framework for Hardware-Aware Neural Architecture Search

Neural Architecture Search (NAS) has become an important approach for automatically designing neural networks under task-specific and hardware-specific constraints. However, many existing NAS frameworks tightly couple search space…

Hardware Architecture · Computer Science 2026-05-29 Natalie Maman, Florian Hettstedt, Andreas Erbslöh, Gregor Schiele

Precomputed 1D-CNNs for Atrial Fibrillation Detection on Tiny Smart Sensor Systems

1D-CNNs play a crucial role in time-series analysis on tiny smart sensor systems, e.g., for biosignal analysis, predictive maintenance, or structural health monitoring. LUT-based precomputation has emerged as an interesting optimization…

Hardware Architecture · Computer Science 2026-05-29 Lukas Einhaus, Natalie Maman, Julian Hoever, Andreas Erbslöh, Gregor Schiele

Design-Oriented Modeling of TSV Substrate Noise Coupling to Ring VCOs

Through-silicon vias (TSVs) enable dense vertical interconnects in 3D-IC and chiplet systems, but their metal-oxide-silicon structure introduces significant parasitic coupling paths that can degrade the spectral purity of sensitive RF…

Hardware Architecture · Computer Science 2026-05-29 Ilias Exouzidis, Alberto Garcia-Ortiz, George Floros, Georgios Panagopoulos

Constant Depth Threshold Circuits For Exhaustive Epistasis Detection

The development of large-scale neuromorphic hardware has made practical implementations of threshold gate-based circuits a near-term possibility. The complexity advantages over traditional computing classes, as evidenced in the…

Hardware Architecture · Computer Science 2026-05-29 André Ribeiro, Aleksandar Ilic, Leonel Sousa

Space-Control: Process-Level Isolation for Sharing CXL-based Disaggregated Memory

Memory disaggregation via CXL enables multi-host resource sharing. However, existing CXL sharing mechanisms enforce coarse-grained, host-level permissions only, leaving isolation to the operating system. Today, virtual memory enables…

Hardware Architecture · Computer Science 2026-05-29 Kaustav Goswami, Sean Peisert, Venkatesh Akella, Jason Lowe-Power

Nonvolatile Charge-Domain Attention with HZO Ferroelectric Capacitors: A Simulation-Based Device-to-System Evaluation

Transformer decoding is constrained by both attention compute and KV-cache movement. This paper presents the Ferroelectric Charge-Domain Compute Cell (FCDC), a hafnium-zirconium-oxide (HZO) memcapacitor with an access device that stores…

Hardware Architecture · Computer Science 2026-05-28 Faris Abouagour

FT-Pilot: Automated Fault-Tolerant RTL Rewriting via Vulnerability-Guided LLMs

As integrated circuit technologies continue to scale toward advanced process nodes, the continual reduction in node capacitance and supply voltage has made digital systems increasingly vulnerable to soft errors. Although traditional…

Hardware Architecture · Computer Science 2026-05-28 Weixing Liu, Zizhen Liu, Jing Ye, Naixing Wang, Cheng Liu, Huawei Li, Xiaowei Li

CLIPGen: A Chiplet Link IP Modeling and Generation Framework for 2.5D Architecture Exploration

Advanced 2.5D Systems-in-Package (SiPs) compose a growing portion of high-performance systems. While the packaging and interconnect choices play a large role in the overall system design, system architects still lack a suitable framework…

Hardware Architecture · Computer Science 2026-05-28 Zhengping Zhu, Austin Rovinski

CXL-ClusterSim: Modeling CXL-based Disaggregated Memory Cluster for Pooling and Sharing using gem5 and SST

Large-scale AI training and inference require hundreds of gigabytes to terabytes of DRAM with high peak-to-average utilization ratios, resulting in overprovisioning. In cloud computing, DRAM constitutes a significant share of the cost. Yet,…

Hardware Architecture · Computer Science 2026-05-28 Kaustav Goswami, Maryam Babaie, Hoa Nguyen, Venkatesh Akella, Jason Lowe-Power

AssertLLM2: A Comprehensive LLM Benchmark for Assertion Generation from Design Specifications

Assertion-based verification (ABV) is a cornerstone of modern hardware design, yet manually translating design intent into formal SystemVerilog Assertions (SVAs) remains labor-intensive and error-prone. While Large Language Models (LLMs)…

Hardware Architecture · Computer Science 2026-05-28 Yuchao Wu, Wenji Fang, Jing Wang, Wenkai Li, Ziyan Guo, Zhiyao Xie