Adaptation and Self-Organizing Systems · Physics
Energy Complexity of Software in Embedded Systems
Kostas Zotos, Andreas Litke, Alexander Chatzigeorgiou, Spyros Nikolaidis +1
2024-04-15
Distributed, Parallel, and Cluster Computing · Computer Science
Understanding the Impact of Input Entropy on FPU, CPU, and GPU Power
Sridutt Bhalachandra, Brian Austin, Samuel Williams, Nicholas J. Wright
2022-12-20
Hardware Architecture · Computer Science
Performance Analysis of Matrix Multiplication for Deep Learning on the Edge
Cristian Ramírez, Adrián Castelló, Héctor Martínez, Enrique S. Quintana-Ortí
2024-03-13
Distributed, Parallel, and Cluster Computing · Computer Science
High-Performance and Power-Efficient Emulation of Matrix Multiplication using INT8 Matrix Engines
Yuki Uchino, Katsuhisa Ozaki, Toshiyuki Imamura
2025-11-13
Distributed, Parallel, and Cluster Computing · Computer Science
Batched matrix operations on distributed GPUs with application in theoretical physics
Nenad Mijić, Davor Davidović
2022-03-18
Machine Learning · Computer Science
NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix Operations for Efficient Inference
Ruiqi Sun, Siwei Ye, Jie Zhao, Xin He +3
2024-08-21
Computer Vision and Pattern Recognition · Computer Science
Optimising Resource Management for Embedded Machine Learning
Lei Xun, Long Tran-Thanh, Bashir M Al-Hashimi, Geoff V. Merrett
2021-05-11
Hardware Architecture · Computer Science
tuGEMM: Area-Power-Efficient Temporal Unary GEMM Architecture for Low-Precision Edge AI
Harideep Nair, Prabhu Vellaisamy, Albert Chen, Joseph Finn +3
2024-12-25
Distributed, Parallel, and Cluster Computing · Computer Science
Leveraging Hardware-Aware Computation in Mixed-Precision Matrix Multiply: A Tile-Centric Approach
Qiao Zhang, Rabab Alomairy, Dali Wang, Zhuowei Gu +1
2025-08-21
Distributed, Parallel, and Cluster Computing · Computer Science
Optimizing Irregular-Shaped Matrix-Matrix Multiplication on Multi-Core DSPs
Shangfei Yin, Qinglin Wang, Ruochen Hao, Tianyang Zhou +2
2022-08-12
Distributed, Parallel, and Cluster Computing · Computer Science
MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications
Bo Fang, Xinyi Li, Harvey Dam, Cheng Tan +8
2023-11-13
Distributed, Parallel, and Cluster Computing · Computer Science
Toward matrix multiplication for deep learning inference on the Xilinx Versal
Jie Lei, José Flich, Enrique S. Quintana-Ortí
2023-02-16
Distributed, Parallel, and Cluster Computing · Computer Science
Accelerating 128-bit Floating-Point Matrix Multiplication on FPGAs
Fumiya Kono, Naohito Nakasato, Maho Nakata
2023-06-12
Distributed, Parallel, and Cluster Computing · Computer Science
Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs
Shixun Wu, Yujia Zhai, Jinyang Liu, Jiajun Huang +3
2023-05-03
Distributed, Parallel, and Cluster Computing · Computer Science
Parallel Algorithms for Masked Sparse Matrix-Matrix Products
Srđan Milaković, Oguz Selvitopi, Israt Nisa, Zoran Budimlić +1
2021-11-22
Distributed, Parallel, and Cluster Computing · Computer Science
FT-GEMM: A Fault Tolerant High Performance GEMM Implementation on x86 CPUs
Shixun Wu, Yujia Zhai, Jiajun Huang, Zizhe Jian +1
2023-05-10
Distributed, Parallel, and Cluster Computing · Computer Science
The Anatomy of Large-Scale Distributed Graph Algorithms
Jesun Sahariar Firoz, Thejaka Amila Kanewala, Marcin Zalewski, Martina Barnas +1
2015-07-27
Mathematical Software · Computer Science
Implementing Strassen's Algorithm with BLIS
Jianyu Huang, Tyler M. Smith, Greg M. Henry, Robert A. van de Geijn
2016-05-05