Related papers: PASTA: A Modular Program Analysis Tool Framework f…

200 papers

AI compliance is becoming increasingly critical as AI systems grow more powerful and pervasive. Yet the rapid expansion of AI policies creates substantial burdens for resource-constrained practitioners lacking policy expertise. Existing…

Human-Computer Interaction · Computer Science 2026-03-26 Yu Yang, Ig-Jae Kim, Dongwook Yoon

Tensor methods have gained increasing attention from various applications, including machine learning, quantum chemistry, healthcare analytics, social network analysis, data mining, and signal processing, to name a few. Sparse tensors and…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-12 Jiajia Li, Yuchen Ma, Xiaolong Wu, Ang Li, Kevin Barker

Leveraging Transformer attention has led to great advancements in HDR deghosting. However, the intricate nature of self-attention introduces practical challenges, as existing state-of-the-art methods often demand high-end GPUs or exhibit…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 Xiaoning Liu, Ao Li, Zongwei Wu, Yapeng Du, Le Zhang, Yulun Zhang, Radu Timofte, Ce Zhu

The increased demand for tools that automate the 3D content creation process led to tremendous progress in deep generative models that can generate diverse 3D objects of high fidelity. In this paper, we present PASTA, an autoregressive…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Songlin Li, Despoina Paschalidou, Leonidas Guibas

The performance model of an application can provide understanding about its runtime behavior on particular hardware. Such information can be analyzed by developers for performance tuning. However, model building and analyzing is…

Performance · Computer Science 2017-05-23 Kewen Meng, Boyana Norris

In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of…

Hardware Architecture · Computer Science 2024-10-18 Licheng Guo, Yuze Chi, Jason Lau, Linghao Song, Xingyu Tian, Moazin Khatti, Weikang Qiao, Jie Wang, Ecenur Ustun, Zhenman Fang, Zhiru Zhang, Jason Cong

Particle accelerators are among the largest, most complex devices. To meet the challenges of increasing energy, intensity, accuracy, compactness, complexity and efficiency, increasingly sophisticated computational tools are required for…

Accelerator Physics · Physics 2023-01-13 Axel Huebl, Remi Lehe, Chad E. Mitchell, Ji Qiang, Robert D. Ryne, Ryan T. Sandberg, Jean-Luc Vay

Parameter-efficient tuning aims at updating only a small subset of parameters when adapting a pretrained model to downstream tasks. In this work, we introduce PASTA, in which we only modify the special token representations (e.g., [SEP] and…

Computation and Language · Computer Science 2023-02-15 Xiaocong Yang, James Y. Huang, Wenxuan Zhou, Muhao Chen

Developing efficient GPU kernels can be difficult because of the complexity of GPU architectures and programming models. Existing performance tools only provide coarse-grained suggestions at the kernel level, if any. In this paper, we…

Performance · Computer Science 2020-11-25 Keren Zhou, Xiaozhu Meng, Ryuichi Sai, John Mellor-Crummey

The Transformer has been an indispensable staple in deep learning. However, for real-life applications, it is very challenging to deploy efficient Transformers due to immense parameters and operations of models. To relieve this burden,…

Hardware Architecture · Computer Science 2022-11-01 Chao Fang, Aojun Zhou, Zhongfeng Wang

Need for the efficient processing of neural networks has given rise to the development of hardware accelerators. The increased adoption of specialized hardware has highlighted the need for more agile design flows for hardware-software…

Recent deep learning workloads increasingly push computational demand beyond what current memory systems can sustain, with many kernels stalling on data movement rather than computation. While modern dataflow accelerators incorporate…

Programming Languages · Computer Science 2025-09-09 Shihan Fang, Hongzheng Chen, Niansong Zhang, Jiajie Li, Han Meng, Adrian Liu, Zhiru Zhang

We present exa-AMD, an open-source, high-performance framework designed for accelerated materials discovery on modern supercomputers. exa-AMD overcomes key computational bottlenecks in large-scale structure prediction through task-based…

Materials Science · Physics 2025-12-11 Weiyi Xia, Maxim Moraru, Ying Wai Li, Cai-Zhuang Wang

AI-assisted imaging has made substantial advances in tumor diagnosis and management. However, a major barrier to developing robust oncology foundation models is the scarcity of large-scale, high-quality annotated datasets, which are limited by…

In human-written articles, we often leverage the subtleties of text style, such as bold and italics, to guide the attention of readers. These textual emphases are vital for the readers to grasp the conveyed information. When interacting…

Computation and Language · Computer Science 2024-10-02 Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao

Modern transformer-based deep neural networks present unique technical challenges for effective acceleration in real-world applications. Apart from the vast amount of linear operations needed due to their sizes, modern transformer models…

Hardware Architecture · Computer Science 2024-11-07 Jiajun Wu, Mo Song, Jingmin Zhao, Yizhao Gao, Jia Li, Hayden Kwok-Hay So

The rapidly-changing deep learning landscape presents a unique opportunity for building inference accelerators optimized for specific datacenter-scale workloads. We propose Full-stack Accelerator Search Technique (FAST), a hardware…

Machine Learning · Computer Science 2022-02-02 Dan Zhang, Safeen Huda, Ebrahim Songhori, Kartik Prabhu, Quoc Le, Anna Goldie, Azalia Mirhoseini

Multi-accelerator servers are increasingly being deployed in shared multi-tenant environments (such as in cloud data centers) in order to meet the demands of large-scale compute-intensive workloads. In addition, these accelerators are…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-08 Kiran Ranganath, Joshua D. Suetterlein, Joseph B. Manzano, Shuaiwen Leon Song, Daniel Wong

Detecting unseen anomalies in unstructured environments presents a critical challenge for industrial and agricultural applications such as material recycling and weeding. Existing perception systems frequently fail to satisfy the strict…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Melanie Neubauer, Elmar Rueckert, Christian Rauch

While existing quantum hardware resources have limited availability and reliability, there is a growing demand for exploring and verifying quantum algorithms. Efficient classical simulators for high-performance quantum simulation are…

Quantum Physics · Physics 2025-03-26 Yuncheng Lu, Shuang Liang, Hongxiang Fan, Ce Guo, Wayne Luk, Paul H. J. Kelly