English
Related papers

Related papers: ImaGen: A General Framework for Generating Memory-…

200 papers

Image processing and machine learning applications benefit tremendously from hardware acceleration, but existing compilers target either FPGAs, which sacrifice power and performance for flexible hardware, or ASICs, which rapidly become…

Processing-in-memory (PIM) has shown extraordinary potential in accelerating neural networks. To evaluate the performance of PIM accelerators, we present an ISA-based simulation framework including a dedicated ISA targeting neural networks…

Hardware Architecture · Computer Science 2024-02-29 Xinyu Wang , Xiaotian Sun , Yinhe Han , Xiaoming Chen

Due to the emergence of embedded applications in image and video processing, communication and cryptography, improvement of pictorial information for better human perception like deblurring, denoising in several fields such as satellite…

Hardware Architecture · Computer Science 2014-04-16 Chandrajit Pal , Avik Kotal , Asit Samanta , Amlan Chakrabarti , Ranjan Ghosh

Recent advances in reprogrammable hardware (e.g., FPGAs) and memory technology (e.g., DDR4, HBM) promise to solve performance problems inherent to graph processing like irregular memory access patterns on traditional hardware (e.g., CPU).…

Hardware Architecture · Computer Science 2021-04-19 Jonas Dann , Daniel Ritter , Holger Fröning

While FPGAs have been used extensively as hardware accelerators in industrial computation, no theoretical model of computation has been devised for the study of FPGA-based accelerators. In this paper, we present a theoretical model of…

Data Structures and Algorithms · Computer Science 2018-11-19 Martin Hora , Václav Končický , Jakub Tětek

Recent trends in business and technology (e.g., machine learning, social network analysis) benefit from storing and processing growing amounts of graph-structured data in databases and data science platforms. FPGAs as accelerators for graph…

Databases · Computer Science 2021-02-09 Jonas Dann , Daniel Ritter , Holger Fröning

Deep neural networks (DNN) have demonstrated effectiveness for various applications such as image processing, video segmentation, and speech recognition. Running state-of-the-art DNNs on current systems mostly relies on either…

Neural and Evolutionary Computing · Computer Science 2019-04-15 Mohsen Imani , Mohammad Samragh , Yeseong Kim , Saransh Gupta , Farinaz Koushanfar , Tajana Rosing

Recent researches on neural network have shown significant advantage in machine learning over traditional algorithms based on handcrafted features and models. Neural network is now widely adopted in regions like image, speech and video…

Hardware Architecture · Computer Science 2018-12-07 Kaiyuan Guo , Shulin Zeng , Jincheng Yu , Yu Wang , Huazhong Yang

Medical image processing is often limited by the computational cost of the involved algorithms. Whereas dedicated computing devices (GPUs in particular) exist and do provide significant efficiency boosts, they have an extra cost of use in…

The software configurable processor finds best use in the embedded systems. These processors have onchip logic like FPGA (Field Programmable Gate Array) and thus can be configured to implement custom hardware functionality. The digital…

Hardware Architecture · Computer Science 2025-05-13 Ganesh Prabhu , Steevan Rodrigues , Niranjan Chiplunkar , Niranjan U. C

Specialized image processing accelerators are necessary to deliver the performance and energy efficiency required by important applications in computer vision, computational photography, and augmented reality. But creating,…

Software Engineering · Computer Science 2016-11-01 Jing Pu , Steven Bell , Xuan Yang , Jeff Setter , Stephen Richardson , Jonathan Ragan-Kelley , Mark Horowitz

Deep neural networks (DNNs) are of critical use in different domains. To accelerate DNN computation, tensor compilers are proposed to generate efficient code on different domain-specific accelerators. Existing tensor compilers mainly focus…

Machine Learning · Computer Science 2023-07-12 Zixuan Ma , Haojie Wang , Jingze Xing , Liyan Zheng , Chen Zhang , Huanqi Cao , Kezhao Huang , Shizhi Tang , Penghan Wang , Jidong Zhai

We propose a generic algorithmic building block to accelerate training of machine learning models on heterogeneous compute systems. Our scheme allows to efficiently employ compute accelerators such as GPUs and FPGAs for the training of…

Machine Learning · Computer Science 2017-11-08 Celestine Dünner , Thomas Parnell , Martin Jaggi

In the domain of image processing, often real-time constraints are required. In particular, in safety-critical applications, such as X-ray computed tomography in medical imaging or advanced driver assistance systems in the automotive…

Programming Languages · Computer Science 2015-02-27 Oliver Reiche , Konrad Häublein , Marc Reichenbach , Frank Hannig , Jürgen Teich , Dietmar Fey

Generative adversarial networks (GANs) have promoted remarkable advances in single-image super-resolution (SR) by recovering photo-realistic images. However, high memory consumption of GAN-based SR (usually generators) causes performance…

Hardware Architecture · Computer Science 2021-07-28 Wenlong Cheng , Mingbo Zhao , Zhiling Ye , Shuhang Gu

Modern computer systems typically conbine multicore CPUs with accelerators like GPUs for inproved performance and energy efficiency. However, these sys- tems suffer from poor performance portability, code tuned for one device must be…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-05-23 Thomas L. Falch , Anne C. Elster

Today, there is a trend to incorporate more intelligence (e.g., vision capabilities) into a wide range of devices, which makes high performance a necessity for computing systems. Furthermore, for embedded systems, low power consumption…

Other Computer Science · Computer Science 2014-08-25 Zhilei Chai , Zhibin Wang , Wenmin Yang , Shuai Ding , Yuanpu Zhang

This work deals with the optimization of computer programs targeting Graphics Processing Units (GPUs). The goal is to lift, from programmers to optimizing compilers, the heavy burden of determining program details that are dependent on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Xiaohui Chen , Marc Moreno-Maza , Jeeva Paudel , Ning Xie

Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than todays systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-18 Saeed Taheri , Apan Qasem , Martin Burtscher

Computational memory (CM) is a promising approach for accelerating inference on neural networks (NN) by using enhanced memories that, in addition to storing data, allow computations on them. One of the main challenges of this approach is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-27 Kornilios Kourtis , Martino Dazzi , Nikolas Ioannou , Tobias Grosser , Abu Sebastian , Evangelos Eleftheriou
‹ Prev 1 2 3 10 Next ›