Related papers: Simple FPGA routing graph compression
As compared to a large spectrum of performance optimizations, relatively little effort has been dedicated to optimize other aspects of embedded applications such as memory space requirements, power, real-time predictability, and…
As state of the art neural networks (NNs) continue to grow in size, their resource-efficient implementation becomes ever more important. In this paper, we introduce a compression scheme that reduces the number of computations required for…
Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data-sets. A large body of work focuses on reducing of off-chip transfers, but few authors try to improve…
This study proposes a new router architecture to improve the performance of dynamic allocation of virtual channels. The proposed router is designed to reduce the hardware complexity and to improve power and area consumption, simultaneously.…
In the face of escalating complexity and size of contemporary FPGAs and circuits, routing emerges as a pivotal and time-intensive phase in FPGA compilation flows. In response to this challenge, we present an open-source parallel routing…
Several studies exhibit that the traffic load of the routers only has a small influence on their energy consumption. Hence, the power consumption in networks is strongly related to the number of active network elements, such as interfaces,…
The use of reconfigurable computing, and FPGAs in particular, to accelerate computational kernels has the potential to be of great benefit to scientific codes and the HPC community in general. However, whilst recent advanced in FPGA tooling…
Neural networks achieve state-of-the-art performance in image classification, speech recognition, scientific analysis and many more application areas. Due to the high computational complexity and memory footprint of neural networks, various…
In this letter, we propose a new routing strategy to improve the transportation efficiency on complex networks. Instead of using the routing strategy for shortest path, we give a generalized routing algorithm to find the so-called {\it…
Several studies have identified a significant amount of redundancy in the network traffic. For example, it is demonstrated that there is a great amount of redundancy within the content of a server over time. This redundancy can be leveraged…
Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…
In the classical RAM, we have the following useful property. If we have an algorithm that uses $M$ memory cells throughout its execution, and in addition is sparse, in the sense that, at any point in time, only $m$ out of $M$ cells will be…
Deep neural networks are an extremely successful and widely used technique for various pattern recognition and machine learning tasks. Due to power and resource constraints, these computationally intensive networks are difficult to…
The use of FPGAs for efficient graph processing has attracted significant interest. Recent memory subsystem upgrades including the introduction of HBM in FPGAs promise to further alleviate memory bottlenecks. However, modern multi-channel…
Network compression reduces the computational complexity and memory consumption of deep neural networks by reducing the number of parameters. In SVD-based network compression, the right rank needs to be decided for every layer of the…
We investigate how the underlying graph of a network supports a flow between a source node and a destination node and propose to compute the expected number of nodes and links that contribute to transferring items in random graphs. Since…
This paper reduces the cost of DNNs training by decreasing the amount of data movement across heterogeneous architectures composed of several GPUs and multicore CPU devices. In particular, this paper proposes an algorithm to dynamically…
Despite technological advancements, the significance of interdisciplinary subjects like complex networks has grown. Exploring communication within these networks is crucial, with traffic becoming a key concern due to the expanding…
We study online graph queries that retrieve nearby nodes of a query node from a large network. To answer such queries with high throughput and low latency, we partition the graph and process the data in parallel across a cluster of servers.…
Recent trends in business and technology (e.g., machine learning, social network analysis) benefit from storing and processing growing amounts of graph-structured data in databases and data science platforms. FPGAs as accelerators for graph…