Related papers: Enhancing ASIC Technology Mapping via Parallel Sup…

200 papers

Parallel processing of information plays a critical role in accelerating computation. This includes quantum computers, where parallel processing of quantum information will play a critical role in practical quantum advantage. Here, we…

Combinational equivalence checking (CEC) remains a challenging EDA task in the formal verification of datapath circuits due to their complex arithmetic structures and the limited capability or scalability of SAT, BDD, and exact-simulation…

Logic in Computer Science · Computer Science 2025-12-09 Xindi Zhang, Furong Ye, Zhihan Chen, Shaowei Cai

To train modern large DNN models, pipeline parallelism has recently emerged, which distributes the model across GPUs and enables different devices to process different microbatches in a pipelined fashion. Earlier pipeline designs allow multiple…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-23 Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Jun Yang, Wei Lin

The Advanced Encryption Standard (AES) algorithm is a symmetric block cipher that operates on a sequence of blocks, each consisting of 128, 192 or 256 bits. Moreover, the cipher key for the AES algorithm is a sequence of 128, 192 or 256 bits.…

Cryptography and Security · Computer Science 2015-01-08 Ghada F. Elkabbany, Heba K. Aslan, Mohamed N. Rasslan

In the application of IC design for microprocessors, there are often demands for optimizing the implementation of datapath circuits, on which various arithmetic operations are performed. Combinational equivalence checking (CEC) plays an…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-28 Zhihan Chen, Xindi Zhang, Yuhang Qian, Shaowei Cai

We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates. Through a novel perspective, we revisit and clarify a subtle but important technical issue present in…

Optimization and Control · Mathematics 2017-11-09 Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

Algorithm parallelization to leverage multi-core platforms for improving the efficiency of Electronic Design Automation (EDA) tools plays a significant role in enhancing the scalability of Integrated Circuit (IC) designs. Logic optimization…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-23 Ye Cai, Zonglin Yang, Liwei Ni, Junfeng Liu, Biwei Xie, Xingquan Li

The stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice due to its…

Machine Learning · Computer Science 2024-02-13 Anuraganand Sharma

Together with the improvements in state-of-the-art accuracies of various tasks, deep learning models are getting significantly larger. However, it is extremely difficult to implement these large models because limited GPU memory makes it…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-02 Boxiang Wang, Qifan Xu, Zhengda Bian, Yang You

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde, Alexander Knapen, Stephan Wong, Sebastian Feld

The increasing size of deep learning models has made distributed training across multiple devices essential. However, current methods such as distributed data-parallel training suffer from large communication and synchronization overheads…

Machine Learning · Computer Science 2025-02-10 Cabrel Teguemne Fokam, Khaleelulla Khan Nazeer, Lukas König, David Kappel, Anand Subramoney

Parallel batched data structures are designed to process synchronized batches of operations in a parallel computing model. In this paper, we propose parallel combining, a technique that implements a concurrent data structure from a parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-14 Vitaly Aksenov, Petr Kuznetsov, Anatoly Shalyto

The Dadda algorithm is a parallel structured multiplier that is considerably faster than array multipliers, e.g., Booth, Braun, Baugh-Wooley, etc. However, it consumes more power and needs a larger number of gates for hardware…

Systems and Control · Electrical Eng. & Systems 2023-07-13 Muteen Munawar, Zain Shabbir, Muhammad Akram

Extending Bayesian optimization to batch evaluation can enable the designer to make the most of parallel computing technology. However, most current batch approaches do not scale well with the batch size. That is, their performances…

Machine Learning · Computer Science 2025-04-25 Dawei Zhan, Zhaoxi Zeng, Shuoxiao Wei, Ping Wu

Anytime search algorithms are useful for planning problems where a solution is desired under a limited time budget. Anytime algorithms first aim to provide a feasible solution quickly and then attempt to improve it until the time budget…

Artificial Intelligence · Computer Science 2023-05-09 Hanlan Yang, Shohin Mukherjee, Maxim Likhachev

Prior work on Automatically Scalable Computation (ASC) suggests that it is possible to parallelize sequential computation by building a model of whole-program execution, using that model to predict future computations, and then…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-09-21 Peter Kraft, Amos Waterland, Daniel Y Fu, Anitha Gollamudi, Shai Szulanski, Margo Seltzer

Quality-Diversity (QD) optimization algorithms are a well-known approach to generating large collections of diverse and high-quality solutions. However, derived from evolutionary computation, QD algorithms are population-based methods which…

Neural and Evolutionary Computing · Computer Science 2022-10-11 Bryan Lim, Maxime Allard, Luca Grillotti, Antoine Cully

Parallel operations in conventional computing have proven to be an essential tool for efficient and practical computation, and the story is no different for quantum computing. Indeed, there exists a large body of work that studies…

Quantum Physics · Physics 2022-02-02 Nikodem Grzesiak, Andrii Maksymov, Pradeep Niroula, Yunseong Nam

This paper proposes a parallel-in-time method for computing continuous-time maximum-a-posteriori (MAP) trajectory estimates of the states of partially observed stochastic differential equations (SDEs), with the goal of improving…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-16 Hassan Razavi, Ángel F. García-Fernández, Simo Särkkä

It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-03 Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin