PALS: Distributed Gradient Clocking on Chip

Hardware Architecture 2023-08-30 v1 Distributed, Parallel, and Cluster Computing

Abstract

Consider an arbitrary network of communicating modules on a chip, each requiring a local signal telling it when to execute a computational step. There are three common solutions to generating such a local clock signal: (i) by deriving it from a single, central clock source, (ii) by local, free-running oscillators, or (iii) by handshaking between neighboring modules. Conceptually, each of these solutions is the result of a perceived dichotomy in which (sub)systems are either clocked or asynchronous. We present a solution and its implementation that lies between these extremes. Based on a distributed gradient clock synchronization algorithm, we show a novel design providing modules with local clocks, the frequency bounds of which are almost as good as those of free-running oscillators, yet neighboring modules are guaranteed to have a phase offset substantially smaller than one clock cycle. Concretely, parameters obtained from a 15 nm ASIC simulation running at 2 GHz yield mathematical worst-case bounds of 20 ps on the phase offset for a 32 × 32 node grid network.

Cite

@article{arxiv.2308.15098,
  title  = {PALS: Distributed Gradient Clocking on Chip},
  author = {Johannes Bund and Matthias Függer and Moti Medina},
  journal= {arXiv preprint arXiv:2308.15098},
  year   = {2023}
}