English

Dual-Numbers Reverse AD, Efficiently

Programming Languages 2022-05-24 v1

Abstract

Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent derivative, dual-numbers /reverse-mode/ AD attempts to achieve reverse AD using a similarly simple idea: by pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but this analysis was on a custom operational semantics for which it is unclear whether it can be implemented efficiently. We take inspiration from their use of /linear factoring/ to optimise dual-numbers reverse-mode AD to an algorithm that has the correct complexity and enjoys an efficient implementation in a standard functional language with resource-linear types, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community. Furthermore, we observe a connection with classical imperative taping-based reverse AD, as well as Kmett's 'ad' Haskell library, recently analysed by Krawiec et al. We demonstrate the practical use of our technique by providing a performant implementation that differentiates most of Haskell98.

Keywords

Cite

@article{arxiv.2205.11368,
  title  = {Dual-Numbers Reverse AD, Efficiently},
  author = {Tom Smeding and Matthijs Vákár},
  journal= {arXiv preprint arXiv:2205.11368},
  year   = {2022}
}