Related papers: A parallel butterfly algorithm
This paper is concerned with the fast computation of Fourier integral operators of the general form $\int_{\R^d} e^{2\pi\i \Phi(x,k)} f(k) d k$, where $k$ is a frequency variable, $\Phi(x,k)$ is a phase function obeying a standard…
This paper introduces the multidimensional butterfly factorization as a data-sparse representation of multidimensional kernel matrices that satisfy the complementary low-rank property. This factorization approximates such a kernel matrix of…
Butterflies are the smallest non-trivial subgraph in bipartite graphs, and therefore having efficient computations for analyzing them is crucial to improving the quality of certain applications on bipartite graphs. In this paper, we design…
We introduce a fast algorithm for computing sparse Fourier transforms supported on smooth curves or surfaces. This problem appears naturally in several important problems in wave scattering and reflection seismology. The main observation is…
This paper presents an efficient multiscale butterfly algorithm for computing Fourier integral operators (FIOs) of the form $(\mathcal{L} f)(x) = \int_{R^d}a(x,\xi) e^{2\pi \i \Phi(x,\xi)}\hat{f}(\xi) d\xi$, where $\Phi(x,\xi)$ is a phase…
This paper concerns the fast evaluation of the matvec $g=Kf$ for $K\in \mathbb{C}^{N\times N}$, which is the discretization of the oscillatory integral transform $g(x) = \int K(x,\xi) f(\xi)d\xi$ with a kernel function…
The paper introduces the butterfly factorization as a data-sparse approximation for the matrices that satisfy a complementary low-rank property. The factorization can be constructed efficiently if either fast algorithms for applying the…
We describe an algorithm for the application of the forward and inverse spherical harmonic transforms. It is based on a new method for rapidly computing the forward and inverse associated Legendre transforms by hierarchically applying the…
This paper presents an adaptive randomized algorithm for computing the butterfly factorization of an $m\times n$ matrix with $m\approx n$ provided that both the matrix and its transpose can be rapidly applied to arbitrary vectors. The…
Recent neural networks (NNs) with self-attention exhibit competitiveness across different AI domains, but the essential attention mechanism brings massive computation and memory demands. To this end, various sparsity patterns are introduced…
Kernel matrix-vector product is ubiquitous in many science and engineering applications. However, a naive method requires $O(N^2)$ operations, which becomes prohibitive for large-scale problems. We introduce a parallel method that provably…
Many matrices associated with fast transforms possess a certain low-rank property characterized by the existence of several block partitionings of the matrix, where each block is of low rank. Provided that these partitionings are known,…
This paper focuses on the fast evaluation of the matvec $g=Kf$ for $K\in \mathbb{C}^{N\times N}$, which is the discretization of a multidimensional oscillatory integral transform $g(x) = \int K(x,\xi) f(\xi)d\xi$ with a kernel function…
We accelerate the computation of spherical harmonic transforms, using what is known as the butterfly scheme. This provides a convenient alternative to the approach taken in the second paper from this series on "Fast algorithms for spherical…
Bootstrap particle filter (BPF) is the cornerstone of many popular algorithms used for solving inference problems involving time series that are observed through noisy measurements in a non-linear and non-Gaussian context. The long term…
We introduce an end-to-end deep learning architecture called the wide-band butterfly network (WideBNet) for approximating the inverse scattering map from wide-band scattering data. This architecture incorporates tools from computational…
Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense…
As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies-data, model, sequence, and…
Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to…
Shuffling is the process of rearranging a sequence of elements into a random order such that any permutation occurs with equal probability. It is an important building block in a plethora of techniques used in virtually all scientific…