Related papers: Computing FFTs at Target Precision Using Lower-Pre…
The Fast Fourier Transform (FFT) is one of the most widely used algorithms in high performance computing, with critical applications in spectral analysis for both signal processing and the numerical solution of partial differential…
Fast Fourier Transform (FFT) is an essential tool in scientific and engineering computation. The increasing demand for mixed-precision FFT has made it possible to utilize half-precision floating-point (FP16) arithmetic for faster speed and…
The Discrete Fourier Transform (DFT) is essential for various applications ranging from signal processing to convolution and polynomial multiplication. The groundbreaking Fast Fourier Transform (FFT) algorithm reduces DFT time complexity…
Convolutional neural networks (CNNs) have a large number of variables and hence suffer from a complexity problem for their implementation. Different methods and techniques have developed to alleviate the problem of CNN's complexity, such as…
This paper presents a comprehensive exploration of Fast Fourier Transform (FFT) and linear convolution implementations, integrating both conventional methods and novel approaches leveraging the Bit Slicing Multiplier (BSM) technique. The…
Given a time series vector, how can we efficiently compute a specified part of Fourier coefficients? Fast Fourier transform (FFT) is a widely used algorithm that computes the discrete Fourier transform in many machine learning applications.…
The FFT of three-dimensional (3D) input data is an important computational kernel of numerical simulations and is widely used in High Performance Computing (HPC) codes running on a large number of processors. Performance of many scientific…
In this paper, we use multithreaded fast Fourier transforms provided in three highly optimized packages, FFTW-2.1.5, FFTW-3.3.7, and Intel MKL FFT, to present a novel model-based parallel computing technique as a very effective and portable…
The non-equidistant fast Fourier transform (NFFT) is an extension of the famous fast Fourier transform (FFT), which can be applied to non-equidistantly sampled data in time/space or frequency domain. It is an approximative algorithm that…
As the demand for AI computation rapidly increases, more hardware is being developed to efficiently perform the low-precision matrix multiplications required by such workloads. However, these operations are generally not directly applicable…
Fourier transform methods are used to analyze functions and data sets to provide frequencies, amplitudes, and phases of underlying oscillatory components. Fast Fourier transform (FFT) methods offer speed advantages over evaluation of…
The Fast Fourier Transform (FFT), as a core computation in a wide range of scientific applications, is increasingly threatened by reliability issues. In this paper, we introduce TurboFFT, a high-performance FFT implementation equipped with…
Convolution models with long filters have demonstrated state-of-the-art reasoning abilities in many long-sequence tasks but lag behind the most optimized Transformers in wall-clock time. A major bottleneck is the Fast Fourier Transform…
Nonequispaced discrete Fourier transformation (NDFT) is widely applied in all aspects of computational science and engineering. The computational efficiency and accuracy of NDFT has always been a critical issue in hindering its…
We have evaluated perturbation coefficients of Wilson loops up to $O(g^8)$ for the four-dimensional twisted Eguchi-Kawai model using the numerical stochastic perturbation theory (NSPT) in arXiv:1902.09847. In this talk we present a progress…
This paper shows that it is possible to improve the computational cost, the memory requirements and the accuracy of Quick Fourier Transform (QFT) algorithm for power-of-two FFT (Fast Fourier Transform) just introducing a slight modification…
Recently, a new polynomial basis over binary extension fields was proposed such that the fast Fourier transform (FFT) over such fields can be computed in the complexity of order $\mathcal{O}(n\lg(n))$, where $n$ is the number of points…
Computationally efficient numerical methods for high-order approximations of convolution integrals involving weakly singular kernels find many practical applications including those in the development of fast quadrature methods for…
The Number Theoretic Transform (NTT) is a critical computational bottleneck in many lattice-based postquantum cryptographic (PQC) algorithms. By leveraging the Fast Fourier Transform (FFT) algorithm, the NTT of a polynomial of degree N - 1…
This paper addresses emulation algorithms for matrix multiplication. General Matrix-Matrix Multiplication (GEMM), a fundamental operation in the Basic Linear Algebra Subprograms (BLAS), is typically optimized for specific hardware…