On Data-Driven Saak Transform

Computer Vision and Pattern Recognition 2017-10-17 v2

Abstract

Motivated by the multilayer RECOS (REctified-COrrelations on a Sphere) transform, we develop a data-driven Saak (Subspace approximation with augmented kernels) transform in this work. The Saak transform consists of three steps: 1) building the optimal linear subspace approximation with orthonormal bases using the second-order statistics of input vectors, 2) augmenting each transform kernel with its negative, and 3) applying the rectified linear unit (ReLU) to the transform output. The Karhunen-Loève transform (KLT) is used in the first step. Integrating Steps 2 and 3 is powerful because together they resolve the sign confusion problem, remove the rectification loss, and allow a straightforward implementation of the inverse Saak transform. Multiple Saak transforms are cascaded to transform images of a larger size. All Saak transform kernels are derived from the second-order statistics of input random vectors in a one-pass feedforward manner. Neither data labels nor backpropagation is used in kernel determination. Multi-stage Saak transforms offer a family of joint spatial-spectral representations between two extremes, namely the full spatial-domain representation and the full spectral-domain representation. We select Saak coefficients of higher discriminant power to form a feature vector for pattern recognition, and use the MNIST dataset classification problem as an illustrative example.
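The three steps above can be illustrated with a minimal single-stage sketch in NumPy. This is not the authors' implementation: the function names (`fit_klt_kernels`, `saak_forward`, `saak_inverse`) are hypothetical, the input is random data rather than image patches, the full orthonormal basis is kept (no truncation), and any separate handling of the mean component is omitted. It only shows why augmenting each kernel with its negative before the ReLU lets the signed projection, and hence the input, be recovered.

```python
import numpy as np

def fit_klt_kernels(X):
    """Step 1 (assumed form): orthonormal kernels from second-order statistics.

    X has shape (n_samples, dim). The kernels are the eigenvectors of the
    sample second-moment matrix, sorted by decreasing eigenvalue.
    """
    R = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)   # eigenvectors are orthonormal columns
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T             # rows are kernels

def saak_forward(X, kernels):
    """Steps 2 and 3: project onto each kernel and its negative, then apply ReLU."""
    proj = X @ kernels.T
    augmented = np.concatenate([proj, -proj], axis=1)
    return np.maximum(augmented, 0.0)

def saak_inverse(S, kernels):
    """Recover the signed projection, then invert the orthonormal transform."""
    k = kernels.shape[0]
    signed = S[:, :k] - S[:, k:]           # exactly one of each pair is nonzero
    return signed @ kernels

# Illustrative data only (not MNIST): 200 random 4-dimensional vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

kernels = fit_klt_kernels(X)
S = saak_forward(X, kernels)
X_rec = saak_inverse(S, kernels)

print(S.shape)                  # twice as many coefficients as kernels
print(np.allclose(X, X_rec))    # reconstruction check
```

Because each projection and its negation cannot both be positive, the ReLU zeroes exactly one of the pair, so subtracting the two halves restores the signed coefficient. In this sketch no labels or gradient updates are involved; the kernels depend only on the second-moment matrix of the inputs.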

Cite

@article{arxiv.1710.04176,
  title  = {On Data-Driven Saak Transform},
  author = {C.-C. Jay Kuo and Yueru Chen},
  journal= {arXiv preprint arXiv:1710.04176},
  year   = {2017}
}

Comments

30 pages, 7 figures, 4 tables
