PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization

Tianshi Xu; Shuzhang Zhong; Wenxuan Zeng; Runsheng Wang; Meng Li

doi:10.1145/3676536.3676661

Cryptography and Security 2024-10-15 v1 Artificial Intelligence



Abstract

Private deep neural network (DNN) inference based on secure two-party computation (2PC) protects the privacy of both the server and the client. However, existing 2PC frameworks suffer from high inference latency due to enormous communication. Because the communication of both linear and non-linear DNN layers decreases with the bit widths of weights and activations, we propose PrivQuant, a framework that jointly optimizes the 2PC-based quantized inference protocols and the network quantization algorithm to enable communication-efficient private inference. PrivQuant introduces DNN architecture-aware optimizations of the 2PC protocols for communication-intensive quantized operators and performs graph-level operator fusion to reduce communication. PrivQuant also develops a communication-aware mixed-precision quantization algorithm that improves inference efficiency while maintaining high accuracy. This network/protocol co-optimization enables PrivQuant to outperform prior-art 2PC frameworks. Extensive experiments demonstrate that PrivQuant reduces communication by $11\times$, $2.5\times$, and $2.8\times$, which results in $8.7\times$, $1.8\times$, and $2.4\times$ latency reduction compared with SiRNN, COINN, and CoPriv, respectively.
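
The abstract's premise, that lower bit widths mean less data to exchange, can be illustrated with a toy sketch. It is not PrivQuant's protocol or its quantization algorithm. It uses generic uniform symmetric quantization, and the function names (`quantize`, `dequantize`, `payload_bits`) are hypothetical. The payload model is also an assumption: it simply counts one `bits`-wide integer per value, whereas the real communication cost of a 2PC protocol depends on the protocol itself.

```python
def quantize(values, bits):
    """Uniform symmetric quantization to signed `bits`-bit integers (generic sketch)."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate real values."""
    return [x * scale for x in q]


def payload_bits(num_values, bits):
    """Assumed cost model: one `bits`-wide integer transmitted per value."""
    return num_values * bits


weights = [0.5, -1.0, 0.25, 0.75]
q8, s8 = quantize(weights, 8)
q4, s4 = quantize(weights, 4)

# Under the assumed model, halving the bit width halves the payload.
print(payload_bits(len(weights), 8), payload_bits(len(weights), 4))
```

The trade-off is that fewer bits also mean a coarser grid and a larger rounding error per value, which is why a mixed-precision scheme has to balance communication against accuracy.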

Keywords

quantization; differential privacy; secure multi-party computation

Cite

@article{arxiv.2410.09531,
  title  = {PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization},
  author = {Tianshi Xu and Shuzhang Zhong and Wenxuan Zeng and Runsheng Wang and Meng Li},
  journal= {arXiv preprint arXiv:2410.09531},
  year   = {2024}
}

Comments

ICCAD 2024
