Optimizing Split Learning Latency in TinyML-Based IoT Systems

Networking and Internet Architecture | 2026-05-07 | v2 | Artificial Intelligence | Distributed, Parallel, and Cluster Computing

Abstract

Split learning (SL) addresses the difficulty of running deep learning inference directly on low-power edge/IoT nodes by executing part of the inference process on the sensor and offloading the remainder to a companion device. Despite its promise, the inference latency of SL on constrained hardware under realistic low-power wireless protocols remains unexplored. This paper presents the first experimental latency benchmark of TinyML-based SL on ESP32-S3 boards, comparing four wireless communication protocols (UDP, TCP, ESP-NOW, and BLE). We also analyze how the choice of split point in different models (MobileNet-V2 and ResNet50) affects communication and computation overhead, with the goal of minimizing end-to-end inference latency. We propose a Beam Search-based algorithm for split point optimization that minimizes end-to-end latency, and compare it with other methods, including Greedy Search, First-Fit, Random-Fit, and Brute Force. ESP-NOW achieves the best RTT (3.6 s) and serves as the base protocol for the algorithm, which delivers near-optimal latency with a processing time of 0.1 s for 5 devices.
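
As a rough illustration of split point optimization by beam search, the sketch below assigns contiguous layer ranges to a chain of devices and compares the result with an exhaustive search. It is a minimal sketch under stated assumptions, not the algorithm evaluated in the paper: the additive cost model (per-layer compute time plus the time to transmit the activation at each cut), the function names, and all numbers in the example are hypothetical.

```python
from itertools import combinations


def segment_latency(compute, tx, device, start, end, is_last):
    """Latency of running layers [start, end) on one device (assumed model).

    compute[d][l]: time for device d to run layer l (hypothetical units, ms).
    tx[l]: time to send the output of layer l to the next device (ms).
    """
    cost = sum(compute[device][start:end])
    if not is_last:
        cost += tx[end - 1]  # forward the activation produced at the cut
    return cost


def beam_search_splits(compute, tx, beam_width=3):
    """Choose one cut per device boundary, keeping the best partial plans."""
    n_devices, n_layers = len(compute), len(compute[0])
    beam = [(0, [])]  # (latency so far, end index of each assigned segment)
    for d in range(n_devices):
        is_last = d == n_devices - 1
        candidates = []
        for cost, ends in beam:
            start = ends[-1] if ends else 0
            if is_last:
                options = [n_layers]  # last device takes all remaining layers
            else:
                # leave at least one layer for every remaining device
                max_end = n_layers - (n_devices - 1 - d)
                options = range(start + 1, max_end + 1)
            for end in options:
                step = segment_latency(compute, tx, d, start, end, is_last)
                candidates.append((cost + step, ends + [end]))
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]  # prune to the beam width
    best_cost, ends = beam[0]
    return best_cost, ends[:-1]  # drop the final end (always n_layers)


def brute_force_splits(compute, tx):
    """Reference: evaluate every placement of the n_devices - 1 cuts."""
    n_devices, n_layers = len(compute), len(compute[0])
    best = (float("inf"), [])
    for cuts in combinations(range(1, n_layers), n_devices - 1):
        bounds = [0, *cuts, n_layers]
        cost = sum(
            segment_latency(compute, tx, d, bounds[d], bounds[d + 1],
                            d == n_devices - 1)
            for d in range(n_devices)
        )
        if cost < best[0]:
            best = (cost, list(cuts))
    return best


# Hypothetical example: 3 devices, 6 layers, times in ms.
compute = [[40] * 6, [20] * 6, [5] * 6]
tx = [30, 25, 10, 8, 6, 4]
print(beam_search_splits(compute, tx, beam_width=2))
print(brute_force_splits(compute, tx))
```

With a beam width large enough to keep every candidate, this beam search degenerates into the exhaustive search; smaller widths trade optimality guarantees for less work, which is the kind of trade-off a comparison against Brute Force is meant to expose.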

Cite

@article{arxiv.2507.16594,
  title  = {Optimizing Split Learning Latency in TinyML-Based IoT Systems},
  author = {Zied Jenhani and Mounir Bensalem and Jasenka Dizdarević and Admela Jukan},
  journal= {arXiv preprint arXiv:2507.16594},
  year   = {2026}
}

Comments

This paper is uploaded here for the research community and is therefore intended for non-commercial purposes only.