Efficient Distributed MLLM Training with Cornstarch

Insu Jang; Runyu Lu; Nikhil Bansal; Ang Chen; Mosharaf Chowdhury

Distributed, Parallel, and Cluster Computing 2026-05-26 v4

Abstract

Multimodal large language models (MLLMs) extend the capabilities of large language models (LLMs) by combining heterogeneous model architectures to handle diverse modalities such as images and audio. However, this inherent heterogeneity in MLLM model structure and data types makes makeshift extensions to existing LLM training frameworks unsuitable for efficient MLLM training. While a few works have attempted to address the heterogeneity in MLLM training, their approaches consider the characteristics of MLLMs only superficially. In this paper, we present Cornstarch, an efficient distributed MLLM training framework that accounts for the unique characteristics of MLLMs in both model and data parallelization. Cornstarch introduces frozen-aware pipeline parallelism and token workload-balanced context parallelism to improve MLLM training throughput. Our extensive evaluation shows that Cornstarch outperforms state-of-the-art solutions by $2.26\times$ on average in terms of MLLM training throughput. Cornstarch is an open-source project available at https://github.com/cornstarch-org/Cornstarch.

Keywords

large language model training, multimodal learning, instruction tuning

Cite

@article{arxiv.2503.11367,
  title  = {Efficient Distributed MLLM Training with Cornstarch},
  author = {Insu Jang and Runyu Lu and Nikhil Bansal and Ang Chen and Mosharaf Chowdhury},
  journal= {arXiv preprint arXiv:2503.11367},
  year   = {2026}
}

Comments

ICML'26
