Related papers: Efficient Distributed MLLM Training with Cornstarc…

200 papers

The rapid scaling of Large Language Models (LLMs) has pushed training workloads far beyond the limits of single-node analysis, demanding a deeper understanding of how these models behave across large-scale, multi-GPU systems. In this paper,…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-09-22 · Seokjin Go, Joongun Park, Spandan More, Hanjiang Wu, Irene Wang, Aaron Jezghani, Tushar Krishna, Divya Mahajan

In an era defined by the explosive growth of data and rapid technological advancements, Multimodal Large Language Models (MLLMs) stand at the forefront of artificial intelligence (AI) systems. Designed to seamlessly integrate diverse data…

Scaling long-context capabilities is crucial for Multimodal Large Language Models (MLLMs). However, real-world multimodal datasets are extremely heterogeneous. Existing training frameworks predominantly rely on static parallelism…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-02-26 · Yifan Niu, Han Xiao, Dongyi Liu, Wei Zhou, Jia Li

Multi-modal Large Language Model (MLLM) refers to a model expanded from a Large Language Model (LLM) that possesses the capability to handle and infer multi-modal data. Current MLLMs typically begin by using LLMs to decompose tasks into…

Computation and Language · Computer Science · 2023-09-01 · Yongqiang Zhao, Zhenyu Li, Feng Zhang, Xinhai Xu, Donghong Liu

Multimodal Large Language Models (MLLMs) have achieved remarkable advances by integrating text, image, and audio understanding within a unified architecture. However, existing distributed training frameworks remain fundamentally data-blind:…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-05-20 · Hyeonjun An, Sihyun Kim, Chaerim Lim, Hyunjoon Kim, Rathijit Sen, Sangmin Jung, Hyeonsoo Lee, Dongwook Kim, Takki Yu, Jinkyu Jeong, Youngsok Kim, Kwanghyun Park

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning. However, the extensive model size and high training and…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-23 · Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. While existing surveys provide descriptive…

Machine Learning · Computer Science · 2026-02-11 · Hossam Amer, Rezaul Karim, Ali Pourranjbar, Weiwei Zhang, Walid Ahmed, Boxing Chen

The rapid development of large language models (LLMs) has been witnessed in recent years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from text to a broader spectrum of domains, attracting widespread attention…

Artificial Intelligence · Computer Science · 2024-08-06 · Zhen Qin, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng

Multimodal large language models (MLLMs) have extended the success of large language models (LLMs) to multiple data types, such as image, text and audio, achieving significant performance in various domains, including multimodal…

Computation and Language · Computer Science · 2025-06-03 · Weiqi Feng, Yangrui Chen, Shaoyu Wang, Yanghua Peng, Haibin Lin, Minlan Yu

Foundation models update slowly due to resource-intensive training, whereas domain-specific models evolve rapidly between releases. Model merging seeks to combine multiple expert models into a single, more capable model, reducing storage…

Artificial Intelligence · Computer Science · 2026-03-04 · Yongxian Wei, Runxi Cheng, Weike Jin, Enneng Yang, Li Shen, Lu Hou, Sinan Du, Chun Yuan, Xiaochun Cao, Dacheng Tao

Multimodal large language models (MLLMs) extend LLMs to handle images, videos, and audio by incorporating feature extractors and projection modules. However, these additional components -- combined with complex inference pipelines and…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-11-12 · Zedong Liu, Shenggan Cheng, Guangming Tan, Yang You, Dingwen Tao

Multimodal large language models (MLLMs), such as GPT-4o, are garnering significant attention. During the exploration of MLLM training, we identified Modality Composition Incoherence, a phenomenon in which the proportion of a certain modality…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-03-13 · Yijie Zheng, Bangjun Xiao, Lei Shi, Xiaoyang Li, Faming Wu, Tianyu Li, Xuefeng Xiao, Yang Zhang, Yuxuan Wang, Shouda Liu

In recent years, the training requirements of many state-of-the-art Deep Learning (DL) models have scaled beyond the compute and memory capabilities of a single processor, and necessitated distribution among processors. Training such…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-03-16 · Quentin Anthony, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar Panda

As the foundational component of versatile AI applications, training a multimodal large language model (MLLM) relies on multimodal datasets with dynamic modality mixture proportions and sample length distributions. However, existing MLLM…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-05-12 · Chunyu Xue, Yangrui Chen, Jianyu Jiang, Ningxin Zheng, Junda Feng, Jingji Chen, Shixiong Zhao, Shen Yan, Yi Lin, Lei Shi, Zanbo Wang, Lishu Luo, Faming Wu, Haibin Lin, Xin Liu, Yanghua Peng, Quan Chen

Large Language Models (LLMs) represent a class of deep learning models adept at understanding natural language and generating coherent responses to various prompts or queries. These models far exceed the complexity of conventional neural…

Machine Learning · Computer Science · 2024-12-05 · Minghao Shao, Abdul Basit, Ramesh Karri, Muhammad Shafique

With the rapid adoption of large language models (LLMs) in recommendation systems, the computational and communication bottlenecks caused by their massive parameter sizes and large data volumes have become increasingly prominent. This paper…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-06-25 · Haowei Yang, Yu Tian, Zhongheng Yang, Zhao Wang, Chengrui Zhou, Dannier Li

Extending the context length (i.e., the maximum supported sequence length) of LLMs is of paramount significance. To facilitate long context training of LLMs, sequence parallelism has emerged as an essential technique, which scatters each…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-02-12 · Yujie Wang, Shiju Wang, Shenhan Zhu, Fangcheng Fu, Xinyi Liu, Xuefeng Xiao, Huixia Li, Jiashi Li, Faming Wu, Bin Cui

Multimodal Large Language Models (MLLMs) have made significant advancements, demonstrating powerful capabilities in processing and understanding multimodal data. Fine-tuning MLLMs with Federated Learning (FL) allows for expanding the…

Machine Learning · Computer Science · 2025-03-11 · Binqian Xu, Xiangbo Shu, Haiyang Mei, Guosen Xie, Basura Fernando, Jinhui Tang

Multimodal information retrieval (MMIR) has gained attention for its flexibility in handling text, images, or mixed queries and candidates. Recent breakthroughs in multimodal large language models (MLLMs) boost MMIR performance by…

Information Retrieval · Computer Science · 2026-02-27 · Dawei Su, Dongsheng Wang

Based on the foundation of Large Language Models (LLMs), Multilingual LLMs (MLLMs) have been developed to address the challenges faced in multilingual natural language processing, hoping to achieve knowledge transfer from high-resource…

Computation and Language · Computer Science · 2024-12-10 · Yuemei Xu, Ling Hu, Jiayi Zhao, Zihan Qiu, Kexin XU, Yuqi Ye, Hanwen Gu