Related papers: Data Processing Techniques for Modern Multimodal M…

200 papers

Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities, including text, vision, audio, video, and 3D environments. Data plays a pivotal…

Artificial Intelligence · Computer Science 2024-07-19 Tianyi Bai, Hao Liang, Binwang Wan, Yanran Xu, Xi Li, Shiyu Li, Ling Yang, Bozhou Li, Yifan Wang, Bin Cui, Ping Huang, Jiulong Shan, Conghui He, Binhang Yuan, Wentao Zhang

Data plays a fundamental role in training Large Language Models (LLMs). Efficient data management, particularly in formulating a well-suited training dataset, is significant for enhancing model performance and improving training efficiency…

Computation and Language · Computer Science 2024-08-05 Zige Wang, Wanjun Zhong, Yufei Wang, Qi Zhu, Fei Mi, Baojun Wang, Lifeng Shang, Xin Jiang, Qun Liu

Multimodal Large Models (MLMs) are becoming a significant research focus, combining powerful large language models with multimodal learning to perform complex tasks across different data modalities. This review explores the latest…

Machine Learning · Computer Science 2024-07-02 Xinji Mai, Zeng Tao, Junxiong Lin, Haoran Wang, Yang Chang, Yanlan Kang, Yan Wang, Wenqiang Zhang

Multimodal learning, a rapidly evolving field in artificial intelligence, seeks to construct more versatile and robust systems by integrating and analyzing diverse types of data, including text, images, audio, and video. Inspired by the…

Artificial Intelligence · Computer Science 2024-12-24 Priyaranjan Pattnayak, Hitesh Laxmichand Patel, Bhargava Kumar, Amit Agarwal, Ishan Banerjee, Srikant Panda, Tejaswini Kumar

The exploration of multimodal language models integrates multiple data types, such as images, text, language, audio, and other heterogeneity. While the latest large language models excel in text-based tasks, they often struggle to…

Artificial Intelligence · Computer Science 2023-11-23 Jiayang Wu, Wensheng Gan, Zefeng Chen, Shicheng Wan, Philip S. Yu

Instruction tuning is a vital step of training large language models (LLMs), so how to enhance the effect of instruction tuning has received increased attention. Existing works indicate that the quality of the dataset is more crucial than…

Computation and Language · Computer Science 2025-08-27 Bolin Zhang, Jiahao Wang, Qianlong Du, Jiajun Zhang, Zhiying Tu, Dianhui Chu

Multimodal Large Language Models (MLLMs) have become increasingly important due to their state-of-the-art performance and ability to integrate multiple data modalities, such as text, images, and audio, to perform complex tasks with high…

In this paper, I outline several conceptual and methodological issues related to modeling individual and group processes embedded in clustered/hierarchical data structures. We position multilevel modeling techniques within a broader set of…

Methodology · Statistics 2022-12-29 Amira Ibrahim El-Desokey

Large language models (LLMs) rely on pretraining on massive and heterogeneous corpora, where training data composition has a decisive impact on training efficiency and downstream generalization under realistic compute and data budget…

Computation and Language · Computer Science 2026-04-21 Zhuo Chen, Yuxuan Miao, Supryadi, Deyi Xiong

While large-scale training data is fundamental for developing capable large language models (LLMs), strategically selecting high-quality data has emerged as a critical approach to enhance training efficiency and reduce computational costs.…

Machine Learning · Computer Science 2025-07-23 Yang Yu, Kai Han, Hang Zhou, Yehui Tang, Kaiqi Huang, Yunhe Wang, Dacheng Tao

Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple sources of information for improved…

Machine Learning · Computer Science 2024-02-13 Felix Krones, Umar Marikkar, Guy Parsons, Adam Szmul, Adam Mahdi

Tables, typically two-dimensional and structured to store large amounts of data, are essential in daily activities like database queries, spreadsheet manipulations, web table question answering, and image table information extraction.…

Artificial Intelligence · Computer Science 2024-11-05 Weizheng Lu, Jing Zhang, Ju Fan, Zihao Fu, Yueguo Chen, Xiaoyong Du

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang, Weijie Fu, Xiangnan He, Shijie Hao, Xindong Wu

In the realm of Business Process Management (BPM), process modeling plays a crucial role in translating complex process dynamics into comprehensible visual representations, facilitating the understanding, analysis, improvement, and…

Software Engineering · Computer Science 2024-07-01 Humam Kourani, Alessandro Berti, Daniel Schuster, Wil M. P. van der Aalst

This survey discusses how recent developments in multimodal processing facilitate conceptual grounding of language. We categorize the information flow in multimodal processing with respect to cognitive models of human information processing…

Computation and Language · Computer Science 2019-07-04 Lisa Beinborn, Teresa Botschen, Iryna Gurevych

Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to a single-generation modeling. Can we generalize diffusion models with the ability of…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Changyou Chen, Han Ding, Bunyamin Sisman, Yi Xu, Ouye Xie, Benjamin Z. Yao, Son Dinh Tran, Belinda Zeng

In an era defined by the explosive growth of data and rapid technological advancements, Multimodal Large Language Models (MLLMs) stand at the forefront of artificial intelligence (AI) systems. Designed to seamlessly integrate diverse data…

Large models, encompassing large language and diffusion models, have shown exceptional promise in approximating human-level intelligence, garnering significant interest from both academic and industrial spheres. However, the training of…

Machine Learning · Computer Science 2024-03-05 Yue Zhou, Chenlu Guo, Xu Wang, Yi Chang, Yuan Wu

Large language models are deep learning models with a large number of parameters. The models have made noticeable progress on a large number of tasks, which as a consequence allows them to serve as valuable and versatile tools for a diverse…

Software Engineering · Computer Science 2023-04-11 Maxim Vidgof, Stefan Bachhofner, Jan Mendling

With the rapid development of the large model domain, research related to fine-tuning has concurrently seen significant advancement, given that fine-tuning is a constituent part of the training process for large-scale models. Data…

Computation and Language · Computer Science 2024-07-12 Runyuan Ma, Wei Li, Fukai Shang