Related papers: Curriculum Learning with Quality-Driven Data Selec…

200 papers

The rapid advancement of Large Language Models (LLMs) has significantly influenced various domains, leveraging their exceptional few-shot and zero-shot learning capabilities. In this work, we aim to explore and understand the LLMs-based…

Artificial Intelligence · Computer Science · 2024-10-24 · Dawei Li, Zhen Tan, Huan Liu

Recent research has highlighted the importance of data quality in scaling large language models (LLMs). However, automated data quality control faces unique challenges in collaborative settings where sharing is not allowed directly between…

Computation and Language · Computer Science · 2025-07-08 · Wanru Zhao, Hongxiang Fan, Shell Xu Hu, Wangchunshu Zhou, Bofan Chen, Nicholas D. Lane

In the field of information retrieval, Query Likelihood Models (QLMs) rank documents based on the probability of generating the query given the content of a document. Recently, advanced large language models (LLMs) have emerged as effective…

Information Retrieval · Computer Science · 2023-10-23 · Shengyao Zhuang, Bing Liu, Bevan Koopman, Guido Zuccon

Large Language Model (LLM) pre-training exhausts an ever growing compute budget, yet recent research has demonstrated that careful document selection enables comparable model quality with only a fraction of the FLOPs. Inspired by efforts…

Computation and Language · Computer Science · 2024-06-10 · Xiang Kong, Tom Gunter, Ruoming Pang

Multi-modal large language models (MLLMs), such as GPT-4o, excel at integrating text and visual data but face systematic challenges when interpreting ambiguous or incomplete visual stimuli. This study leverages statistical modeling to…

Machine Learning · Computer Science · 2024-12-09 · Ching-Yi Wang

Injecting world knowledge into pretrained multimodal large language models (MLLMs) is essential for domain-specific applications. Task-specific fine-tuning achieves this by tailoring MLLMs to high-quality in-domain data but encounters…

Multimedia · Computer Science · 2026-03-31 · Xiao An, Jiaxing Sun, Ting Hu, Wei He

The integration of Artificial Intelligence (AI), particularly Large Language Model (LLM)-based systems, in education has shown promise in enhancing teaching and learning experiences. However, the advent of Multimodal Large Language Models…

Large language models (LLMs) are very proficient text generators. We leverage this capability of LLMs to generate task-specific data via zero-shot prompting and promote cross-lingual transfer for low-resource target languages. Given…

Computation and Language · Computer Science · 2024-07-16 · Barah Fazili, Ashish Sunil Agrawal, Preethi Jyothi

The rapid advancement of Large Language Models (LLMs) has improved text understanding and generation but poses challenges in computational resources. This study proposes a curriculum learning-inspired, data-centric training strategy that…

Computation and Language · Computer Science · 2024-05-14 · Jisu Kim, Juhwan Lee

We propose an adaptation of the curriculum training framework, applicable to state-of-the-art meta learning techniques for few-shot classification. Curriculum-based training popularly attempts to mimic human learning by progressively…

Machine Learning · Computer Science · 2021-12-07 · Emmanouil Stergiadis, Priyanka Agrawal, Oliver Squire

The Multimodal Large Language Models (MLLMs) are continually pre-trained on a mixture of image-text caption data and interleaved document data, while the high-quality data filtering towards image-text interleaved document data is…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-20 · Weizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li

Multimodal Large Language Models (MLLMs) have demonstrated strong cross-modal reasoning capabilities, yet their potential for vision-only tasks remains underexplored. We investigate MLLMs as training-free similarity estimators for…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-16 · Bahey Tharwat, Giorgos Kordopatis-Zilos, Pavel Suma, Ian Reid, Giorgos Tolias

This paper presents a comprehensive exploration of leveraging Large Language Models (LLMs), specifically GPT-4, in the field of instructional design. With a focus on scaling evidence-based instructional design expertise, our research aims…

Computation and Language · Computer Science · 2023-06-27 · Gautam Yadav

Large language models (LLMs) have demonstrated competitive performance in zero-shot multilingual machine translation (MT). Some follow-up works further improved MT performance via preference optimization, but they leave a key aspect largely…

Computation and Language · Computer Science · 2026-04-20 · Alexandra Dragomir, Florin Brad, Radu Tudor Ionescu

While Multimodal Large Language Models (MLLMs) have experienced significant advancement in visual understanding and reasoning, their potential to serve as powerful, flexible, interpretable, and text-driven models for Image Quality…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-12 · Tianhe Wu, Kede Ma, Jie Liang, Yujiu Yang, Lei Zhang

The composition of pre-training datasets for large language models (LLMs) remains largely undisclosed, hindering transparency and efforts to optimize data quality, a critical driver of model performance. Current data selection methods, such…

Computation and Language · Computer Science · 2025-08-07 · Xinlin Zhuang, Jiahui Peng, Ren Ma, Yinfan Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He

Data selection in instruction tuning emerges as a pivotal process for acquiring high-quality data and training instruction-following large language models (LLMs), but it is still a new and unexplored research area for vision-language models…

Computation and Language · Computer Science · 2024-02-21 · Ruibo Chen, Yihan Wu, Lichang Chen, Guodong Liu, Qi He, Tianyi Xiong, Chenxi Liu, Junfeng Guo, Heng Huang

For specialized domains, there is often not a wealth of data with which to train large machine learning models. In such limited data / compute settings, various methods exist aiming to "do more with less", such as finetuning from a…

Machine Learning · Computer Science · 2024-10-22 · Rohan Saha, Abrar Fahim, Alona Fyshe, Alex Murphy

In recent years, with the rapid development of powerful multimodal large language models (MLLMs), explainable image quality assessment (IQA) has gradually become popular, aiming at providing quality-related descriptions and answers of…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-07 · Yunhao Li, Sijing Wu, Huiyu Duan, Yucheng Zhu, Qi Jia, Guangtao Zhai

Clinical decision support systems require models that are not only highly accurate but also equitable and sensitive to the implications of missed diagnoses. In this study, we introduce a knowledge-guided in-context learning (ICL) framework…

Machine Learning · Computer Science · 2025-07-28 · Fatemeh Nazary, Yashar Deldjoo, Tommaso Di Noia, Eugenio di Sciascio