Related papers: LONGCODEU: Benchmarking Long-Context Language Mode…

200 papers

Context lengths for models have grown rapidly, from thousands to millions of tokens in just a few years. The extreme context sizes of modern long-context models have made it difficult to construct realistic long-context benchmarks -- not…

Computation and Language · Computer Science 2025-10-23 Stefano Rando, Luca Romani, Alessio Sampieri, Luca Franco, John Yang, Yuta Kyuragi, Fabio Galasso, Tatsunori Hashimoto

Recent advances in Code Large Language Models (CodeLLMs) have primarily focused on open-ended code generation, often overlooking the crucial aspect of code understanding and reasoning. To bridge this gap, we introduce CodeMMLU, a…

Software Engineering · Computer Science 2025-04-10 Dung Nguyen Manh, Thang Phan Chau, Nam Le Hai, Thong T. Doan, Nam V. Nguyen, Quang Pham, Nghi D. Q. Bui

Long Context Understanding (LCU) is a critical area for exploration in current large language models (LLMs). However, due to the inherently lengthy nature of long-text data, existing LCU benchmarks for LLMs often result in prohibitively…

Computation and Language · Computer Science 2025-07-31 Zhongzhan Huang, Guoming Ling, Shanshan Zhong, Hefeng Wu, Liang Lin

Although large language models (LLMs) demonstrate impressive performance for many language tasks, most of them can only handle texts a few thousand tokens long, limiting their applications on longer sequence inputs, such as books, reports,…

Computation and Language · Computer Science 2024-06-21 Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li

The emergence of long-context language models with context windows extending to millions of tokens has created new opportunities for sophisticated code understanding and software development evaluation. We propose LoCoBench, a comprehensive…

Large language models (LLMs), despite their impressive performance in various language tasks, are typically limited to processing texts within context-window size. This limitation has spurred significant research efforts to enhance LLMs'…

Computation and Language · Computer Science 2024-09-09 Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang

The long-context capabilities of large language models (LLMs) have been a hot topic in recent years. To evaluate the performance of LLMs in different scenarios, various assessment benchmarks have emerged. However, as most of these…

Computation and Language · Computer Science 2025-08-14 Shawn Gavin, Tuney Zheng, Jiaheng Liu, Quehry Que, Noah Wang, Jian Yang, Chenchen Zhang, Wenhao Huang, Ge Zhang

Large language models (LLMs) are equipped with increasingly extended context windows recently, yet their long context understanding capabilities over long dependency tasks remain fundamentally limited and underexplored. This gap is…

Computation and Language · Computer Science 2025-10-28 Ziyuan He, Yuxuan Wang, Jiaqi Li, Kexin Liang, Muhan Zhang

Large Multimodal Models (LMMs) have demonstrated impressive performance in short video understanding tasks but face great challenges when applied to long video understanding. In contrast, Large Language Models (LLMs) exhibit outstanding…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Hongchen Wei, Zhenzhong Chen

Long-context language models (LCLMs) have exhibited impressive capabilities in long-context understanding tasks. Among these, long-context referencing -- a crucial task that requires LCLMs to attribute items of interest to specific parts of…

Computation and Language · Computer Science 2025-08-05 Junjie Wu, Gefei Gu, Yanan Zheng, Dit-Yan Yeung, Arman Cohan

Recent advancements in Large Language Models (LLMs) have demonstrated sophisticated capabilities, including the ability to process and comprehend extended contexts. These emergent capabilities necessitate rigorous evaluation methods to…

Code review is a cornerstone of software quality assurance, and recent advances in Large Language Models (LLMs) have shown promise in its automation. However, existing benchmarks for LLM-based code review face three major limitations. Lack…

Software Engineering · Computer Science 2026-01-01 Ruida Hu, Xinchen Wang, Xin-Cheng Wen, Zhao Zhang, Bo Jiang, Pengfei Gao, Chao Peng, Cuiyun Gao

Understanding context is key to understanding human language, an ability which Large Language Models (LLMs) have been increasingly seen to demonstrate to an impressive extent. However, though the evaluation of LLMs encompasses various…

Computation and Language · Computer Science 2024-02-02 Yilun Zhu, Joel Ruben Antony Moniz, Shruti Bhargava, Jiarui Lu, Dhivya Piraviperumal, Site Li, Yuan Zhang, Hong Yu, Bo-Hsiang Tseng

Long-context capability is considered one of the most important abilities of LLMs, as a truly long context-capable LLM enables users to effortlessly process many originally exhausting tasks -- e.g., digesting a long-form document to find…

Computation and Language · Computer Science 2025-05-27 Wang Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han

Multiple recent studies have documented large language models' (LLMs) performance on calling external tools/functions. Others focused on LLMs' abilities to handle longer context lengths. At the intersection of these areas lies another…

The rapid advancement of large vision language models (LVLMs) has led to a significant expansion of their context windows. However, an extended context window does not guarantee the effective utilization of the context, posing a critical…

Computer Vision and Pattern Recognition · Computer Science 2025-10-16 Keyan Zhou, Zecheng Tang, Lingfeng Ming, Guanghao Zhou, Qiguang Chen, Dan Qiao, Zheming Yang, Libo Qin, Minghui Qiu, Juntao Li, Min Zhang

Large language models (LLMs) have demonstrated remarkable progress in understanding long-context inputs. However, benchmarks for evaluating the long-context reasoning abilities of LLMs fall behind the pace. Existing benchmarks often focus…

Computation and Language · Computer Science 2025-11-19 Zhan Ling, Kang Liu, Kai Yan, Yifan Yang, Weijian Lin, Ting-Han Fan, Lingfeng Shen, Zhengyin Du, Jiecao Chen

Large Language Models (LLMs) have made significant strides in handling long sequences. Some models, like Gemini, could even be capable of dealing with millions of tokens. However, their performance evaluation has largely been confined to…

Computation and Language · Computer Science 2024-06-13 Tianle Li, Ge Zhang, Quy Duc Do, Xiang Yue, Wenhu Chen

Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document…

Computer Vision and Pattern Recognition · Computer Science 2024-11-13 Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

Large Language Models (LLMs) have demonstrated remarkable capabilities in code understanding and generation. However, their effectiveness on non-code Software Engineering (SE) tasks remains underexplored. We present 'Software Engineering…

Software Engineering · Computer Science 2026-02-12 Fabian C. Peña, Steffen Herbold