Related papers: Parallel Context Windows for Large Language Models

200 papers

We identify two crucial limitations in the evaluation of the recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context lengths of language models, e.g., 2048 for LLaMA, by harnessing window-wise…

Computation and Language · Computer Science 2023-05-25 Kejuan Yang, Xiao Liu, Kaiwen Men, Aohan Zeng, Yuxiao Dong, Jie Tang

Large language model (LLM) providers boast big numbers for maximum context window sizes. To test the real world use of context windows, we 1) define a concept of maximum effective context window, 2) formulate a testing method of a context…

Computation and Language · Computer Science 2026-04-24 Norman Paulsen

Large Multimodal Models (LMMs) have demonstrated impressive performance in short video understanding tasks but face great challenges when applied to long video understanding. In contrast, Large Language Models (LLMs) exhibit outstanding…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Hongchen Wei, Zhenzhong Chen

Extending large language models (LLMs) to process longer inputs is crucial for a wide range of applications. However, the substantial computational cost of transformers and limited generalization of positional encoding restrict the size of…

Computation and Language · Computer Science 2025-06-11 Howard Yen, Tianyu Gao, Danqi Chen

Transformer-based large language models (LLMs) typically have a limited context window, resulting in significant performance degradation when processing text beyond the length of the context window. Extensive studies have been proposed to…

Computation and Language · Computer Science 2024-11-19 Zican Dong, Junyi Li, Xin Men, Wayne Xin Zhao, Bingbing Wang, Zhen Tian, Weipeng Chen, Ji-Rong Wen

In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, their context window size imposed a limit on the number of examples that can be shown, making…

Computation and Language · Computer Science 2025-05-29 Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar

Large Language Models (LLMs) have demonstrated remarkable capabilities in comprehending and analyzing lengthy sequential inputs, owing to their extensive context windows that allow processing millions of tokens in a single forward pass.…

Computation and Language · Computer Science 2024-12-23 Peyman Hosseini, Ignacio Castro, Iacopo Ghinassi, Matthew Purver

In-context learning (ICL) is critical for large language models (LLMs), but its effectiveness is constrained by finite context windows, particularly in ultra-long contexts. To overcome this, we introduce InfiniteICL, a framework that…

Computation and Language · Computer Science 2025-04-04 Bowen Cao, Deng Cai, Wai Lam

Large language models (LLMs) have advanced in large strides due to the effectiveness of the self-attention mechanism that processes and compares all tokens at once. However, this mechanism comes with a fundamental issue -- the predetermined…

Computation and Language · Computer Science 2023-10-10 Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz

Large language models (LLMs) face significant challenges in handling long-context tasks because of their limited effective context window size during pretraining, which restricts their ability to generalize over extended sequences.…

Computation and Language · Computer Science 2024-09-05 Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi

As the context limits of Large Language Models (LLMs) increase, the range of possible applications and downstream functions broadens. In many real-world tasks, decisions depend on details scattered across collections of often disparate…

Computation and Language · Computer Science 2025-04-24 Jonathan Roberts, Kai Han, Samuel Albanie

Large Language Models (LLMs) have shown exciting performance in listwise passage ranking. Due to the limited input length, existing methods often adopt the sliding window strategy. Such a strategy, though effective, is inefficient as it…

Information Retrieval · Computer Science 2024-12-20 Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou

Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long…

Computation and Language · Computer Science 2023-12-18 Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han

Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows, hindering their utility in tasks like extended conversations and document analysis. To enable using context beyond limited context windows,…

Artificial Intelligence · Computer Science 2024-02-13 Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, Joseph E. Gonzalez

The limited context window of contemporary large language models (LLMs) remains a primary bottleneck for their broader application across diverse domains. Although continual pre-training on long-context data offers a straightforward…

Computation and Language · Computer Science 2026-04-10 Wei Han, Pan Zhou, Soujanya Poria, Shuicheng Yan

To extend the context length of Transformer-based large language models (LLMs) and improve comprehension capabilities, we often face limitations due to computational resources and bounded memory storage capacity. This work introduces a…

Computation and Language · Computer Science 2024-06-11 Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang

The development of Long-Context Large Language Models (LLMs) has markedly advanced natural language processing by facilitating the processing of textual data across long documents and multiple corpora. However, Long-Context LLMs still face two…

Computation and Language · Computer Science 2024-10-10 Jingyang Deng, Zhengyang Shen, Boyang Wang, Lixin Su, Suqi Cheng, Ying Nie, Junfeng Wang, Dawei Yin, Jinwen Ma

Large Language Models (LLMs) have become increasingly capable of interacting with external tools, granting access to specialized knowledge beyond their training data - critical in dynamic, knowledge-intensive domains such as Chemistry and…

Large vision-language models (LVLMs) employ multi-modal in-context learning (MM-ICL) to adapt to new tasks by leveraging demonstration examples. While increasing the number of demonstrations boosts performance, they incur significant…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Shin'ya Yamaguchi, Daiki Chijiwa, Tamao Sakao, Taku Hasegawa

It is well known that LLMs cannot generalize well to long contexts whose lengths are larger than the training sequence length. This poses challenges when employing LLMs for processing long input sequences during inference. In this work, we…

Computation and Language · Computer Science 2024-07-12 Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu