Related papers: Long-Context Linear System Identification

Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models

Long-context modeling capabilities are important for large language models (LLMs) in various applications. However, directly training LLMs with long context windows is insufficient to enhance this capability since some training samples do…

Computation and Language · Computer Science 2024-05-29 Longze Chen, Ziqiang Liu, Wanwei He, Yunshui Li, Run Luo, Min Yang

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data

Long-context language models unlock advanced capabilities in reasoning, code generation, and document summarization by leveraging dependencies across extended spans of text. However, a significant portion of readily available long-text data…

Computation and Language · Computer Science 2025-10-31 Haoran Deng, Yingyu Lin, Zhenghao Lin, Xiao Liu, Yizhou Sun, Yi-An Ma, Yeyun Gong

Revisiting In-Context Learning with Long Context Language Models

In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, their context window size imposed a limit on the number of examples that can be shown, making…

Computation and Language · Computer Science 2025-05-29 Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar

Learning Linearized Models from Nonlinear Systems under Initialization Constraints with Finite Data

The identification of a linear system model from data has wide applications in control theory. The existing work that provides finite sample guarantees for linear system identification typically uses data from a single long system…

Machine Learning · Statistics 2025-05-09 Lei Xin, Baike She, Qi Dou, George Chiu, Shreyas Sundaram

Linear Dynamics: Clustering without identification

Linear dynamical systems are a fundamental and powerful parametric model class. However, identifying the parameters of a linear dynamical system is a venerable task, permitting provably efficient solutions only in special cases. This work…

Machine Learning · Computer Science 2020-03-03 Chloe Ching-Yun Hsu, Michaela Hardt, Moritz Hardt

In-Context Learning with Long-Context Models: An In-Depth Exploration

As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets. We study the behavior of in-context learning (ICL) at this extreme scale on…

Computation and Language · Computer Science 2025-03-05 Amanda Bertsch, Maor Ivgi, Emily Xiao, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig

Parallel Context Windows for Large Language Models

When applied to processing long text, Large Language Models (LLMs) are limited by their context window. Existing efforts to address this limitation involve training specialized architectures, and cannot be easily applied to off-the-shelf…

Computation and Language · Computer Science 2023-08-02 Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

When Does Context Help? Error Dynamics of Contextual Information in Large Language Models

Contextual information at inference time, such as demonstrations, retrieved knowledge, or interaction history, can substantially improve large language models (LLMs) without parameter updates, yet its theoretical role remains poorly…

Computation and Language · Computer Science 2026-02-10 Dingzirui Wang, Xuanliang Zhang, Keyan Xu, Qingfu Zhu, Wanxiang Che, Yang Deng

ACER: Automatic Language Model Context Extension via Retrieval

Long-context modeling is one of the critical capabilities of language AI for digesting and reasoning over complex information pieces. In practice, long-context capabilities are typically built into a pre-trained language model (LM) through…

Computation and Language · Computer Science 2024-10-15 Luyu Gao, Yunyi Zhang, Jamie Callan

Active Learning for Nonlinear System Identification with Guarantees

While the identification of nonlinear dynamical systems is a fundamental building block of model-based reinforcement learning and feedback control, its sample complexity is only understood for systems that either have discrete states and…

Machine Learning · Statistics 2020-06-19 Horia Mania, Michael I. Jordan, Benjamin Recht

From system models to class models: An in-context learning paradigm

Is it possible to understand the intricacies of a dynamical system not solely from its input/output pattern, but also by observing the behavior of other systems within the same class? This central question drives the study presented in this…

Systems and Control · Electrical Eng. & Systems 2023-12-21 Marco Forgione, Filippo Pura, Dario Piga

In-Context Learning for Text Classification with Many Labels

In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a…

Computation and Language · Computer Science 2023-12-07 Aristides Milios, Siva Reddy, Dzmitry Bahdanau

Visual Context Window Extension: A New Perspective for Long Video Understanding

Large Multimodal Models (LMMs) have demonstrated impressive performance in short video understanding tasks but face great challenges when applied to long video understanding. In contrast, Large Language Models (LLMs) exhibit outstanding…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Hongchen Wei, Zhenzhong Chen

Short Data, Long Context: Distilling Positional Knowledge in Transformers

Extending the context window of language models typically requires expensive long-context pre-training, posing significant challenges for both training efficiency and data collection. In this paper, we present evidence that long-context…

Computation and Language · Computer Science 2026-04-08 Patrick Huber, Ernie Chang, Chinnadhurai Sankar, Rylan Conway, Igor Fedorov, Md Rifat Arefin, Adithya Sagar

LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs

Long-context modeling has drawn more and more attention in the area of Large Language Models (LLMs). Continual training with long-context data becomes the de-facto method to equip LLMs with the ability to process long inputs. However, it…

Computation and Language · Computer Science 2025-10-14 Jianghao Chen, Junhong Wu, Yangyifan Xu, Jiajun Zhang

Enhanced Transformer architecture for in-context learning of dynamical systems

Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model…

Machine Learning · Computer Science 2024-10-07 Matteo Rufolo, Dario Piga, Gabriele Maroni, Marco Forgione

On the adaptation of in-context learners for system identification

In-context system identification aims at constructing meta-models to describe classes of systems, differently from traditional approaches that model single systems. This paradigm facilitates the leveraging of knowledge acquired from…

Machine Learning · Computer Science 2023-12-08 Dario Piga, Filippo Pura, Marco Forgione

In-Context Learning (and Unlearning) of Length Biases

Large language models have demonstrated strong capabilities to learn in-context, where exemplar input-output pairings are appended to the prompt for demonstration. However, existing work has demonstrated the ability of models to learn…

Computation and Language · Computer Science 2025-02-11 Stephanie Schoch, Yangfeng Ji

In-Context Learning Dynamics with Random Binary Sequences

Large language models (LLMs) trained on huge corpora of text datasets demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for. The precise nature of LLM capabilities is often…

Artificial Intelligence · Computer Science 2024-04-17 Eric J. Bigelow, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Tomer D. Ullman

Continuous-time system identification with neural networks: Model structures and fitting criteria

This paper presents tailor-made neural model structures and two custom fitting criteria for learning dynamical systems. The proposed framework is based on a representation of the system behavior in terms of continuous-time state-space…

Systems and Control · Electrical Eng. & Systems 2021-09-02 Marco Forgione, Dario Piga