Related papers: Understanding Language Model Circuits through Know…

200 papers

The remarkable capabilities of modern large language models are rooted in their vast repositories of knowledge encoded within their parameters, enabling them to perceive the world and engage in reasoning. The inner workings of how these…

Computation and Language · Computer Science · 2025-01-06 · Yunzhi Yao, Ningyu Zhang, Zekun Xi, Mengru Wang, Ziwen Xu, Shumin Deng, Huajun Chen

A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions through subnetworks that can be composed to perform more complex tasks. Recent advances in…

Machine Learning · Computer Science · 2025-06-24 · Philipp Mondorf, Sondre Wold, Barbara Plank

Transformer-based language models have achieved significant success; however, their internal mechanisms remain largely opaque due to the complexity of non-linear interactions and high-dimensional operations. While previous studies have…

Artificial Intelligence · Computer Science · 2025-02-17 · Lin Zhang, Lijie Hu, Di Wang

While transformer models exhibit strong capabilities on linguistic tasks, their complex architectures make them difficult to interpret. Recent work has aimed to reverse engineer transformer models into human-readable representations called…

Computation and Language · Computer Science · 2024-10-08 · Michael Lan, Philip Torr, Fazl Barez

Despite exceptional capabilities in knowledge-intensive tasks, Large Language Models (LLMs) face a critical gap in understanding how they internalize new knowledge, particularly how to structurally embed acquired knowledge in their neural…

Machine Learning · Computer Science · 2025-06-03 · Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, Huajun Chen

Which components in transformer language models are responsible for discourse understanding? We hypothesize that sparse computational graphs, termed as discursive circuits, control how models process discourse relations. Unlike simpler…

Computation and Language · Computer Science · 2025-10-14 · Yisong Miao, Min-Yen Kan

Model editing, the process of efficiently modifying factual knowledge in pre-trained language models, is critical for maintaining their accuracy and relevance. However, existing editing methods often introduce unintended side effects,…

Computation and Language · Computer Science · 2025-09-23 · Tsung-Hsuan Pan, Chung-Chi Chen, Hen-Hsen Huang, Hsin-Hsi Chen

Model editing has become an increasingly popular alternative for efficiently updating knowledge within language models. Current methods mainly focus on reliability, generalization, and locality, with many methods excelling across these…

Artificial Intelligence · Computer Science · 2024-10-25 · Qi Li, Xiang Liu, Zhenheng Tang, Peijie Dong, Zeyu Li, Xinglin Pan, Xiaowen Chu

Deploying Large Language Models (LLMs) in real-world dynamic environments raises the challenge of updating their pre-trained knowledge. While existing knowledge editing methods can reliably patch isolated facts, they frequently suffer from…

Computation and Language · Computer Science · 2026-04-08 · Tianyi Zhao, Yinhan He, Wendy Zheng, Chen Chen

Recent studies on reasoning in language models (LMs) have sparked a debate on whether they can learn systematic inferential principles or merely exploit superficial patterns in the training data. To understand and uncover the mechanisms…

Computation and Language · Computer Science · 2025-06-24 · Geonhee Kim, Marco Valentino, André Freitas

The field of natural language understanding has experienced exponential progress in the last few years, with impressive results in several tasks. This success has motivated researchers to study the underlying knowledge encoded by these…

Artificial Intelligence · Computer Science · 2021-06-03 · Carlos Aspillaga, Marcelo Mendoza, Alvaro Soto

Large language models (LLMs) can effectively handle outdated information through knowledge editing. However, current approaches face two key limitations: (I) Poor generalization: Most approaches rigidly inject new knowledge without ensuring…

Computation and Language · Computer Science · 2026-04-08 · Jinhu Fu, Yan Bai, Longzhu He, Yihang Lou, Yanxiao Zhao, Li Sun, Sen Su

Locating and editing knowledge in large language models (LLMs) is crucial for enhancing their accuracy, safety, and inference rationale. We introduce "concept editing", an innovative variation of knowledge editing that uncovers…

Computation and Language · Computer Science · 2024-08-23 · Nura Aljaafari, Danilo S. Carvalho, André Freitas

Knowledge editing, which aims to update the knowledge encoded in language models, can be deceptive. Despite the fact that many existing knowledge editing algorithms achieve near-perfect performance on conventional metrics, the models edited…

Computation and Language · Computer Science · 2025-05-20 · Jiakuan Xie, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao

Language models based on the Transformer architecture achieve excellent results in many language-related tasks, such as text classification or sentiment analysis. However, despite the architecture of these models being well-defined, little…

Computation and Language · Computer Science · 2025-04-14 · Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero

Model editing has been gaining increasing attention over the past few years. For Knowledge Editing in particular, more challenging evaluation datasets have recently been released. These datasets use different methodologies to score the…

Computation and Language · Computer Science · 2025-07-09 · Sebastian Pohl, Max Ploner, Alan Akbik

Knowledge Editing (KE) enables the modification of outdated or incorrect information in large language models (LLMs). While existing KE methods can update isolated facts, they often fail to generalize these updates to multi-hop reasoning…

Computation and Language · Computer Science · 2025-11-21 · Yunzhi Yao, Jizhan Fang, Jia-Chen Gu, Ningyu Zhang, Shumin Deng, Huajun Chen, Nanyun Peng

Large language models (LLMs) acquire vast knowledge from large text corpora, but this information can become outdated or inaccurate. Since retraining is computationally expensive, knowledge editing offers an efficient alternative --…

Artificial Intelligence · Computer Science · 2025-08-13 · Amir Mohammad Salehoof, Ali Ramezani, Yadollah Yaghoobzadeh, Majid Nili Ahmadabadi

Neural network models have achieved high performance on a wide variety of complex tasks, but the algorithms that they implement are notoriously difficult to interpret. It is often necessary to hypothesize intermediate variables involved in…

Computation and Language · Computer Science · 2025-02-13 · Michael A. Lepori, Thomas Serre, Ellie Pavlick

Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training,…
