Related papers: Exploring Coding Spot: Understanding Parametric Co…

Unveiling A Core Linguistic Region in Large Language Models

Brain localization, which describes the association between specific regions of the brain and their corresponding functions, is widely accepted in the field of cognitive science as an objective fact. Today's large language models (LLMs)…

Computation and Language · Computer Science 2023-10-24 Jun Zhao, Zhihao Zhang, Yide Ma, Qi Zhang, Tao Gui, Luhui Gao, Xuanjing Huang

Unveiling Linguistic Regions in Large Language Models

Large Language Models (LLMs) have demonstrated considerable cross-lingual alignment and generalization ability. Current research primarily focuses on improving LLMs' cross-lingual generalization capabilities. However, there is still a lack…

Computation and Language · Computer Science 2024-05-31 Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang

Coding Triangle: How Does Large Language Model Understand Code?

Large language models (LLMs) have achieved remarkable progress in code generation, yet their true programming competence remains underexplored. We introduce the Code Triangle framework, which systematically evaluates LLMs across three…

Computation and Language · Computer Science 2025-07-09 Taolin Zhang, Zihan Ma, Maosong Cao, Junnan Liu, Songyang Zhang, Kai Chen

Interpreting and Improving Large Language Models in Arithmetic Calculation

Large language models (LLMs) have demonstrated remarkable potential across numerous applications and have shown an emergent ability to tackle complex reasoning tasks, such as mathematical computations. However, even for the simplest…

Computation and Language · Computer Science 2024-09-04 Wei Zhang, Chaoqun Wan, Yonggang Zhang, Yiu-ming Cheung, Xinmei Tian, Xu Shen, Jieping Ye

The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units

Large language models (LLMs) exhibit remarkable capabilities on not just language tasks, but also various tasks that are not linguistic in nature, such as logical reasoning and social inference. In the human brain, neuroscience has…

Computation and Language · Computer Science 2025-02-14 Badr AlKhamissi, Greta Tuckute, Antoine Bosselut, Martin Schrimpf

Neuron-Guided Interpretation of Code LLMs: Where, Why, and How?

Code language models excel on code intelligence tasks, yet their internal interpretability is underexplored. Existing neuron interpretability techniques from NLP are suboptimal for source code due to programming languages' formal,…

Software Engineering · Computer Science 2026-03-20 Zhe Yin, Xiaodong Gu, Beijun Shen

CodeMind: Evaluating Large Language Models for Code Reasoning

Large Language Models (LLMs) have been widely used to automate programming tasks. Their capabilities have been evaluated by assessing the quality of generated code through tests or proofs. The extent to which they can reason about code is a…

Software Engineering · Computer Science 2026-04-08 Changshu Liu, Yang Chen, Reyhaneh Jabbarvand

Not All Code Is Equal: A Data-Centric Study of Code Complexity and LLM Reasoning

Large Language Models (LLMs) increasingly exhibit strong reasoning abilities, often attributed to their capacity to generate chain-of-thought-style intermediate reasoning. Recent work suggests that exposure to code can further enhance these…

Machine Learning · Computer Science 2026-01-30 Lukas Twist, Shu Yang, Hanqi Yan, Jingzhi Gong, Di Wang, Helen Yannakoudakis, Jie M. Zhang

How Programming Concepts and Neurons Are Shared in Code Language Models

Several studies have explored the mechanisms of large language models (LLMs) in coding tasks, but most have focused on programming languages (PLs) in a monolingual setting. In this paper, we investigate the relationship between multiple PLs…

Computation and Language · Computer Science 2025-06-03 Amir Hossein Kargaran, Yihong Liu, François Yvon, Hinrich Schütze

Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning

Instruction Fine-Tuning (IFT) significantly enhances the zero-shot capabilities of pretrained Large Language Models (LLMs). While coding data is known to boost LLM reasoning abilities during pretraining, its role in activating internal…

Artificial Intelligence · Computer Science 2024-12-13 Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye, Xianjun Yang, Lichang Chen, William Yang Wang, Linda Ruth Petzold

Revealing Algorithmic Deductive Circuits for Logical Reasoning

Recent studies have shown that Large Language Models (LLMs) can achieve strong reasoning performance by incorporating functional symbolic representations that abstractly describe graph traversal algorithms and step-by-step reasoning in…

Artificial Intelligence · Computer Science 2026-05-28 Phuong Minh Nguyen, Tien Huu Dang, Naoya Inoue

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

As large language models (LLMs) continue to advance, their capacity to function effectively across a diverse range of languages has shown marked improvement. Preliminary studies observe that the hidden activations of LLMs often resemble…

Computation and Language · Computer Science 2025-06-12 Yuxin Chen, Yiran Zhao, Yang Zhang, An Zhang, Kenji Kawaguchi, Shafiq Joty, Junnan Li, Tat-Seng Chua, Michael Qizhe Shieh, Wenxuan Zhang

Exploring the Translation Mechanism of Large Language Models

While large language models (LLMs) demonstrate remarkable success in multilingual translation, their internal core translation mechanisms, even at the fundamental word level, remain insufficiently understood. To address this critical gap,…

Computation and Language · Computer Science 2026-01-16 Hongbin Zhang, Kehai Chen, Xuefeng Bai, Xiucheng Li, Yang Xiang, Min Zhang

Perplexed: Understanding When Large Language Models are Confused

Large Language Models (LLMs) have become dominant in the Natural Language Processing (NLP) field, causing a huge surge in progress in a short amount of time. However, their limitations are still a mystery and have primarily been explored…

Software Engineering · Computer Science 2024-04-11 Nathan Cooper, Torsten Scholak

On Code-Induced Reasoning in LLMs

Code data has been shown to enhance the reasoning capabilities of large language models (LLMs), but it remains unclear which aspects of code are most responsible. We investigate this question with a systematic, data-centric framework. We…

Computation and Language · Computer Science 2025-10-03 Abdul Waheed, Zhen Wu, Carolyn Rosé, Daphne Ippolito

Large Language Models for Code Generation: The Practitioners Perspective

Large Language Models (LLMs) have emerged as coding assistants, capable of generating source code from natural language prompts. With the increasing adoption of LLMs in software development, academic research and industry based projects are…

Software Engineering · Computer Science 2025-01-29 Zeeshan Rasheed, Muhammad Waseem, Kai Kristian Kemell, Aakash Ahmad, Malik Abdul Sami, Jussi Rasku, Kari Systä, Pekka Abrahamsson

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…

Software Engineering · Computer Science 2024-03-07 Chongzhou Fang, Ning Miao, Shaurya Srivastav, Jialin Liu, Ruoyu Zhang, Ruijie Fang, Asmita, Ryan Tsang, Najmeh Nazari, Han Wang, Houman Homayoun

LLM-Aided Customizable Profiling of Code Data Based On Programming Language Concepts

Data profiling is critical in machine learning for generating descriptive statistics, supporting both deeper understanding and downstream tasks like data valuation and curation. This work addresses profiling specifically in the context of…

Software Engineering · Computer Science 2025-03-21 Pankaj Thorat, Adnan Qidwai, Adrija Dhar, Aishwariya Chakraborty, Anand Eswaran, Hima Patel, Praveen Jayachandran

HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages

Large Language Model (LLM) based coding tools have been tremendously successful as software development assistants, yet they are often designed for general purpose programming tasks and perform poorly for more specialized domains such as…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-20 Aman Chaturvedi, Daniel Nichols, Siddharth Singh, Abhinav Bhatele

Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms

Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to…

Neural and Evolutionary Computing · Computer Science 2025-03-24 Niki van Stein, Anna V. Kononova, Lars Kotthoff, Thomas Bäck