English
Related papers

Related papers: Language Models Implement Simple Word2Vec-style Ve…

200 papers

Recent work has shown that large pretrained Language Models (LMs) can not only perform remarkably well on a range of Natural Language Processing (NLP) tasks but also start improving on reasoning tasks such as arithmetic induction, symbolic…

Computation and Language · Computer Science 2022-08-11 Jing Qian , Hong Wang , Zekun Li , Shiyang Li , Xifeng Yan

In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. However, existing literature has highlighted the…

Computation and Language · Computer Science 2024-02-14 Xinyi Wang , Wanrong Zhu , Michael Saxon , Mark Steyvers , William Yang Wang

Large language models (LLMs) have exhibited impressive competence in various tasks, but their internal mechanisms on mathematical problems are still under-explored. In this paper, we study a fundamental question: how language models encode…

Computation and Language · Computer Science 2024-11-15 Fangwei Zhu , Damai Dai , Zhifang Sui

Large Language Models (LLMs) are known for their expensive and time-consuming training. Thus, oftentimes, LLMs are fine-tuned to address a specific task, given the pretrained weights of a pre-trained LLM considered a foundation model. In…

Computation and Language · Computer Science 2025-12-05 Eshed Gal , Moshe Eliasof , Javier Turek , Uri Ascher , Eran Treister , Eldad Haber

Large language models (LLMs) were invented for natural language tasks such as translation, but they have proved that they can perform highly complex functions across domains. Additionally, they have been thought to develop new skills…

Computation and Language · Computer Science 2026-05-12 Jung H. Lee , Sujith Vijayan

Understanding the internal mechanisms of large language models (LLMs) remains a challenging and complex endeavor. Even fundamental questions, such as how fine-tuning affects model behavior, often require extensive empirical evaluation. In…

Large Language Models (LLMs) are widely deployed in real-world applications, yet little is known about their training dynamics at the token level. Evaluation typically relies on aggregated training loss, measured at the batch level, which…

Computation and Language · Computer Science 2024-10-17 Andrea Pinto , Tomer Galanti , Randall Balestriero

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in…

Though large language models (LLMs) have enabled great success across a wide variety of tasks, they still appear to fall short of one of the loftier goals of artificial intelligence research: creating an artificial system that can adapt its…

Computation and Language · Computer Science 2026-05-04 Michael A. Lepori , Tal Linzen , Ann Yuan , Katja Filippova

Large language models (LLMs) process and predict sequences containing text to answer questions, and address tasks including document summarization, providing recommendations, writing software and solving quantitative problems. We provide a…

Numerical Analysis · Mathematics 2026-02-02 Ricardo Baptista , Andrew Stuart , Son Tran

Pretrained large Language Models (LLMs) are able to answer questions that are unlikely to have been encountered during training. However a diversity of potential applications exist in the broad domain of reasoning systems and considerations…

Computation and Language · Computer Science 2024-11-27 Tim Hartill

Large language models (LLMs) have demonstrated remarkable potential across numerous applications and have shown an emergent ability to tackle complex reasoning tasks, such as mathematical computations. However, even for the simplest…

Computation and Language · Computer Science 2024-09-04 Wei Zhang , Chaoqun Wan , Yonggang Zhang , Yiu-ming Cheung , Xinmei Tian , Xu Shen , Jieping Ye

In recent years, Large Language Models (LLMs) have garnered significant attention from the research community due to their exceptional performance and generalization capabilities. In this paper, we introduce a novel method for…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-21 Egor Lakomkin , Chunyang Wu , Yassir Fathullah , Ozlem Kalinli , Michael L. Seltzer , Christian Fuegen

Large Language Models (LLMs) demonstrate exceptional reasoning abilities, enabling strong generalization across diverse tasks such as commonsense reasoning and instruction following. However, as LLMs scale, inference costs become…

Computation and Language · Computer Science 2025-02-06 Rhea Sanjay Sukthanker , Benedikt Staffler , Frank Hutter , Aaron Klein

Language models (LMs) are sentence-completion engines trained on massive corpora. LMs have emerged as a significant breakthrough in natural-language processing, providing capabilities that go far beyond sentence completion including…

Artificial Intelligence · Computer Science 2021-10-26 Robert E. Wray , III , James R. Kirk , John E. Laird

Despite their outstanding performance, large language models (LLMs) suffer notorious flaws related to their preference for simple, surface-level textual relations over full semantic complexity of the problem. This proposal investigates a…

Computation and Language · Computer Science 2022-06-20 Michal Štefánik

Low-resource languages (LRLs) face significant challenges in natural language processing (NLP) due to limited data. While current state-of-the-art large language models (LLMs) still struggle with LRLs, smaller multilingual models (mLMs)…

Computation and Language · Computer Science 2025-02-17 Daniil Gurgurov , Ivan Vykopal , Josef van Genabith , Simon Ostermann

The representation space of pretrained Language Models (LMs) encodes rich information about words and their relationships (e.g., similarity, hypernymy, polysemy) as well as abstract semantic notions (e.g., intensity). In this paper, we…

Computation and Language · Computer Science 2023-06-02 Qing Lyu , Marianna Apidianaki , Chris Callison-Burch

Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability. However, they still face practical challenges such as hallucinations, difficulty in adapting to new data…

Computation and Language · Computer Science 2024-03-06 Akari Asai , Zexuan Zhong , Danqi Chen , Pang Wei Koh , Luke Zettlemoyer , Hannaneh Hajishirzi , Wen-tau Yih

Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as…

Computation and Language · Computer Science 2017-02-27 Cem Safak Sahin , Rajmonda S. Caceres , Brandon Oselio , William M. Campbell
‹ Prev 1 2 3 10 Next ›