Related papers: Token Assorted: Mixing Latent and Text Tokens for …

Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

While explicit Chain-of-Thought (CoT) equips Large Language Models (LLMs) with strong reasoning capabilities, it requires models to verbalize every intermediate step in text tokens, constraining the model thoughts to the discrete vocabulary…

Computation and Language · Computer Science 2026-02-12 Weihao Liu, Dehai Min, Lu Cheng

Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding

Large language models (LLMs) have demonstrated remarkable capabilities in tasks requiring reasoning and multi-step problem-solving through the use of chain-of-thought (CoT) prompting. However, generating the full CoT process results in…

Computation and Language · Computer Science 2024-09-16 Tianqiao Liu, Zui Chen, Zitao Liu, Mi Tian, Weiqi Luo

Pretraining with Token-Level Adaptive Latent Chain-of-Thought

Scaling large language models by increasing parameters and training data is increasingly constrained by limited high-quality corpora and rising communication costs. This work explores an alternative axis: increasing per-token computation…

Computation and Language · Computer Science 2026-03-11 Boyi Zeng, Yiqin Hao, He Li, Shixiang Song, Feichen Song, Zitong Wang, Siyuan Huang, Yi Xu, ZiWei He, Xinbing Wang, Zhouhan Lin

Fast Thinking for Large Language Models

Reasoning-oriented Large Language Models (LLMs) often rely on generating explicit tokens step by step, and their effectiveness typically hinges on large-scale supervised fine-tuning or reinforcement learning. While Chain-of-Thought (CoT)…

Computation and Language · Computer Science 2025-09-30 Haoyu Zheng, Zhuonan Wang, Yuqian Yuan, Tianwei Lin, Wenqiao Zhang, Zheqi Lv, Juncheng Li, Siliang Tang, Yueting Zhuang, Hongyang He

Learning Modal-Mixed Chain-of-Thought Reasoning with Latent Embeddings

We study how to extend chain-of-thought (CoT) beyond language to better handle multimodal reasoning. While CoT helps LLMs and VLMs articulate intermediate steps, its text-only form often fails on vision-intensive problems where key…

Artificial Intelligence · Computer Science 2026-02-03 Yifei Shao, Kun Zhou, Ziming Xu, Mohammad Atif Quamar, Shibo Hao, Zhen Wang, Zhiting Hu, Biwei Huang

Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Large Language Models (LLMs) have shown impressive performance on complex tasks through Chain-of-Thought (CoT) reasoning. However, conventional CoT relies on explicitly verbalized intermediate steps, which constrains its broader…

Computation and Language · Computer Science 2025-11-04 Xinghao Chen, Anhao Zhao, Heming Xia, Xuan Lu, Hanlin Wang, Yanjun Chen, Wei Zhang, Jian Wang, Wenjie Li, Xiaoyu Shen

Beyond Imitation: Reinforcement Learning for Active Latent Planning

Aiming at efficient and dense chain-of-thought (CoT) reasoning, latent reasoning methods fine-tune Large Language Models (LLMs) to substitute discrete language tokens with continuous latent tokens. These methods consume fewer tokens…

Artificial Intelligence · Computer Science 2026-01-30 Zhi Zheng, Wee Sun Lee

Latent Reasoning with Supervised Thinking States

Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs due to the generation of long rationales. We propose Thinking States, a method that performs…

Computation and Language · Computer Science 2026-02-10 Ido Amos, Avi Caciularu, Mor Geva, Amir Globerson, Jonathan Herzig, Lior Shani, Idan Szpektor

Token-Budget-Aware LLM Reasoning

Reasoning is critical for large language models (LLMs) to excel in a wide range of tasks. While methods like Chain-of-Thought (CoT) reasoning enhance LLM performance by decomposing problems into intermediate steps, they also incur…

Computation and Language · Computer Science 2025-06-03 Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen

Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought

While long, explicit chains-of-thought (CoT) have proven effective on complex reasoning tasks, they are costly to generate during inference. Non-verbal reasoning methods have emerged with shorter generation lengths by leveraging continuous…

Computation and Language · Computer Science 2026-04-28 Keshav Ramji, Tahira Naseem, Ramón Fernandez Astudillo

LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification

Chain-of-thought (CoT) reasoning improves large language models (LLMs) on difficult tasks, but it also makes inference expensive because every intermediate step must be generated as a discrete token. Latent reasoning reduces visible token…

Computation and Language · Computer Science 2026-05-11 Xuan Li, Yining Wang, Yuchen Liu, Guanjun Liu, Delai Qiu, Shengping Liu, Jiaen Liang, Wei Huang, Jun Yu, Junnan Zhu

Improving Latent Reasoning in LLMs via Soft Concept Mixing

Unlike human reasoning in abstract conceptual spaces, large language models (LLMs) typically reason by generating discrete tokens, which potentially limits their expressive power. The recent work Soft Thinking has shown that LLMs' latent…

Computation and Language · Computer Science 2025-11-24 Kang Wang, Xiangyu Duan, Tianyi Du

Extending Token Computation for LLM Reasoning

Large Language Models (LLMs) are pivotal in advancing natural language processing but often struggle with complex reasoning tasks due to inefficient attention distributions. In this paper, we explore the effect of increased computed tokens…

Computation and Language · Computer Science 2024-06-25 Bingli Liao, Danilo Vasconcellos Vargas

LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens

Large reasoning models (LRMs) have led to new possibilities in terms of problem-solving, through the devising of a natural language thought process prior to answering a query. While their capabilities are well known across mathematics and…

Computation and Language · Computer Science 2025-10-15 Armel Zebaze, Rachel Bawden, Benoît Sagot

Dual-Track CoT: Budget-Aware Stepwise Guidance for Small LMs

Large Language Models (LLMs) solve many reasoning tasks via chain-of-thought (CoT) prompting, but smaller models (about 7 to 8B parameters) still struggle with multi-step reasoning under tight compute and token budgets. Existing test time…

Computation and Language · Computer Science 2026-04-29 Sagnik Chatterjee, Atharva Patil, Sricharan Ramesh

Self-Training Elicits Concise Reasoning in Large Language Models

Chain-of-thought (CoT) reasoning has enabled large language models (LLMs) to utilize additional computation through intermediate tokens to solve complex tasks. However, we posit that typical reasoning traces contain many redundant tokens,…

Computation and Language · Computer Science 2025-06-11 Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun

Guiding Language Model Reasoning with Planning Tokens

Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely…

Computation and Language · Computer Science 2024-08-08 Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni

Training Large Language Models to Reason in a Continuous Latent Space

Large language models (LLMs) are typically constrained to reason in the language space, where they express the reasoning process through a chain-of-thought (CoT) to solve complex problems. However, the language space may not always be…

Computation and Language · Computer Science 2025-11-04 Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian

Lightweight Latent Reasoning for Narrative Tasks

Large language models (LLMs) tackle complex tasks by generating long chains of thought or "reasoning traces" that act as latent variables in the generation of an output given a query. A model's ability to generate such traces can be…

Computation and Language · Computer Science 2025-12-03 Alexander Gurung, Nikolay Malkin, Mirella Lapata

Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information

Recent large language models achieve strong reasoning performance by generating detailed chain-of-thought traces, but this often leads to excessive token use and high inference latency. Existing efficiency approaches typically focus on…

Computation and Language · Computer Science 2025-12-01 Lukas Struppek, Dominik Hintersdorf, Hannah Struppek, Daniel Neider, Kristian Kersting