Related papers: Decoding-based Regression

Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

Decoding-based regression, which reformulates regression as a sequence generation task, has emerged as a promising paradigm of applying large language models for numerical prediction. However, its progress is hindered by the misalignment…

Machine Learning · Computer Science 2025-12-09 Ming Chen , Sheng Tang , Rong-Xi Tan , Ziniu Li , Jiacheng Chen , Ke Xue , Chao Qian

Learning Generic Sentence Representations Using Convolutional Neural Networks

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan , Yunchen Pu , Ricardo Henao , Chunyuan Li , Xiaodong He , Lawrence Carin

Know Your Limits: Entropy Estimation Modeling for Compression and Generalization

Language prediction is constrained by informational entropy intrinsic to language, such that there exists a limit to how accurate any language model can become and equivalently a lower bound to language compression. The most efficient…

Computation and Language · Computer Science 2025-11-14 Benjamin L. Badger , Matthew Neligeorge

Interpreting Encoding and Decoding Models

Encoding and decoding models are widely used in systems, cognitive, and computational neuroscience to make sense of brain-activity data. However, the interpretation of their results requires care. Decoding models can help reveal whether…

Neurons and Cognition · Quantitative Biology 2019-04-29 Nikolaus Kriegeskorte , Pamela K. Douglas

Learning Disentangled Representations for Natural Language Definitions

Disentangling the encodings of neural models is a fundamental aspect for improving interpretability, semantic control and downstream task performance in Natural Language Processing. Currently, most disentanglement methods are unsupervised…

Computation and Language · Computer Science 2023-02-17 Danilo S. Carvalho , Giangiacomo Mercatali , Yingji Zhang , Andre Freitas

Building Bridges between Regression, Clustering, and Classification

Regression, the task of predicting a continuous scalar target y based on some features x is one of the most fundamental tasks in machine learning and statistics. It has been observed and theoretically analyzed that the classical approach,…

Machine Learning · Statistics 2025-02-19 Lawrence Stewart , Francis Bach , Quentin Berthet

Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition

We propose a new model for multi-token prediction in transformers, aiming to enhance sampling efficiency without compromising accuracy. Motivated by recent work that predicts the probabilities of subsequent tokens using multiple heads, we…

Machine Learning · Computer Science 2025-02-11 Artem Basharin , Andrei Chertkov , Ivan Oseledets

Object Recognition as Next Token Prediction

We present an approach to pose object recognition as next token prediction. The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels. To ground this prediction process in…

Computer Vision and Pattern Recognition · Computer Science 2024-04-02 Kaiyu Yue , Bor-Chun Chen , Jonas Geiping , Hengduo Li , Tom Goldstein , Ser-Nam Lim

On Tree-Based Neural Sentence Modeling

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most of existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi , Hao Zhou , Jiaze Chen , Lei Li

Speculative Decoding and Beyond: An In-Depth Survey of Techniques

Sequential dependencies present a fundamental bottleneck in deploying large-scale autoregressive models, particularly for real-time applications. While traditional optimization approaches like pruning and quantization often compromise model…

Computation and Language · Computer Science 2025-10-09 Yunhai Hu , Zining Liu , Zhenyuan Dong , Tianfan Peng , Bradley McDanel , Sai Qian Zhang

Language Model Memory and Memory Models for Language

The ability of machine learning models to store input information in hidden layer vector embeddings, analogous to the concept of `memory', is widely employed but not well characterized. We find that language model embeddings typically…

Computation and Language · Computer Science 2026-05-20 Benjamin L. Badger

A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks

Autoregressive language models, pretrained using large text corpora to do well on next word prediction, have been successful at solving many downstream tasks, even with zero-shot usage. However, there is little theoretical understanding of…

Computation and Language · Computer Science 2021-04-15 Nikunj Saunshi , Sadhika Malladi , Sanjeev Arora

Decoding Generic Visual Representations From Human Brain Activity using Machine Learning

Among the most impressive recent applications of neural decoding is the visual representation decoding, where the category of an object that a subject either sees or imagines is inferred by observing his/her brain activity. Even though…

Neural and Evolutionary Computing · Computer Science 2018-11-06 Angeliki Papadimitriou , Nikolaos Passalis , Anastasios Tefas

Arithmetic with Language Models: from Memorization to Computation

A better understanding of the emergent computation and problem-solving capabilities of recent large language models is of paramount importance to further improve them and broaden their applicability. This work investigates how a language…

Artificial Intelligence · Computer Science 2024-08-05 Davide Maltoni , Matteo Ferrara

On Next-Token Prediction in LLMs: How End Goals Determine the Consistency of Decoding Algorithms

Probabilistic next-token prediction trained using cross-entropy loss is the basis of most large language models. Given a sequence of previous values, next-token prediction assigns a probability to each possible next value in the vocabulary.…

Machine Learning · Statistics 2025-05-19 Jacob Trauger , Ambuj Tewari

Deconvolutional Latent-Variable Model for Text Sequence Matching

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we…

Computation and Language · Computer Science 2017-11-23 Dinghan Shen , Yizhe Zhang , Ricardo Henao , Qinliang Su , Lawrence Carin

Local Translation Prediction with Global Sentence Representation

Statistical machine translation models have made great progress in improving the translation quality. However, the existing models predict the target translation with only the source- and target-side local context information. In practice,…

Computation and Language · Computer Science 2015-03-02 Jiajun Zhang

A Thorough Examination of Decoding Methods in the Era of LLMs

Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers. Prior research on decoding methods, primarily focusing on task-specific models, may not extend to the current…

Computation and Language · Computer Science 2024-10-10 Chufan Shi , Haoran Yang , Deng Cai , Zhisong Zhang , Yifan Wang , Yujiu Yang , Wai Lam

The Devil is in the Decoder: Classification, Regression and GANs

Many machine vision applications, such as semantic segmentation and depth prediction, require predictions for every pixel of the input image. Models for such problems usually consist of encoders which decrease spatial resolution while…

Computer Vision and Pattern Recognition · Computer Science 2019-02-21 Zbigniew Wojna , Vittorio Ferrari , Sergio Guadarrama , Nathan Silberman , Liang-Chieh Chen , Alireza Fathi , Jasper Uijlings

Exploiting Invertible Decoders for Unsupervised Sentence Representation Learning

The encoder-decoder models for unsupervised sentence representation learning tend to discard the decoder after being trained on a large unlabelled corpus, since only the encoder is needed to map the input sentence into a vector…

Neural and Evolutionary Computing · Computer Science 2019-06-03 Shuai Tang , Virginia R. de Sa