Related papers: Interlocking-free Selective Rationalization Throug…

Understanding Interlocking Dynamics of Cooperative Rationalization

Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output. The selection mechanism is commonly integrated into the model itself…

Machine Learning · Computer Science 2021-10-27 Mo Yu, Yang Zhang, Shiyu Chang, Tommi S. Jaakkola

Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control

Selective rationalization has become a common mechanism to ensure that predictive models reveal how they use any available features. The selection may be soft or hard, and identifies a subset of input features relevant for prediction. The…

Computation and Language · Computer Science 2019-12-17 Mo Yu, Shiyu Chang, Yang Zhang, Tommi S. Jaakkola

Unsupervised Selective Rationalization with Noise Injection

A major issue with using deep learning models in sensitive applications is that they provide no explanation for their output. To address this problem, unsupervised selective rationalization produces rationales alongside predictions by…

Computation and Language · Computer Science 2023-05-30 Adam Storek, Melanie Subbiah, Kathleen McKeown

Pruning neural networks without any data by iteratively conserving synaptic flow

Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time. Recent works have identified, through an expensive sequence of training…

Machine Learning · Computer Science 2020-11-20 Hidenori Tanaka, Daniel Kunin, Daniel L. K. Yamins, Surya Ganguli

Interlock-Free Multi-Aspect Rationalization for Text Classification

Explanation is important for text classification tasks. One prevalent type of explanation is rationales, which are text snippets of input text that suffice to yield the prediction and are meaningful to humans. A lot of research on…

Computation and Language · Computer Science 2022-05-16 Shuangqi Li, Diego Antognini, Boi Faltings

End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking

Machine learning systems perform well on pattern matching tasks, but their ability to perform algorithmic or logical reasoning is not well understood. One important reasoning capability is algorithmic extrapolation, in which models trained…

Machine Learning · Computer Science 2022-10-18 Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein

Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG

Multi-hop retrieval-augmented generation (RAG) is a promising strategy for complex reasoning, yet existing iterative prompting approaches remain inefficient. They often regenerate predictable token sequences at every step and rely on…

Computation and Language · Computer Science 2025-10-23 Jihwan Bang, Juntae Lee, Seunghan Yang, Sungha Choi

Just Interpolate: Kernel "Ridgeless" Regression Can Generalize

In the absence of explicit regularization, Kernel "Ridgeless" Regression with nonlinear kernels has the potential to fit the training data perfectly. It has been observed empirically, however, that such interpolated solutions can still…

Statistics Theory · Mathematics 2020-07-27 Tengyuan Liang, Alexander Rakhlin

Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models

Reasoning LLMs (RLMs) such as OpenAI o1, DeepSeek-R1, and Qwen3 deliver strong multi-step reasoning through chain-of-thought generation, but their large model sizes and lengthy decode-time outputs make them costly to deploy and unsuitable…

Computation and Language · Computer Science 2025-12-03 Ziyan Wang, Enmao Diao, Qi Le, Pu Wang, Guanchu Wang, Minwoo Lee, Shu-ping Yeh, Li Yang

Neural Algorithmic Reasoning Without Intermediate Supervision

Neural algorithmic reasoning is an emerging area of machine learning focusing on building models that can imitate the execution of classic algorithms, such as sorting, shortest paths, etc. One of the main challenges is to learn algorithms…

Machine Learning · Computer Science 2023-11-02 Gleb Rodionov, Liudmila Prokhorenkova

Towards Explainable NLP: A Generative Explanation Framework for Text Classification

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning…

Computation and Language · Computer Science 2019-06-12 Hui Liu, Qingyu Yin, William Yang Wang

A novel topology design approach using an integrated deep learning network architecture

Topology design optimization offers tremendous opportunity in design and manufacturing freedoms by designing and producing a part from the ground-up without a meaningful initial design as required by conventional shape design optimization…

Machine Learning · Statistics 2019-01-10 Sharad Rawat, M. H. Herman Shen

The generalized stochastic preference choice model

We propose a new discrete choice model, called the generalized stochastic preference (GSP) model, that incorporates non-rationality into the stochastic preference (SP) choice model, also known as the rank-based model. Our model can capture…

Computer Science and Game Theory · Computer Science 2025-08-28 Gerardo Berbeglia, Ashwin Venkataraman

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

This study investigates the self-rationalization framework constructed with a cooperative game, where a generator initially extracts the most informative segment from raw input, and a subsequent predictor utilizes the selected subset for…

Artificial Intelligence · Computer Science 2025-08-07 Wei Liu, Zhongyu Niu, Lang Gao, Zhiying Deng, Jun Wang, Haozhao Wang, Ruixuan Li

Early Stopping of Untrained Convolutional Neural Networks

In recent years, new regularization methods based on (deep) neural networks have shown very promising empirical performance for the numerical solution of ill-posed problems, e.g., in medical imaging and imaging science. Due to the…

Numerical Analysis · Mathematics 2024-06-07 Tim Jahn, Bangti Jin

Gold-Switch: Training-Free Superposition of Slow- and Fast- Thinking LLMs

Large Reasoning Models (LRMs) excel in structured tasks by emulating deliberate human reasoning but often suffer from overthinking, degrading performance and wasting resources. One possible baseline is to deploy both LLM and LRM, then route…

Computation and Language · Computer Science 2025-10-09 Jaeseong Lee, Dayoung Kwon, seung-won hwang

Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization

Shuffling strategies for stochastic gradient descent (SGD), including incremental gradient, shuffle-once, and random reshuffling, are supported by rigorous convergence analyses for arbitrary within-epoch permutations. In particular, random…

Machine Learning · Computer Science 2026-04-02 Lam M. Nguyen, Dzung T. Phan, Jayant Kalagnanam

Connecting the Dots Between MLE and RL for Sequence Prediction

Sequence prediction models can be learned from example sequences with a variety of training algorithms. Maximum likelihood learning is simple and efficient, yet can suffer from compounding error at test time. Reinforcement learning such as…

Machine Learning · Computer Science 2019-07-02 Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing

Rationales for Sequential Predictions

Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations through rationales, subsets of context that can explain individual model predictions. We find…

Computation and Language · Computer Science 2021-11-19 Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush

Unsupervised Learning of Predictors from Unpaired Input-Output Samples

Unsupervised learning is the most challenging problem in machine learning and especially in deep learning. Among many scenarios, we study an unsupervised learning problem of high economic value: learning to predict without costly pairing…

Machine Learning · Computer Science 2016-06-16 Jianshu Chen, Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng