Related papers: Understanding intermediate layers using linear cla…

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

In the context of classification problems, Deep Learning (DL) approaches represent the state of the art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks.…

Machine Learning · Computer Science 2023-11-21 Andrea Apicella, Francesco Isgrò, Roberto Prevete

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this…

Machine Learning · Computer Science 2025-11-17 Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

Assessing Intersectional Bias in Representations of Pre-Trained Image Recognition Models

Deep Learning models have achieved remarkable success. Training them is often accelerated by building on top of pre-trained models, which poses the risk of perpetuating encoded biases. Here, we investigate biases in the representations of…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Valerie Krug, Sebastian Stober

Confidence Scoring Using Whitebox Meta-models with Linear Classifier Probes

We propose a novel confidence scoring mechanism for deep neural networks based on a two-model paradigm involving a base model and a meta-model. The confidence score is learned by the meta-model observing the base model succeeding/failing at…

Machine Learning · Computer Science 2019-04-19 Tongfei Chen, Jiří Navrátil, Vijay Iyengar, Karthikeyan Shanmugam

Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions

Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability -- they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are…

Artificial Intelligence · Computer Science 2017-11-22 Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin

Collaborative Layer-wise Discriminative Learning in Deep Neural Networks

Intermediate features at different layers of a deep neural network are known to be discriminative for visual patterns of different complexities. However, most existing works ignore such cross-layer heterogeneities when classifying samples…

Computer Vision and Pattern Recognition · Computer Science 2016-07-20 Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan

Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes

Large Language Models (LLMs) are often used as automated judges to evaluate text, but their effectiveness can be hindered by various unintentional biases. We propose using linear classifying probes, trained by leveraging differences between…

Computation and Language · Computer Science 2025-03-25 Sharan Maiya, Yinhong Liu, Ramit Debnath, Anna Korhonen

Kernelized Classification in Deep Networks

We propose a kernelized classification layer for deep networks. Although conventional deep networks introduce an abundance of nonlinearity for representation (feature) learning, they almost universally use a linear classifier on the learned…

Machine Learning · Computer Science 2021-03-22 Sadeep Jayasumana, Srikumar Ramalingam, Sanjiv Kumar

DE-PACRR: Exploring Layers Inside the PACRR Model

Recent neural IR models have demonstrated deep learning's utility in ad-hoc information retrieval. However, deep models have a reputation for being black boxes, and the roles of a neural IR model's components may not be obvious at first…

Information Retrieval · Computer Science 2017-07-25 Andrew Yates, Kai Hui

Towards Disentangling Information Paths with Coded ResNeXt

The conventional, widely used treatment of deep learning models as black boxes provides limited or no insights into the mechanisms that guide neural network decisions. Significant research effort has been dedicated to building interpretable…

Computer Vision and Pattern Recognition · Computer Science 2023-09-21 Apostolos Avranas, Marios Kountouris

Decision Explanation and Feature Importance for Invertible Networks

Deep neural networks are vulnerable to adversarial attacks and hard to interpret because of their black-box nature. The recently proposed invertible network is able to accurately reconstruct the inputs to a layer from its outputs, thus has…

Machine Learning · Computer Science 2019-10-16 Juntang Zhuang, Nicha C. Dvornek, Xiaoxiao Li, Junlin Yang, James S. Duncan

Understanding and Leveraging the Learning Phases of Neural Networks

The learning dynamics of deep neural networks are not well understood. The information bottleneck (IB) theory proclaimed separate fitting and compression phases. But they have since been heavily debated. We comprehensively analyze the…

Machine Learning · Computer Science 2023-12-15 Johannes Schneider, Mohit Prabhushankar

Detecting Adversarial Examples and Other Misclassifications in Neural Networks by Introspection

Despite excellent performance on a wide variety of tasks, modern neural networks are unable to provide a reliable confidence value that would allow misclassifications to be detected. This limitation is at the heart of what is known as an…

Machine Learning · Computer Science 2019-05-23 Jonathan Aigrain, Marcin Detyniecki

Probing Classifiers: Promises, Shortcomings, and Advances

Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple -- a classifier is trained to predict some linguistic…

Computation and Language · Computer Science 2021-09-23 Yonatan Belinkov

Learning Neural Network Classifiers with Low Model Complexity

Modern neural network architectures for large-scale learning tasks have substantially higher model complexities, which makes understanding, visualizing and training these architectures difficult. Recent contributions to deep learning…

Machine Learning · Computer Science 2024-10-30 Jayadeva, Himanshu Pant, Mayank Sharma, Abhimanyu Dubey, Sumit Soman, Suraj Tripathi, Sai Guruju, Nihal Goalla

Neural Networks as Functional Classifiers

In recent years, there has been considerable innovation in the world of predictive methodologies. This is evidenced by the relative dominance of machine learning approaches in various classification competitions. While these algorithms have…

Machine Learning · Statistics 2020-10-12 Barinder Thind, Kevin Multani, Jiguo Cao

Complexity of Representations in Deep Learning

Deep neural networks use multiple layers of functions to map an object represented by an input vector progressively to different representations, and with sufficient training, eventually to a single score for each class that is the output…

Machine Learning · Computer Science 2022-09-02 Tin Kam Ho

Layer by Layer: Uncovering Hidden Representations in Language Models

From extracting features to generating text, the outputs of large language models (LLMs) typically rely on the final layers, following the conventional wisdom that earlier layers capture only low-level cues. However, our analysis shows that…

Machine Learning · Computer Science 2025-06-17 Oscar Skean, Md Rifat Arefin, Dan Zhao, Niket Patel, Jalal Naghiyev, Yann LeCun, Ravid Shwartz-Ziv

Discriminative Learning via Semidefinite Probabilistic Models

Discriminative linear models are a popular tool in machine learning. These can be generally divided into two types: The first is linear classifiers, such as support vector machines, which are well studied and provide state-of-the-art…

Machine Learning · Computer Science 2012-07-02 Koby Crammer, Amir Globerson

Concept Probing: Where to Find Human-Defined Concepts (Extended Version)

Concept probing has recently gained popularity as a way for humans to peek into what is encoded within artificial neural networks. In concept probing, additional classifiers are trained to map the internal representations of a model into…

Machine Learning · Computer Science 2025-07-28 Manuel de Sousa Ribeiro, Afonso Leote, João Leite