Related papers: Learning on Model Weights using Tree Experts

Deep Linear Probe Generators for Weight Space Learning

Weight space learning aims to extract information about a neural network, such as its training dataset or generalization error. Recent approaches learn directly from model weights, but this presents many challenges as weights are…

Machine Learning · Computer Science 2025-10-23 Jonathan Kahana, Eliahu Horwitz, Imri Shuval, Yedid Hoshen

Learning to Reweight Examples for Robust Deep Learning

Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to…

Machine Learning · Computer Science 2019-05-07 Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun

Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights

With the increasing numbers of publicly available models, there are probably pretrained, online models for most tasks users require. However, current model search methods are rudimentary, essentially a text-based search in the…

Machine Learning · Computer Science 2025-02-14 Jonathan Kahana, Or Nathan, Eliahu Horwitz, Yedid Hoshen

Revealing Secrets From Pre-trained Models

With the growing burden of training deep learning models with large data sets, transfer-learning has been widely adopted in many emerging deep learning algorithms. Transformer models such as BERT are the main player in natural language…

Cryptography and Security · Computer Science 2022-07-21 Mujahid Al Rafi, Yuan Feng, Hyeran Jeon

Improving Simple Models with Confidence Profiles

In this paper, we propose a new method called ProfWeight for transferring information from a pre-trained deep neural network that has a high test accuracy to a simpler interpretable model or a very shallow network of low complexity and a…

Machine Learning · Computer Science 2018-11-20 Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss, Peder Olsen

Structural Dropout for Model Width Compression

Existing ML models are known to be highly over-parametrized, and use significantly more resources than required for a given task. Prior work has explored compressing models offline, such as by distilling knowledge from larger models into…

Machine Learning · Computer Science 2022-05-17 Julian Knodt

TREX: Tree-Ensemble Representer-Point Explanations

How can we identify the training examples that contribute most to the prediction of a tree ensemble? In this paper, we introduce TREX, an explanation system that provides instance-attribution explanations for tree ensembles, such as random…

Machine Learning · Computer Science 2021-12-20 Jonathan Brophy, Daniel Lowd

Probing Classifiers: Promises, Shortcomings, and Advances

Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple -- a classifier is trained to predict some linguistic…

Computation and Language · Computer Science 2021-09-23 Yonatan Belinkov

Understanding intermediate layers using linear classifier probes

Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as "probes",…

Machine Learning · Statistics 2018-11-26 Guillaume Alain, Yoshua Bengio

MLDS: A Dataset for Weight-Space Analysis of Neural Networks

Neural networks are powerful models that solve a variety of complex real-world problems. However, the stochastic nature of training and large number of parameters in a typical neural model makes them difficult to evaluate via inspection.…

Machine Learning · Computer Science 2021-04-22 John Clemens

Exploring space efficiency in a tree-based linear model for extreme multi-label classification

Extreme multi-label classification (XMC) aims to identify relevant subsets from numerous labels. Among the various approaches for XMC, tree-based linear models are effective due to their superior efficiency and simplicity. However, the…

Machine Learning · Computer Science 2024-10-15 He-Zhe Lin, Cheng-Hung Liu, Chih-Jen Lin

Knowledge Trees: Gradient Boosting Decision Trees on Knowledge Neurons as Probing Classifier

To understand how well a large language model captures certain semantic or syntactic features, researchers typically apply probing classifiers. However, the accuracy of these classifiers is critical for the correct interpretation of the…

Computation and Language · Computer Science 2023-12-19 Sergey A. Saltykov

Probing via Prompting

Probing is a popular method to discern what linguistic information is contained in the representations of pre-trained language models. However, the mechanism of selecting the probe model has recently been subject to intense debate, as it is…

Computation and Language · Computer Science 2022-07-06 Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

Deep Weighted Averaging Classifiers

Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text. Despite these gains, however, concerns have been raised about the calibration, robustness,…

Machine Learning · Computer Science 2018-11-20 Dallas Card, Michael Zhang, Noah A. Smith

Experiments with Optimal Model Trees

Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to "classic" decision trees with constant values in their leaves, model trees can use linear…

Machine Learning · Computer Science 2026-03-11 Sabino Francesco Roselli, Eibe Frank

Interpreting Shared Deep Learning Models via Explicable Boundary Trees

Despite outperforming the human in many tasks, deep neural network models are also criticized for the lack of transparency and interpretability in decision making. The opaqueness results in uncertainty and low confidence when deploying such…

Machine Learning · Computer Science 2017-09-14 Huijun Wu, Chen Wang, Jie Yin, Kai Lu, Liming Zhu

Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale

Large language model pre-training has traditionally relied on human experts to craft heuristics for improving the corpora quality, resulting in numerous rules developed to date. However, these rules lack the flexibility to address the…

Computation and Language · Computer Science 2025-02-17 Fan Zhou, Zengzhi Wang, Qian Liu, Junlong Li, Pengfei Liu

Model Discovery with Grammatical Evolution. An Experiment with Prime Numbers

Machine Learning produces efficient decision and prediction models based on input-output data only. Such models have the form of decision trees or neural nets and are far from transparent analytical models, based on mathematical formulas.…

Artificial Intelligence · Computer Science 2025-05-20 Jakub Skrzyński, Dominik Sepioło, Antoni Ligęza

Deep Learning Through the Lens of Example Difficulty

Existing work on understanding deep learning often employs measures that compress all data-dependent information into a few numbers. In this work, we adopt a perspective based on the role of individual examples. We introduce a measure of…

Machine Learning · Computer Science 2021-06-21 Robert J. N. Baldock, Hartmut Maennel, Behnam Neyshabur

Training Machine Learning Models by Regularizing their Explanations

Neural networks are among the most accurate supervised learning methods in use today. However, their opacity makes them difficult to trust in critical applications, especially when conditions in training may differ from those in practice.…

Machine Learning · Computer Science 2018-10-03 Andrew Slavin Ross