Related papers: Convolutional Analysis Operator Learning: Dependen…

Convolutional Analysis Operator Learning: Acceleration and Convergence

Convolutional operator learning is gaining attention in many signal processing and computer vision applications. Learning kernels has mostly relied on so-called patch-domain approaches that extract and store many overlapping patches across…

Image and Video Processing · Electrical Eng. & Systems 2023-08-31 Il Yong Chun, Jeffrey A. Fessler

The Power of Linear Combinations: Learning with Random Convolutions

Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Paul Gavrikov, Janis Keuper

Impact of Training Dataset Size on Neural Answer Selection Models

It is held as a truism that deep neural networks require large datasets to train effective models. However, large datasets, especially with high-quality labels, can be expensive to obtain. This study sets out to investigate (i) how large a…

Information Retrieval · Computer Science 2019-01-31 Trond Linjordet, Krisztian Balog

Analysis of Filter Size Effect In Deep Learning

With the use of deep learning in many areas, how to improve this technology or how to develop the structure used more effectively and in a shorter time is an issue that is of interest to many people working in this field. Many studies are…

Computer Vision and Pattern Recognition · Computer Science 2021-01-05 Yunus Camgözlü, Yakup Kutlu

Small-to-Large Generalization: Data Influences Models Consistently Across Scale

Choice of training data distribution greatly influences model behavior. Yet, in large-scale settings, precisely characterizing how changes in training data affects predictions is often difficult due to model training costs. Current practice…

Machine Learning · Computer Science 2025-05-23 Alaa Khaddaj, Logan Engstrom, Aleksander Madry

On the Importance of Data Size in Probing Fine-tuned Models

Several studies have investigated the reasons behind the effectiveness of fine-tuning, usually through the lens of probing. However, these studies often neglect the role of the size of the dataset on which the model is fine-tuned. In this…

Computation and Language · Computer Science 2022-03-21 Houman Mehrafarin, Sara Rajaee, Mohammad Taher Pilehvar

Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models

Instruction-tuning language models has become a crucial step in aligning them for general use. Typically, this process involves extensive training on large datasets, incurring high training costs. In this paper, we introduce a novel…

Computation and Language · Computer Science 2024-02-19 Dheeraj Mekala, Alex Nguyen, Jingbo Shang

Convolutional Analysis Operator Learning by End-To-End Training of Iterative Neural Networks

The concept of sparsity has been extensively applied for regularization in image reconstruction. Typically, sparsifying transforms are either pre-trained on ground-truth images or adaptively trained during the reconstruction. Thereby,…

Image and Video Processing · Electrical Eng. & Systems 2022-03-07 Andreas Kofler, Christian Wald, Tobias Schaeffter, Markus Haltmeier, Christoph Kolbitsch

Data Dropout: Optimizing Training Data for Convolutional Neural Networks

Deep learning models learn to fit training data while they are highly expected to generalize well to testing data. Most works aim at finding such models by creatively designing architectures and fine-tuning parameters. To adapt to…

Computer Vision and Pattern Recognition · Computer Science 2018-09-10 Tianyang Wang, Jun Huan, Bo Li

Data Curation Through the Lens of Spectral Dynamics: Static Limits, Dynamic Acceleration, and Practical Oracles

Large-scale neural models are increasingly trained with data pruning, synthetic data generation, cross-model distillation, reinforcement learning from human feedback (RLHF), and difficulty-based sampling. While several of these data-centric…

Machine Learning · Computer Science 2025-12-03 Yizhou Zhang, Lun Du

Credal Learning Theory

Statistical learning theory is the foundation of machine learning, providing theoretical bounds for the risk of models learned from a (single) training set, assumed to issue from an unknown probability distribution. In actual deployment,…

Machine Learning · Computer Science 2024-10-25 Michele Caprio, Maryam Sultana, Eleni Elia, Fabio Cuzzolin

Human-like machine learning: limitations and suggestions

This paper attempts to address the issues of machine learning in its current implementation. It is known that machine learning algorithms require a significant amount of data for training purposes, whereas recent developments in deep…

Machine Learning · Computer Science 2018-11-16 Georgios Mastorakis

The Role of Pre-training Data in Transfer Learning

The transfer learning paradigm of model pre-training and subsequent fine-tuning produces high-accuracy models. While most studies recommend scaling the pre-training size to benefit most from transfer learning, a question remains: what data…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Rahim Entezari, Mitchell Wortsman, Olga Saukh, M. Moein Shariatnia, Hanie Sedghi, Ludwig Schmidt

Sharing the Learned Knowledge-base to Estimate Convolutional Filter Parameters for Continual Image Restoration

Continual learning is an emerging topic in the field of deep learning, where a model is expected to learn continuously for new upcoming tasks without forgetting previous experiences. This field has witnessed numerous advancements, but few…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Aupendu Kar, Krishnendu Ghosh, Prabir Kumar Biswas

On Reducing the Amount of Samples Required for Training of QNNs: Constraints on the Linear Structure of the Training Data

Training classical neural networks generally requires a large number of training samples. Using entangled training samples, Quantum Neural Networks (QNNs) have the potential to significantly reduce the amount of training samples required in…

Quantum Physics · Physics 2023-12-12 Alexander Mandl, Johanna Barzen, Frank Leymann, Daniel Vietz

Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?

Large language models (LLMs) are increasingly used in the creation of online content, creating feedback loops as subsequent generations of models will be trained on this synthetic data. Such loops were shown to lead to distribution shifts -…

Machine Learning · Computer Science 2025-12-29 Grgur Kovač, Jérémy Perez, Rémy Portelas, Peter Ford Dominey, Pierre-Yves Oudeyer

An analysis of the transfer learning of convolutional neural networks for artistic images

Transfer learning from huge natural image datasets, fine-tuning of deep neural networks and the use of the corresponding pre-trained networks have become de facto the core of art analysis applications. Nevertheless, the effects of transfer…

Computer Vision and Pattern Recognition · Computer Science 2020-11-25 Nicolas Gonthier, Yann Gousseau, Saïd Ladjal

Causal Deep Reinforcement Learning Using Observational Data

Deep reinforcement learning (DRL) requires the collection of interventional data, which is sometimes expensive and even unethical in the real world, such as in the autonomous driving and the medical field. Offline reinforcement learning…

Machine Learning · Computer Science 2023-06-12 Wenxuan Zhu, Chao Yu, Qiang Zhang

Data Complexity Estimates for Operator Learning

Operator learning has emerged as a new paradigm for the data-driven approximation of nonlinear operators. Despite its empirical success, the theoretical underpinnings governing the conditions for efficient operator learning remain…

Machine Learning · Computer Science 2024-10-21 Nikola B. Kovachki, Samuel Lanthaler, Hrushikesh Mhaskar

Capturing the Temporal Dependence of Training Data Influence

Traditional data influence estimation methods, like influence function, assume that learning algorithms are permutation-invariant with respect to training data. However, modern training paradigms, especially for foundation models using…

Machine Learning · Computer Science 2024-12-13 Jiachen T. Wang, Dawn Song, James Zou, Prateek Mittal, Ruoxi Jia