Related papers: Adaptive Data Selection for Multi-Layer Perceptron…

Finding High-Value Training Data Subset through Differentiable Convex Programming

Finding valuable training data points for deep neural networks has been a core research challenge with many applications. In recent years, various techniques for calculating the "value" of individual training datapoints have been proposed…

Machine Learning · Computer Science 2021-04-29 Soumi Das, Arshdeep Singh, Saptarshi Chatterjee, Suparna Bhattacharya, Sourangshu Bhattacharya

Deep Variable-Block Chain with Adaptive Variable Selection

The architectures of deep neural networks (DNN) rely heavily on the underlying grid structure of variables, for instance, the lattice of pixels in an image. For general high dimensional data with variables not associated with a grid, the…

Machine Learning · Statistics 2024-08-07 Lixiang Zhang, Lin Lin, Jia Li

Exploring Data Redundancy in Real-world Image Classification through Data Selection

Deep learning models often require large amounts of data for training, leading to increased costs. It is particularly challenging in medical imaging, i.e., gathering distributed data for centralized training, and meanwhile, obtaining…

Computer Vision and Pattern Recognition · Computer Science 2023-06-27 Zhenyu Tang, Shaoting Zhang, Xiaosong Wang

Selective Deep Convolutional Neural Network for Low Cost Distorted Image Classification

Deep convolutional neural networks have proven to be well suited for image classification applications. However, if there is distortion in the image, the classification accuracy can be significantly degraded, even with state-of-the-art…

Computer Vision and Pattern Recognition · Computer Science 2019-02-15 Minho Ha, Younghoon Byeon, Youngjoo Lee, Sunggu Lee

Your Vision-Language Model Itself Is a Strong Filter: Towards High-Quality Instruction Tuning with Data Selection

Data selection in instruction tuning emerges as a pivotal process for acquiring high-quality data and training instruction-following large language models (LLMs), but it is still a new and unexplored research area for vision-language models…

Computation and Language · Computer Science 2024-02-21 Ruibo Chen, Yihan Wu, Lichang Chen, Guodong Liu, Qi He, Tianyi Xiong, Chenxi Liu, Junfeng Guo, Heng Huang

DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy

We introduce a cutting-edge video compression framework tailored for the age of ubiquitous video data, uniquely designed to serve machine learning applications. Unlike traditional compression methods that prioritize human visual perception,…

Computer Vision and Pattern Recognition · Computer Science 2024-10-25 Huan Cui, Qing Li, Hanling Wang, Yong Jiang

D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning

Recent advancements in instruction tuning for large language models (LLMs) suggest that a small, high-quality dataset can significantly equip LLMs with instruction-following capabilities, outperforming large datasets often burdened by…

Machine Learning · Computer Science 2025-05-20 Jia Zhang, Chen-Xi Zhang, Yao Liu, Yi-Xuan Jin, Xiao-Wen Yang, Bo Zheng, Yi Liu, Lan-Zhe Guo

On Training and Evaluation of Neural Network Approaches for Model Predictive Control

The contribution of this paper is a framework for training and evaluation of Model Predictive Control (MPC) implemented using constrained neural networks. Recent studies have proposed to use neural networks with differentiable convex…

Machine Learning · Statistics 2020-05-11 Rebecka Winqvist, Arun Venkitaraman, Bo Wahlberg

Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds

Deep learning systems extensively use convolution operations to process input data. Though convolution is clearly defined for structured data such as 2D images or 3D volumes, this is not true for other data types such as sparse point…

Computer Vision and Pattern Recognition · Computer Science 2018-09-26 Pedro Hermosilla, Tobias Ritschel, Pere-Pau Vázquez, Àlvar Vinacua, Timo Ropinski

ADRS-CNet: An adaptive dimensionality reduction selection and classification network for DNA storage clustering algorithms

DNA storage technology offers new possibilities for addressing massive data storage due to its high storage density, long-term preservation, low maintenance cost, and compact size. To improve the reliability of stored information, base…

Machine Learning · Computer Science 2024-09-24 Bowen Liu, Jiankun Li

A Cost-Sensitive Deep Belief Network for Imbalanced Classification

Imbalanced data with a skewed class distribution are common in many real-world applications. Deep Belief Network (DBN) is a machine learning technique that is effective in classification tasks. However, conventional DBN does not work well…

Machine Learning · Computer Science 2018-05-08 Chong Zhang, Kay Chen Tan, Haizhou Li, Geok Soon Hong

Data Valuation using Reinforcement Learning

Quantifying the value of data is a fundamental problem in machine learning. Data valuation has multiple important use cases: (1) building insights about the learning task, (2) domain adaptation, (3) corrupted sample discovery, and (4)…

Machine Learning · Computer Science 2019-09-27 Jinsung Yoon, Sercan O. Arik, Tomas Pfister

Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

We approach structured output prediction by optimizing a deep value network (DVN) to precisely estimate the task loss on different output configurations for a given input. Once the model is trained, we perform inference by gradient descent…

Machine Learning · Computer Science 2017-08-09 Michael Gygli, Mohammad Norouzi, Anelia Angelova

Adaptive Prioritized Random Linear Coding and Scheduling for Layered Data Delivery from Multiple Servers

In this paper, we deal with the problem of jointly determining the optimal coding strategy and the scheduling decisions when receivers obtain layered data from multiple servers. The layered data is encoded by means of Prioritized Random…

Information Theory · Computer Science 2016-11-18 Nikolaos Thomos, Eymen Kurdoglu, Pascal Frossard, Mihaela Van der Schaar

Learning What Data to Learn

Machine learning is essentially the sciences of playing with data. An adaptive data selection strategy, enabling to dynamically choose different data at various training stages, can reach a more effective model in a more efficient way. In…

Machine Learning · Computer Science 2017-03-01 Yang Fan, Fei Tian, Tao Qin, Jiang Bian, Tie-Yan Liu

Domain Adaptive Transfer Learning on Visual Attention Aware Data Augmentation for Fine-grained Visual Categorization

Fine-Grained Visual Categorization (FGVC) is a challenging topic in computer vision. It is a problem characterized by large intra-class differences and subtle inter-class differences. In this paper, we tackle this problem in a weakly…

Computer Vision and Pattern Recognition · Computer Science 2020-10-08 Ashiq Imran, Vassilis Athitsos

Scalable Decision-Focused Learning through Cost-Sensitive Regression

Many real-world combinatorial problems involve uncertain parameters, which can be predicted given contextual features and historical data. These 'predict-then-optimize' or 'contextual optimization' problems have gained significant…

Machine Learning · Computer Science 2026-05-19 Noah Schutte, Senne Berden, Tias Guns, Krzysztof Postek, Neil Yorke-Smith

Adaptive Data Dropout: Towards Self-Regulated Learning in Deep Neural Networks

Deep neural networks are typically trained by uniformly sampling large datasets across epochs, despite evidence that not all samples contribute equally throughout learning. Recent work shows that progressively reducing the amount of…

Machine Learning · Computer Science 2026-04-15 Amar Gahir, Varshil Patel, Shreyank N Gowda

Convex Dataset Valuation for Post-Training

Improving LLM performance on downstream tasks sometimes requires leveraging auxiliary datasets during post-training. In practice, however, developers face constraints on compute, labeling, and licensing costs that preclude using all…

Machine Learning · Computer Science 2026-05-19 Siqi Zeng, Christopher Jung, Rui Li, Zhe Kang, Ming Li, Nima Noorshams, Zhigang Wang, Fuchun Peng, Han Zhao, Xue Feng

RL-Guided Data Selection for Language Model Finetuning

Data selection for finetuning Large Language Models (LLMs) can be framed as a budget-constrained optimization problem: maximizing a model's downstream performance under a strict training data budget. Solving this problem is generally…

Machine Learning · Computer Science 2025-10-01 Animesh Jha, Harshit Gupta, Ananjan Nandi