Related papers: View-Driven Deduplication with Active Learning

200 papers

In the era of big data, the issue of data quality has become increasingly prominent. One of the main challenges is the problem of duplicate data, which can arise from repeated entry or the merging of multiple data sources. These "dirty…

Machine Learning · Computer Science 2025-01-13 Haochen Shi, Xinyao Liu, Fengmao Lv, Hongtao Xue, Jie Hu, Shengdong Du, Tianrui Li

Recent dataset deduplication techniques have demonstrated that content-aware dataset pruning can dramatically reduce the cost of training Vision-Language Pretrained (VLP) models without significant performance losses compared to training on…

Computer Vision and Pattern Recognition · Computer Science 2024-04-26 Eric Slyman, Stefan Lee, Scott Cohen, Kushal Kafle

Benchmark datasets in computer vision often contain off-topic images, near duplicates, and label errors, leading to inaccurate estimates of model performance. In this paper, we revisit the task of data cleaning and formalize it as either a…

One of the most useful techniques to help visual data analysis systems is interactive filtering (brushing). However, visualization techniques often suffer from overlap of graphical items and multiple attributes complexity, making visual…

Graphics · Computer Science 2015-07-07 Jose Rodrigues, Luciana Romani, Agma Traina, Caetano Traina

Active learners alleviate the burden of labeling large amounts of data by detecting and asking the user to label only the most informative examples in the domain. We focus here on active learning for multi-view domains, in which there are…

Machine Learning · Computer Science 2011-10-06 C. A. Knoblock, S. Minton, I. Muslea

Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have an often-overlooked confounding effect on the assessment of model performance. Nevertheless, employing experts to…

We contribute a deep-learning-based method that assists in designing analytical dashboards for analyzing a data table. Given a data table, data workers usually need to experience a tedious and time-consuming process to select meaningful…

Human-Computer Interaction · Computer Science 2021-07-19 Aoyu Wu, Yun Wang, Mengyu Zhou, Xinyi He, Haidong Zhang, Huamin Qu, Dongmei Zhang

Big data analysis has become an active area of study with the growth of machine learning techniques. To properly analyze data, it is important to maintain high-quality data. Thus, research on data cleaning is also important. It is difficult…

Databases · Computer Science 2019-10-25 Toshiyuki Shimizu, Hiroki Omori, Masatoshi Yoshikawa

Constructing supervised machine learning models for real-world video analysis requires substantial labeled data, which is costly to acquire due to scarce domain expertise and laborious manual inspection. While data programming shows promise…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Jianben He, Xingbo Wang, Kam Kwai Wong, Xijie Huang, Changjian Chen, Zixin Chen, Fengjie Wang, Min Zhu, Huamin Qu

Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples. Existing approaches search globally for the optimal examples to label, scaling linearly or even…

Quality control is a key activity performed by manufacturing enterprises to ensure products meet quality standards and avoid potential damage to the brand's reputation. The decreased cost of sensors and connectivity enabled an increasing…

Machine Learning · Computer Science 2021-09-07 Elena Trajkova, Jože M. Rožanec, Paulien Dam, Blaž Fortuna, Dunja Mladenić

Active learning aims to reduce the high labeling cost involved in training machine learning models on large datasets by efficiently labeling only the most informative samples. Recently, deep active learning has shown success on various…

Computer Vision and Pattern Recognition · Computer Science 2019-12-12 Sudhanshu Mittal, Maxim Tatarchenko, Özgün Çiçek, Thomas Brox

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science 2022-05-10 Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Interactive data visualization is a major part of modern exploratory data analysis, with web-based technologies enabling a rich ecosystem of both specialized and general tools. However, current visualization tools often lack support for…

Human-Computer Interaction · Computer Science 2025-08-14 Jan Simson

In many applications, data is easy to acquire but expensive and time-consuming to label; prominent examples include medical imaging and NLP. This disparity has only grown in recent years as our ability to collect data improves. Under these…

Machine Learning · Computer Science 2021-04-07 Jaya Krishna Mandivarapu, Blake Camp, Rolando Estrada

The aim of Active Learning is to select the most informative samples from an unlabelled set of data. This is useful in cases where the amount of data is large and labelling is expensive, such as in machine vision or medical imaging. Two…

Computer Vision and Pattern Recognition · Computer Science 2026-01-13 Julien Combes, Alexandre Derville, Jean-François Coeurjolly

We propose ViewAL, a novel active learning strategy for semantic segmentation that exploits viewpoint consistency in multi-view datasets. Our core idea is that inconsistencies in model predictions across viewpoints provide a very reliable…

Computer Vision and Pattern Recognition · Computer Science 2020-03-20 Yawar Siddiqui, Julien Valentin, Matthias Nießner

Recent advances in visual analytics have enabled us to learn from user interactions and uncover analytic goals. These innovations set the foundation for actively guiding users during data exploration. Providing such guidance will become…

Human-Computer Interaction · Computer Science 2022-07-19 Shayan Monadjemi, Sunwoo Ha, Quan Nguyen, Henry Chai, Roman Garnett, Alvitta Ottley

Interactive visualizations are crucial in ad hoc data exploration and analysis. However, with the growing number of massive datasets, generating visualizations in interactive timescales is increasingly challenging. One approach for…

Databases · Computer Science 2017-01-25 Yongjoo Park, Michael Cafarella, Barzan Mozafari

Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry. Their data curation poses the challenges of expensive human labeling, inadequate computing resources and larger experiment turn around…

Computer Vision and Pattern Recognition · Computer Science 2019-01-07 Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan