Related papers: Active Deep Learning on Entity Resolution by Risk …

200 papers

State-of-the-art performance on entity resolution (ER) has been achieved by deep learning. However, deep models are usually trained on large quantities of accurately labeled training data, and cannot be easily tuned towards a target…

Machine Learning · Computer Science 2022-04-12 Zhaoqiang Chen, Qun Chen, Youcef Nafa, Tianyi Duan, Wei Pan, Lijun Zhang, Zhanhuai Li

Entity Resolution (ER) is a critical task for data integration, yet state-of-the-art supervised deep learning models remain impractical for many real-world applications due to their need for massive, expensive-to-obtain labeled datasets.…

Databases · Computer Science 2026-01-29 Dimitrios Karapiperis, Leonidas Akritidis, Panayiotis Bozanis, Vassilios Verykios

Usually considered as a classification problem, entity resolution (ER) can be very challenging on real data due to the prevalence of dirty values. The state-of-the-art solutions for ER were built on a variety of learning models (most…

Databases · Computer Science 2019-06-17 Boyi Hou, Qun Chen, Yanyan Wang, Youcef Nafa, Zhanhuai Li

Entity resolution (ER) is the task of identifying different representations of the same real-world entities across databases. It is a key step for knowledge base creation and text mining. Recent adaptation of deep learning methods for ER…

Databases · Computer Science 2019-06-20 Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science 2022-05-10 Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Labeling data can be an expensive task as it is usually performed manually by domain experts. This is cumbersome for deep learning, as it is dependent on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce…

Computation and Language · Computer Science 2021-11-05 Pieter Floris Jacobs, Gideon Maillette de Buy Wenniger, Marco Wiering, Lambert Schomaker

Most of the existing learning models, particularly deep neural networks, are reliant on large datasets whose hand-labeling is expensive and time demanding. A current trend is to make the learning of these models frugal and less dependent on…

Computer Vision and Pattern Recognition · Computer Science 2022-12-12 Sebastien Deschamps, Hichem Sahbi

Active learning (AL) aims to enable training high performance classifiers with low annotation cost by predicting which subset of unlabelled instances would be most beneficial to label. The importance of AL has motivated extensive research,…

Machine Learning · Computer Science 2018-06-14 Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy Hospedales

Purely machine-based solutions usually struggle with challenging classification tasks such as entity resolution (ER). To alleviate this problem, a recent trend is to involve the human in the resolution process, most notably the…

Databases · Computer Science 2018-08-15 Zhaoqiang Chen, Qun Chen, Boyi Hou, Murtadha Ahmed, Zhanhuai Li

Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training. To achieve this, AL typically measures the informativeness of…

Machine Learning · Computer Science 2023-07-07 Cheng Chen, Yong Wang, Lizi Liao, Yueguo Chen, Xiaoyong Du

Active learning (AL) for multiple target models aims to reduce labeled data querying while effectively training multiple models concurrently. Existing AL algorithms often rely on iterative model training, which can be computationally…

Machine Learning · Computer Science 2024-10-03 Sheng-Jun Huang, Yi Li, Yiming Sun, Ying-Peng Tang

Named entity recognition (NER) aims to identify mentions of named entities in an unstructured text and classify them into predefined named entity classes. While deep learning-based pre-trained language models help to achieve good predictive…

Computation and Language · Computer Science 2023-06-16 Ali Osman Berk Sapci, Oznur Tastan, Reyyan Yeniterzi

Convolutional Neural Networks (CNNs) have proven to be state-of-the-art models for supervised computer vision tasks, such as image classification. However, large labeled data sets are generally needed for the training and validation of such…

Machine Learning · Computer Science 2020-10-28 Patrick Hemmer, Niklas Kühl, Jakob Schöffer

Entity Matching (EM) is a core data cleaning task, aiming to identify different mentions of the same real-world entity. Active learning is one way to address the challenge of scarce labeled data in practice, by dynamically collecting the…

Databases · Computer Science 2020-03-31 Venkata Vamsikrishna Meduri, Lucian Popa, Prithviraj Sen, Mohamed Sarwat

Recently, several studies have investigated active learning (AL) for natural language processing tasks to alleviate data dependency. However, for query selection, most of these studies mainly rely on uncertainty-based sampling, which…

Computation and Language · Computer Science 2020-11-30 Yekyung Kim

Accurately identifying different representations of the same real-world entity is an integral part of data cleaning and many methods have been proposed to accomplish it. The challenges of this entity resolution task that demand so much…

Machine Learning · Computer Science 2021-06-02 Alex Bogatu, Norman W. Paton, Mark Douthwaite, Stuart Davie, Andre Freitas

At its core, this thesis aims to enhance the practicality of deep learning by improving the label and training efficiency of deep learning models. To this end, we investigate data subset selection techniques, specifically active learning…

Machine Learning · Computer Science 2024-03-11 Andreas Kirsch

Efficient data annotation remains a critical challenge in machine learning, particularly for object detection tasks requiring extensive labeled data. Active learning (AL) has emerged as a promising solution to minimize annotation costs by…

Computer Vision and Pattern Recognition · Computer Science 2025-09-25 Somraj Gautam, Nachiketa Purohit, Gaurav Harit

Risk-based active learning is an approach to developing statistical classifiers for online decision-support. In this approach, data-label querying is guided according to the expected value of perfect information for incipient data points.…

Machine Learning · Computer Science 2022-06-28 Aidan J. Hughes, Lawrence A. Bull, Paul Gardner, Nikolaos Dervilis, Keith Worden

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion. Current mainstream methods -- neural EA models -- rely on training with seed alignment, i.e., a set of…

Computation and Language · Computer Science 2021-10-14 Bing Liu, Harrisen Scells, Guido Zuccon, Wen Hua, Genghong Zhao