Related papers: Cost-effective Variational Active Entity Resolutio…

200 papers

Entity resolution (ER) is the task of identifying different representations of the same real-world entities across databases. It is a key step for knowledge base creation and text mining. Recent adaptation of deep learning methods for ER…

Databases · Computer Science 2019-06-20 Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa

The high cost of acquiring labels is one of the main challenges in deploying supervised machine learning algorithms. Active learning is a promising approach to control the learning process and address the difficulties of data labeling by…

Machine Learning · Computer Science 2019-11-19 Farhad Pourkamali-Anaraki, Michael B. Wakin

Usually considered as a classification problem, entity resolution (ER) can be very challenging on real data due to the prevalence of dirty values. The state-of-the-art solutions for ER were built on a variety of learning models (most…

Databases · Computer Science 2019-06-17 Boyi Hou, Qun Chen, Yanyan Wang, Youcef Nafa, Zhanhuai Li

Entity resolution aims to identify records that represent the same real-world entity from one or more datasets. A major challenge in learning-based entity resolution is how to reduce the label cost for training. Due to the quadratic…

Machine Learning · Computer Science 2020-12-21 Jingyu Shao, Qing Wang, Asiri Wijesinghe, Erhard Rahm

The state-of-the-art performance on entity resolution (ER) has been achieved by deep learning. However, deep models are usually trained on large quantities of accurately labeled training data, and cannot be easily tuned towards a target…

Machine Learning · Computer Science 2022-04-12 Zhaoqiang Chen, Qun Chen, Youcef Nafa, Tianyi Duan, Wei Pan, Lijun Zhang, Zhanhuai Li

While the state-of-the-art performance on entity resolution (ER) has been achieved by deep learning, its effectiveness depends on large quantities of accurately labeled training data. To alleviate the data labeling burden, Active Learning…

Machine Learning · Computer Science 2020-12-25 Youcef Nafa, Qun Chen, Zhaoqiang Chen, Xingyu Lu, Haiyang He, Tianyi Duan, Zhanhuai Li

We consider a serious, previously-unexplored challenge facing almost all approaches to scaling up entity resolution (ER) to multiple data sources: the prohibitive cost of labeling training data for supervised learning of similarity scores…

Databases · Computer Science 2012-08-10 Sahand Negahban, Benjamin I. P. Rubinstein, Jim Gemmell

Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning. However, deep ensembles often come with high computational costs and memory…

Entity resolution, the task of identifying and merging records that refer to the same real-world entity, is crucial in sectors like e-commerce, healthcare, and law enforcement. Large Language Models (LLMs) introduce an innovative approach…

Computation and Language · Computer Science 2024-09-13 Huahang Li, Longyu Feng, Shuangyin Li, Fei Hao, Chen Jason Zhang, Yuanfeng Song

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science 2022-05-10 Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Deep convolutional neural networks have achieved great success in various applications. However, training an effective DNN model for a specific task is rather challenging because it requires prior knowledge or experience to design the…

Machine Learning · Computer Science 2018-06-06 Sheng-Jun Huang, Jia-Wei Zhao, Zhao-Yang Liu

Entity resolution (ER) is a key data integration problem. Despite 70+ years of effort on all aspects of ER, there is still a high demand for democratizing ER - humans are heavily involved in labeling data, performing feature…

Databases · Computer Science 2019-11-20 Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, Mourad Ouzzani, Nan Tang

In the domain of computer vision, deep residual neural networks like EfficientNet have set new standards in terms of robustness and accuracy. One key problem underlying the training of deep neural networks is the immanent lack of a…

Computer Vision and Pattern Recognition · Computer Science 2022-02-22 Raoul Schönhof, Jannes Elstner, Radu Manea, Steffen Tauber, Ramez Awad, Marco F. Huber

Active learning aims to reduce the high labeling cost involved in training machine learning models on large datasets by efficiently labeling only the most informative samples. Recently, deep active learning has shown success on various…

Computer Vision and Pattern Recognition · Computer Science 2019-12-12 Sudhanshu Mittal, Maxim Tatarchenko, Özgün Çiçek, Thomas Brox

Deep learning has yielded state-of-the-art performance on many natural language processing tasks including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the…

Computation and Language · Computer Science 2018-02-06 Yanyao Shen, Hyokun Yun, Zachary C. Lipton, Yakov Kronrod, Animashree Anandkumar

Representation learning is a fundamental building block for analyzing entities in a database. While the existing embedding learning methods are effective in various data mining problems, their applicability is often limited because these…

Machine Learning · Computer Science 2020-09-24 Chin-Chia Michael Yeh, Dhruv Gelda, Zhongfang Zhuang, Yan Zheng, Liang Gou, Wei Zhang

The goal of entity matching is to find the corresponding records representing the same real-world entity from different data sources. At present, in the mainstream methods, rule-based entity matching methods need tremendous domain…

Machine Learning · Computer Science 2024-03-25 Youfang Han, Chunping Li

We focus on the problem of learning distributed representations for entity search queries, named entities, and their short descriptions. With our representation learning models, the entity search query, named entity and description can be…

Computation and Language · Computer Science 2017-01-17 Shijia E, Yang Xiang, Mohan Zhang

In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they…

Computation and Language · Computer Science 2020-10-14 Yasumasa Onoe, Greg Durrett

While deep neural networks have succeeded in several visual applications, such as object recognition, detection, and localization, by reaching very high classification accuracies, it is important to note that many real-world applications…

Computer Vision and Pattern Recognition · Computer Science 2020-10-06 Yu-An Chung, Shao-Wen Yang, Hsuan-Tien Lin