Related papers: Data Programming by Demonstration: A Framework for…

200 papers

Despite rapid developments in the field of machine learning research, collecting high-quality labels for supervised learning remains a bottleneck for many applications. This difficulty is exacerbated by the fact that state-of-the-art models…

Computation and Language · Computer Science 2021-06-25 Dongjin Choi, Sara Evensen, Çağatay Demiralp, Estevam Hruschka

Programming by demonstration (PbD) is an effective technique for developing complex robot manipulation tasks, such as opening bottles or using human tools. In order for such tasks to generalize to new scenes, the robot needs to be able to…

Robotics · Computer Science 2016-12-05 Justin Huang, Maya Cakmak

Deep learning models for natural language processing rely heavily on high-quality labeled datasets. However, existing labeling approaches often struggle to balance label quality with labeling cost. To address this challenge, we propose…

Human-Computer Interaction · Computer Science 2026-02-17 Guozheng Li, Ao Wang, Shaoxiang Wang, Yu Zhang, Pengcheng Cao, Yang Bai, Chi Harold Liu

Labeled datasets are essential for supervised machine learning. Various data labeling tools have been built to collect labels in different usage scenarios. However, developing labeling tools is time-consuming, costly, and…

Human-Computer Interaction · Computer Science 2022-03-29 Yu Zhang, Yun Wang, Haidong Zhang, Bin Zhu, Siming Chen, Dongmei Zhang

Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive…

Machine Learning · Statistics 2018-12-10 Alexander Ratner, Christopher De Sa, Sen Wu, Daniel Selsam, Christopher Ré

The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in…

Machine Learning · Computer Science 2021-06-15 Ayush Maheshwari, Oishik Chatterjee, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer

Modern machine learning models require large labelled datasets to achieve good performance, but manually labelling large datasets is expensive and time-consuming. The data programming paradigm enables users to label large datasets…

Machine Learning · Computer Science 2024-02-12 Naiqing Guan, Nick Koudas

Most advanced supervised Machine Learning (ML) models rely on vast amounts of point-by-point labelled training examples. Hand-labelling vast amounts of data may be tedious, expensive, and error-prone. Recently, some studies have explored…

Machine Learning · Computer Science 2021-08-27 Chufan Gao, Mononito Goswami

Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved for dealing with this problem is data programming. An existing data programming paradigm allows human supervision to be provided as a set…

Machine Learning · Computer Science 2019-11-25 Oishik Chatterjee, Ganesh Ramakrishnan, Sunita Sarawagi

Deep learning model design, development, and debugging is a process driven by best practices, guidelines, trial-and-error, and the personal experiences of model developers. At multiple stages of this process, performance and internal model…

Human-Computer Interaction · Computer Science 2024-07-26 Thilo Spinner, Daniel Fürst, Mennatallah El-Assady

Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advancements in machine teaching, specifically the data programming paradigm, facilitate the creation of training data…

Machine Learning · Computer Science 2020-02-05 Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta

As machine learning for images becomes democratized in the Software 2.0 era, one of the serious bottlenecks is securing enough labeled data for training. This problem is especially critical in a manufacturing setting where smart factories…

Machine Learning · Computer Science 2022-12-02 Geon Heo, Yuji Roh, Seonghyeon Hwang, Dayun Lee, Steven Euijong Whang

As the number of applications that use machine learning algorithms increases, the need for labeled data useful for training such algorithms intensifies. Getting labels typically involves employing humans to do the annotation, which directly…

Machine Learning · Computer Science 2013-07-16 Alexandros Ntoulas, Omar Alonso, Vasilis Kandylas

Although large language models (LLMs) have advanced the state-of-the-art in NLP significantly, deploying them for downstream applications is still challenging due to cost, responsiveness, control, or concerns around privacy and security. As…

Computation and Language · Computer Science 2023-11-01 Dong-Ho Lee, Jay Pujara, Mohit Sewak, Ryen W. White, Sujay Kumar Jauhar

Labeling training datasets has become a key barrier to building medical machine learning models. One strategy is to generate training labels programmatically, for example by applying natural language processing pipelines to text reports…

Data-driven approaches are becoming more common as problem-solving techniques in many areas of research and industry. In most cases, machine learning models are the key component of these solutions, but a solution involves multiple such…

Artificial Intelligence · Computer Science 2019-06-20 Parisa Kordjamshidi, Dan Roth, Kristian Kersting

The cost of manual data labeling can be a significant obstacle in supervised learning. Data programming (DP) offers a weakly supervised solution for training dataset creation, wherein the outputs of user-defined programmatic labeling…

Machine Learning · Computer Science 2023-10-26 Jacqueline R. M. A. Maasch, Hao Zhang, Qian Yang, Fei Wang, Volodymyr Kuleshov

Programmatic weak supervision methodologies facilitate the expedited labeling of extensive datasets through the use of label functions (LFs) that encapsulate heuristic data sources. Nonetheless, the creation of precise LFs necessitates…

Computation and Language · Computer Science 2023-11-03 Naiqing Guan, Kaiwen Chen, Nick Koudas

Labeling data (e.g., labeling the people, objects, actions and scene in images) comprehensively and efficiently is a widely needed but challenging task. Numerous models were proposed to label various data and many approaches were designed…

Machine Learning · Computer Science 2020-02-14 Mu Yuan, Lan Zhang, Xiang-Yang Li, Hui Xiong

The scene labeling task is to segment an image into meaningful regions and categorize them into the classes of objects that comprise the image. Commonly used methods typically find the local features for each segment and label them using…

Computer Vision and Pattern Recognition · Computer Science 2016-08-19 Nasim Souly, Mubarak Shah