Related papers: CORAL: COde RepresentAtion Learning with Weakly-Su…

200 papers

With the development of computational power and techniques for data collection, deep learning demonstrates a superior performance over most existing algorithms on visual benchmark data sets. Many efforts have been devoted to studying the…

Computer Vision and Pattern Recognition · Computer Science 2021-08-25 Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Juhua Hu

Obtaining enough labeled data to robustly train complex discriminative models is a major bottleneck in the machine learning pipeline. A popular solution is combining multiple sources of weak supervision using generative models. The…

Machine Learning · Computer Science 2017-09-11 Paroma Varma, Bryan He, Payal Bajaj, Imon Banerjee, Nishith Khandwala, Daniel L. Rubin, Christopher Ré

Code review is an integral part of any mature software development process, and identifying the best reviewer for a code change is a well-accepted problem within the software engineering community. Selecting a reviewer who lacks expertise…

Learning high-level causal representations together with a causal model from unstructured low-level data such as pixels is impossible from observational data alone. We prove under mild assumptions that this representation is however…

Machine Learning · Statistics 2022-10-12 Johann Brehmer, Pim de Haan, Phillip Lippe, Taco Cohen

Recent trends in natural language processing research and annotation tasks affirm a paradigm shift from the traditional reliance on a single ground truth to a focus on individual perspectives, particularly in subjective tasks. In scenarios…

Computation and Language · Computer Science 2024-04-18 Olufunke O. Sarumi, Béla Neuendorf, Joan Plepi, Lucie Flek, Jörg Schlötterer, Charles Welch

Collecting large-scale medical datasets with fine-grained annotations is time-consuming and requires experts. For this reason, weakly supervised learning aims at optimising machine learning models using weaker forms of annotations, such as…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Gabriele Valvano, Andrea Leo, Sotirios A. Tsaftaris

To analyze the scaling potential of deep tabular representation learning models, we introduce a novel Transformer-based architecture specifically tailored to tabular data and cross-table representation learning by utilizing table-specific…

Machine Learning · Computer Science 2023-10-02 Maximilian Schambach, Dominique Paul, Johannes S. Otterbach

Recent transformer-based approaches demonstrate promising results on relational scientific information extraction. Existing datasets focus on high-level description of how research is carried out. Instead we focus on the subtleties of how…

Computation and Language · Computer Science 2021-09-23 Ian H. Magnusson, Scott E. Friedman

Transfer learning is a proven technique in 2D computer vision to leverage the large amount of data available and achieve high performance with datasets limited in size due to the cost of acquisition or annotation. In 3D, annotation is known…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Jules Sanchez, Jean-Emmanuel Deschaud, François Goulette

The state-of-the-art named entity recognition (NER) systems are supervised machine learning models that require large amounts of manually annotated data to achieve high accuracy. However, annotating NER data by human is expensive and…

Computation and Language · Computer Science 2019-11-04 Jian Ni, Georgiana Dinu, Radu Florian

In software engineering, numerous studies have focused on the analysis of fine-grained logs, leading to significant innovations in areas such as refactoring, security, and code completion. However, no similar studies have been conducted for…

Supervised neural approaches are hindered by their dependence on large, meticulously annotated datasets, a requirement that is particularly cumbersome for sequential tasks. The quality of annotations tends to deteriorate with the transition…

We propose Corder, a self-supervised contrastive learning framework for source code model. Corder is designed to alleviate the need of labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used…

Software Engineering · Computer Science 2021-05-25 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

Computational notebooks such as Jupyter are popular for exploratory data analysis and insight finding. Despite the module-based structure, notebooks visually appear as a single thread of interleaved cells containing text, code,…

Human-Computer Interaction · Computer Science 2023-08-22 Chen Chen, Jane Hoffswell, Shunan Guo, Ryan Rossi, Yeuk-Yin Chan, Fan Du, Eunyee Koh, Zhicheng Liu

Deep neural networks are able to learn powerful representations from large quantities of labeled input data, however they cannot always generalize well across changes in input distributions. Domain adaptation algorithms have been proposed…

Computer Vision and Pattern Recognition · Computer Science 2016-07-07 Baochen Sun, Kate Saenko

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang

Computational notebooks are the primary coding tools for data scientists, but their code quality remains understudied and often poor. Given the importance of maintainability and reusability, enhancing code understandability is essential.…

Software Engineering · Computer Science 2025-06-19 Mojtaba Mostafavi Ghahfarokhi, Alireza Asadi, Arash Asgari, Bardia Mohammadi, Abbas Heydarnoori, Masih Beigi Rizi

Weak supervision has shown promising results in many natural language processing tasks, such as Named Entity Recognition (NER). Existing work mainly focuses on learning deep NER models only with weak supervision, i.e., without any human…

Computation and Language · Computer Science 2021-08-03 Haoming Jiang, Danqing Zhang, Tianyu Cao, Bing Yin, Tuo Zhao

We explore the value of weak labels in learning transferable representations for medical images. Compared to hand-labeled datasets, weak or inexact labels can be acquired in large quantities at significantly lower cost and can provide…

Computer Vision and Pattern Recognition · Computer Science 2021-08-05 Boon Peng Yap, Beng Koon Ng

Deep neural networks are gaining increasing popularity for the classic text classification task, due to their strong expressive power and less requirement for feature engineering. Despite such attractiveness, neural text classification…

Information Retrieval · Computer Science 2018-09-13 Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han