Related papers: Classifying Constructive Comments
Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of…
With the spread of online social networks, it is more and more difficult to monitor all the user-generated content. Automating the moderation process of the inappropriate exchange content on Internet has thus become a priority task. Methods…
Providing feedback on the argumentation of the learner is essential for developing critical thinking skills, however, it requires a lot of time and effort. To mitigate the overload on teachers, we aim to automate a process of providing…
With the growing prevalence of large language models, it is increasingly common to annotate datasets for machine learning using pools of crowd raters. However, these raters often work in isolation as individual crowdworkers. In this work,…
Computational models of argument quality (AQ) have focused primarily on assessing the overall quality or just one specific characteristic of an argument, such as its convincingness or its clarity. However, previous work has claimed that…
This report focuses on enhancing a binary code comment quality classification model by integrating generated code and comment pairs, to improve model accuracy. The dataset comprises 9048 pairs of code and comments written in the C…
Creativity is a complex, multi-faceted concept encompassing a variety of related aspects, abilities, properties and behaviours. If we wish to study creativity scientifically, then a tractable and well-articulated model of creativity is…
Providing constructive feedback to paper authors is a core component of peer review. With reviewers increasingly having less time to perform reviews, automated support systems are required to ensure high reviewing quality, thus making the…
This paper explores a novel method for enhancing binary classification models that assess code comment quality, leveraging Generative Artificial Intelligence to elevate model performance. By integrating 1,437 newly generated code-comment…
This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts more than 82K responses collected from social media. To annotate this corpus, we define a 5-grade rating…
In this paper we introduce a framework for annotating a social media text corpora for various categories. Since, social media data is generated via individuals, it is important to annotate the text for the individuals demographic attributes…
Current news commenting systems are designed based on implicitly individualistic assumptions, where discussion is the result of a series of disconnected opinions. This often results in fragmented and polarized conversations that fail to…
The rapid expansion of online courses and social media has generated large volumes of unstructured learner-generated text. Understanding how learners construct knowledge in these spaces is crucial for analysing learning processes, informing…
Comment sections below online news articles enjoy growing popularity among readers. However, the overwhelming number of comments makes it infeasible for the average news consumer to read all of them and hinders engaging discussions. Most…
Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective…
Having a quality annotated corpus is essential especially for applied research. Despite the recent focus of Web science community on researching about cyberbullying, the community dose not still have standard benchmarks. In this paper, we…
The use of large language models like ChatGPT in code review offers promising efficiency gains but also raises concerns about correctness and safety. Existing evaluation methods for code review generation either rely on automatic…
We introduce a corpus of 7,032 sentences rated by human annotators for formality, informativeness, and implicature on a 1-7 scale. The corpus was annotated using Amazon Mechanical Turk. Reliability in the obtained judgments was examined by…
Positive feedback via likes and awards is central to online governance, yet which attributes of users' posts elicit rewards -- and how these vary across authors and communities -- remains unclear. To examine this, we combine…
Automated fact-checking based on machine learning is a promising approach to identify false information distributed on the web. In order to achieve satisfactory performance, machine learning methods require a large corpus with reliable…