Related papers: Classifying Constructive Comments

Speculative Analysis for Quality Assessment of Code Comments

Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of…

Software Engineering · Computer Science 2021-07-27 Pooja Rani

WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection

With the spread of online social networks, it is increasingly difficult to monitor all user-generated content. Automating the moderation of inappropriate content exchanged on the Internet has thus become a priority task. Methods…

Computation and Language · Computer Science 2021-01-19 Noé Cecillon, Vincent Labatut, Richard Dufour, Georges Linares

TYPIC: A Corpus of Template-Based Diagnostic Comments on Argumentation

Providing feedback on the argumentation of the learner is essential for developing critical thinking skills; however, it requires a lot of time and effort. To mitigate the overload on teachers, we aim to automate a process of providing…

Computation and Language · Computer Science 2022-06-22 Shoichi Naito, Shintaro Sawada, Chihiro Nakagawa, Naoya Inoue, Kenshi Yamaguchi, Iori Shimizu, Farjana Sultana Mim, Keshav Singh, Kentaro Inui

Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset

With the growing prevalence of large language models, it is increasingly common to annotate datasets for machine learning using pools of crowd raters. However, these raters often work in isolation as individual crowdworkers. In this work,…

Computers and Society · Computer Science 2024-08-05 Sonja Schmer-Galunder, Ruta Wheelock, Scott Friedman, Alyssa Chvasta, Zaria Jalan, Emily Saltz

Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment

Computational models of argument quality (AQ) have focused primarily on assessing the overall quality or just one specific characteristic of an argument, such as its convincingness or its clarity. However, previous work has claimed that…

Computation and Language · Computer Science 2020-11-04 Lily Ng, Anne Lauscher, Joel Tetreault, Courtney Napoles

Enhancing Binary Code Comment Quality Classification: Integrating Generative AI for Improved Accuracy

This report focuses on enhancing a binary code comment quality classification model by integrating generated code and comment pairs, to improve model accuracy. The dataset comprises 9048 pairs of code and comments written in the C…

Software Engineering · Computer Science 2023-10-19 Rohith Arumugam S, Angel Deborah S

Modelling Creativity: Identifying Key Components through a Corpus-Based Approach

Creativity is a complex, multi-faceted concept encompassing a variety of related aspects, abilities, properties and behaviours. If we wish to study creativity scientifically, then a tractable and well-articulated model of creativity is…

Computation and Language · Computer Science 2017-02-08 Anna Jordanous, Bill Keller

The Good, the Bad and the Constructive: Automatically Measuring Peer Review's Utility for Authors

Providing constructive feedback to paper authors is a core component of peer review. With reviewers increasingly having less time to perform reviews, automated support systems are required to ensure high reviewing quality, thus making the…

Computation and Language · Computer Science 2025-09-23 Abdelrahman Sadallah, Tim Baumgärtner, Iryna Gurevych, Ted Briscoe

Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models

This paper explores a novel method for enhancing binary classification models that assess code comment quality, leveraging Generative Artificial Intelligence to elevate model performance. By integrating 1,437 newly generated code-comment…

Software Engineering · Computer Science 2024-10-30 Seetharam Killivalavan, Durairaj Thenmozhi

A Manually Annotated Chinese Corpus for Non-task-oriented Dialogue Systems

This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts and more than 82K responses collected from social media. To annotate this corpus, we define a 5-grade rating…

Computation and Language · Computer Science 2018-05-16 Jing Li, Yan Song, Haisong Zhang, Shuming Shi

A Framework for Generating Annotated Social Media Corpora with Demographics, Stance, Civility, and Topicality

In this paper we introduce a framework for annotating social media text corpora for various categories. Since social media data is generated by individuals, it is important to annotate the text for the individuals' demographic attributes…

Computation and Language · Computer Science 2020-12-11 Shubhanshu Mishra, Daniel Collier

Fostering Collective Discourse: A Distributed Role-Based Approach to Online News Commenting

Current news commenting systems are designed based on implicitly individualistic assumptions, where discussion is the result of a series of disconnected opinions. This often results in fragmented and polarized conversations that fail to…

Human-Computer Interaction · Computer Science 2026-02-13 Yoojin Hong, Yersultan Doszhan, Joseph Seering

Analysing Knowledge Construction in Online Learning: Adapting the Interaction Analysis Model for Unstructured Large-Scale Discourse

The rapid expansion of online courses and social media has generated large volumes of unstructured learner-generated text. Understanding how learners construct knowledge in these spaces is crucial for analysing learning processes, informing…

Computation and Language · Computer Science 2025-12-17 Jindi Wang, Yidi Zhang, Zhaoxing Li, Pedro Bem Haja, Ioannis Ivrissimtzis, Zichen Zhao, Sebastian Stein

Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions

Comment sections below online news articles enjoy growing popularity among readers. However, the overwhelming number of comments makes it infeasible for the average news consumer to read all of them and hinders engaging discussions. Most…

Information Retrieval · Computer Science 2020-03-27 Julian Risch, Ralf Krestel

Toward Effective Automated Content Analysis via Crowdsourcing

Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective…

Computation and Language · Computer Science 2021-04-06 Jiele Wu, Chau-Wai Wong, Xinyan Zhao, Xianpeng Liu

A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research

Having a quality annotated corpus is essential, especially for applied research. Despite the Web science community's recent focus on cyberbullying research, the community still does not have standard benchmarks. In this paper, we…

Computation and Language · Computer Science 2018-05-25 Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie Shalin, Amit Sheth

On Assessing the Relevance of Code Reviews Authored by Generative Models

The use of large language models like ChatGPT in code review offers promising efficiency gains but also raises concerns about correctness and safety. Existing evaluation methods for code review generation either rely on automatic…

Software Engineering · Computer Science 2025-12-18 Robert Heumüller, Frank Ortmeier

SQUINKY! A Corpus of Sentence-level Formality, Informativeness, and Implicature

We introduce a corpus of 7,032 sentences rated by human annotators for formality, informativeness, and implicature on a 1-7 scale. The corpus was annotated using Amazon Mechanical Turk. Reliability in the obtained judgments was examined by…

Computation and Language · Computer Science 2016-09-29 Shibamouli Lahiri

The Language of Approval: Identifying the Drivers of Positive Feedback Online

Positive feedback via likes and awards is central to online governance, yet which attributes of users' posts elicit rewards -- and how these vary across authors and communities -- remains unclear. To examine this, we combine…

Human-Computer Interaction · Computer Science 2026-02-03 Agam Goyal, Charlotte Lambert, Eshwar Chandrasekharan

A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking

Automated fact-checking based on machine learning is a promising approach to identify false information distributed on the web. In order to achieve satisfactory performance, machine learning methods require a large corpus with reliable…

Computation and Language · Computer Science 2019-11-05 Andreas Hanselowski, Christian Stab, Claudia Schulz, Zile Li, Iryna Gurevych