Related papers: Measuring Item Similarity in Introductory Programm…
Measuring plagiarism in programming assignments is essential to the educational process. This paper discusses the methods of plagiarism and its detection in introductory programming course assignments written in C++. A small…
Similarity between objects is multi-faceted and it can be easier for human annotators to measure it when the focus is on a specific aspect. We consider the problem of mapping objects into view-specific embeddings where the distance between…
We present a technique for estimating the similarity between objects such as movies or foods whose proper representation depends on human perception. Our technique combines a modest number of human similarity assessments to infer a pairwise…
Defining similarity measures is a requirement for some machine learning methods. One such method is case-based reasoning (CBR) where the similarity measure is used to retrieve the stored case or set of cases most similar to the query case.…
Similarity measures play a fundamental role in memory-based nearest neighbors approaches. They recommend items to a user based on the similarity of either items or users in a neighborhood. In this paper we argue that, although it keeps a…
Intuitively, the concept of similarity is the notion of measuring an inexact match between two entities of the same reference set. The notions of similarity and its close relative dissimilarity are widely used in many fields of Artificial…
Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in…
The paper considers a new quantitative-qualitative proximity measure for the features of information objects, where data enters a common information resource from several sources independently. The goal is to determine the possibility of…
We present a model to measure the similarity in appearance between different materials, which correlates with human similarity judgments. We first create a database of 9,000 rendered images depicting objects with varying materials, shape…
Similarity measures are fundamental tools for quantifying the alignment between artificial and biological systems. However, the diversity of similarity measures and their varied naming and implementation conventions makes it challenging to…
Measuring similarity between training examples is critical for curating high-quality and diverse pretraining datasets for language models. However, similarity is typically computed with a generic off-the-shelf embedding model that has been…
Measuring inter-dataset similarity is an important task in machine learning and data mining with various use cases and applications. Existing methods for measuring inter-dataset similarity are computationally expensive, limited, or…
Plagiarism in introductory programming courses is an enormous challenge for both students and institutions. For students, relying on the work of others too early in their academic development can make it impossible to acquire necessary…
This paper is a study of techniques for measuring and predicting visual fidelity. As visual stimuli we use polygonal models, and vary their fidelity with two different model simplification algorithms. We also group the stimuli into two…
Many machine learning algorithms represent input data with vector embeddings or discrete codes. When inputs exhibit compositional structure (e.g. objects built from parts or procedures from subroutines), it is natural to ask whether this…
Similarity measures play a central role in various data science application domains for a wide assortment of tasks. This guide describes a comprehensive set of prevalent similarity measures to serve both non-experts and professionals.…
Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects…
Identifying relationships between items is a key task of an online recommender system, in order to help users discover items that are functionally complementary or visually compatible. In domains like clothing recommendation, this task is…
Time series are ubiquitous, and a measure to assess their similarity is a core part of many computational systems. In particular, the similarity measure is the most essential ingredient of time series clustering and classification systems.…
There has been much discussion recently about how fairness should be measured or enforced in classification. Individual Fairness [Dwork, Hardt, Pitassi, Reingold, Zemel, 2012], which requires that similar individuals be treated similarly,…