Related papers: Unsupervised authorship attribution

Text Classification For Authorship Attribution Analysis

Authorship attribution mainly deals with undecided authorship of literary texts. It is useful for resolving uncertain authorship, recognizing the authorship of unknown texts, spotting plagiarism, and so on. Statistical…

Digital Libraries · Computer Science 2013-10-21 M. Sudheep Elayidom, Chinchu Jose, Anitta Puthussery, Neenu K Sasi

Authorship clustering using multi-headed recurrent neural networks

A recurrent neural network that has been trained to separately model the language of several documents by unknown authors is used to measure similarity between the documents. It is able to find clues of common authorship even when the…

Computation and Language · Computer Science 2016-08-17 Douglas Bagnall

A Machine Learning Framework for Authorship Identification From Texts

Authorship identification is a process in which the author of a text is identified. Most known literary texts can easily be attributed to a certain author because they are, for example, signed. Yet sometimes we find unfinished pieces of…

Computation and Language · Computer Science 2019-12-24 Rahul Radhakrishnan Iyer, Carolyn Penstein Rose

Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution

Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution. Recent self-supervised learning methods have been shown to be effective when dealing with…

Computer Vision and Pattern Recognition · Computer Science 2023-07-03 Gabriel Bertocco, Antônio Theophilo, Fernanda Andaló, Anderson Rocha

Siamese Networks for Large-Scale Author Identification

Authorship attribution is the process of identifying the author of a text. Approaches to tackling it have been conventionally divided into classification-based ones, which work well for small numbers of candidate authors, and…

Computation and Language · Computer Science 2021-05-18 Chakaveh Saedi, Mark Dras

Authorship Verification - An Approach based on Random Forest

Authorship attribution, being an important problem in many areas including information retrieval, computational linguistics, law, and journalism, has been identified as a subject of increasing research interest in recent years.…

Computation and Language · Computer Science 2016-08-01 Promita Maitra, Souvick Ghosh, Dipankar Das

Seeking the Truth Beyond the Data. An Unsupervised Machine Learning Approach

Clustering is an unsupervised machine learning methodology in which unlabeled elements or objects are grouped together, with the aim of constructing well-established clusters whose elements are classified according to their similarity. The…

Machine Learning · Statistics 2023-10-20 Dimitrios Saligkaras, Vasileios E. Papageorgiou

A Framework for Authorial Clustering of Shorter Texts in Latent Semantic Spaces

Authorial clustering involves the grouping of documents written by the same author or team of authors without any prior positive examples of an author's writing style or thematic preferences. For authorial clustering on shorter texts…

Computation and Language · Computer Science 2020-12-01 Rafi Trad, Myra Spiliopoulou

Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering

Authorship attribution (i.e., determining who is the author of a piece of source code) is an established research topic. State-of-the-art results for the authorship attribution problem look promising for the software engineering field,…

Software Engineering · Computer Science 2021-06-22 Egor Bogomolov, Vladimir Kovalenko, Yurii Rebryk, Alberto Bacchelli, Timofey Bryksin

Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering

The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging problem of high dimensionality. Unique handwriting styles may be dissimilar in a blend of several factors including character size,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-30 Sriparna Majumdar, Aaron Brick

Unsupervised Identification of Translationese

Translated texts are distinctively different from original ones, to the extent that supervised text classification methods can distinguish between them with high accuracy. These differences were proven useful for statistical machine…

Computation and Language · Computer Science 2016-09-13 Ella Rabinovich, Shuly Wintner

The Topic Confusion Task: A Novel Scenario for Authorship Attribution

Authorship attribution is the problem of identifying the most plausible author of an anonymous text from a set of candidate authors. Researchers have investigated same-topic and cross-topic scenarios of authorship attribution, which differ…

Computation and Language · Computer Science 2021-09-10 Malik H. Altakrori, Jackie Chi Kit Cheung, Benjamin C. M. Fung

Comparative study of Authorship Identification Techniques for Cyber Forensics Analysis

Authorship identification techniques are used to identify the most likely author of online messages from a group of potential suspects and to find evidence supporting the conclusion. Cybercriminals misuse online communication for…

Computers and Society · Computer Science 2014-02-20 Smita Nirkhi, R. V. Dharaskar

Separating the articles of authors with the same name

I describe a method to separate the articles of different authors with the same name. It is based on a distance between any two publications, defined in terms of the probability that they would have as many coincidences if they were drawn…

Digital Libraries · Computer Science 2007-05-23 Jose M. Soler

Can Authorship Attribution Models Distinguish Speakers in Speech Transcripts?

Authorship verification is the task of determining if two distinct writing samples share the same author and is typically concerned with the attribution of written text. In this paper, we explore the attribution of transcribed speech, which…

Computation and Language · Computer Science 2025-05-19 Cristina Aggazzotti, Nicholas Andrews, Elizabeth Allyn Smith

Authorship Identification of Source Code Segments Written by Multiple Authors Using Stacking Ensemble Method

Source code segment authorship identification is the task of identifying the author of a source code segment through supervised learning. It has vast importance in plagiarism detection, digital forensics, and several other law enforcement…

Software Engineering · Computer Science 2022-12-13 Parvez Mahbub, Naz Zarreen Oishie, S M Rafizul Haque

Semi-Supervised Constrained Clustering: An In-Depth Overview, Ranked Taxonomy and Future Research Directions

Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…

Machine Learning · Computer Science 2023-03-02 Germán González-Almagro, Daniel Peralta, Eli De Poorter, José-Ramón Cano, Salvador García

Authorship Attribution Based on Life-Like Network Automata

Authorship attribution is a problem of considerable practical and technical interest. Several methods have been designed to infer the authorship of disputed documents in multiple contexts. While traditional statistical methods based…

Computation and Language · Computer Science 2018-03-28 Jeaneth Machicao, Edilson A. Corrêa, Gisele H. B. Miranda, Diego R. Amancio, Odemir M. Bruno

Unsupervised clustering analysis: a multiscale complex networks approach

Unsupervised clustering, also known as natural clustering, stands for the classification of data according to their similarities. Here we study this problem from the perspective of complex networks. Mapping the description of data…

Data Analysis, Statistics and Probability · Physics 2012-08-22 Clara Granell, Sergio Gomez, Alex Arenas

On the "Calligraphy" of Books

Authorship attribution is a natural language processing task that has been widely studied, often by considering small order statistics. In this paper, we explore a complex network approach to assign the authorship of texts based on their…

Computation and Language · Computer Science 2017-08-08 Vanessa Q. Marinho, Henrique F. de Arruda, Thales S. Lima, Luciano F. Costa, Diego R. Amancio