Related papers: Document Image Coding and Clustering for Script Di…

200 papers

The paper presents a new script classification method for the discrimination of the South Slavic medieval labels. It consists in the textural analysis of the script types. In the first step, each letter is coded by the equivalent script…

Computer Vision and Pattern Recognition · Computer Science 2015-09-08 Darko Brodic, Alessia Amelio, Zoran N. Milivojevic

Document segmentation is a method of rending the document into distinct regions. A document is an assortment of information and a standard mode of conveying information to others. Pursuance of data from documents involves a ton of human…

Computer Vision and Pattern Recognition · Computer Science 2013-03-05 N. Priyadharshini, M. S. Vijaya

Document clustering is an unsupervised approach in which a large collection of documents (corpus) is subdivided into smaller, meaningful, identifiable, and verifiable sub-groups (clusters). Meaningful representation of documents and…

Information Retrieval · Computer Science 2014-12-08 Muhammad Rafi, Farnaz Amin, Mohammad Shahid Shaikh

Information on different fields which are collected by users requires appropriate management and organization to be structured in a standard way and retrieved fast and more easily. Document classification is a conventional method to…

Information Retrieval · Computer Science 2019-09-18 Madjid Khalilian, Shiva Hassanzadeh

The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents such as the World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi, M. Shahid Shaikh, Amir Farooq

Text Clustering is a text mining technique which divides the given set of text documents into significant clusters. It is used for organizing a huge number of text documents into a well-organized form. In the majority of the clustering…

Information Retrieval · Computer Science 2015-03-12 G. Hannah Grace, Kalyani Desikan

We propose a novel clustering pipeline to detect and characterize influence campaigns from documents. This approach clusters parts of documents, detects clusters that likely reflect an influence campaign, and then identifies documents linked…

Computation and Language · Computer Science 2024-04-30 Zhengxiang Wang, Owen Rambow

The paper proposes an algorithm for the script recognition based on the texture characteristics. The image texture is achieved by coding each letter with the equivalent script type (number code) according to its position in the text line.…

Computer Vision and Pattern Recognition · Computer Science 2015-09-01 Darko Brodic, Zoran N. Milivojevic, Alessia Amelio

The analysis of historical documents is still a topical issue given the importance of information that can be extracted and also the importance given by the institutions to preserve their heritage. The main idea in order to characterize the…

Computer Vision and Pattern Recognition · Computer Science 2013-08-30 Nizar Zaghden, Remy Mullot, Mohamed Adel Alimi

Knowledge of the source printer can help in printed text document authentication and copyright ownership, and can provide important clues about the author of a fraudulent document along with his/her potential means and motives. Development of…

Multimedia · Computer Science 2019-06-18 Sharad Joshi, Nitin Khanna

Text line segmentation is one of the pre-stages of modern optical character recognition systems. The algorithmic approach proposed by this paper has been designed for this exact purpose. Its main characteristic is the combination of two…

Computer Vision and Pattern Recognition · Computer Science 2023-06-22 Pit Schneider

In this paper, we show how selecting and combining encodings of natural and mathematical language affect classification and clustering of documents with mathematical content. We demonstrate this by using sets of documents, sections, and…

Digital Libraries · Computer Science 2020-05-25 Philipp Scharpf, Moritz Schubotz, Abdou Youssef, Felix Hamborg, Norman Meuschke, Bela Gipp

Typography and layout lead to the hierarchical organisation of text in words, text lines, paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing…

Computer Vision and Pattern Recognition · Computer Science 2024-10-30 Lluis Gomez, Dimosthenis Karatzas

A major computational burden when performing document clustering is the calculation of the similarity measure between a pair of documents. A similarity measure is a function that assigns a real number between 0 and 1 to a pair of documents,…

Information Retrieval · Computer Science 2012-08-20 Muhammad Rafi, Sundus Hassan, Mohammad Shahid Shaikh

The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging problem of high dimensionality. Unique handwriting styles may be dissimilar in a blend of several factors including character size,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-30 Sriparna Majumdar, Aaron Brick

India is a multilingual, multi-script country. In every state of India there are two languages: one is the state's local language and the other is English. For example, in Andhra Pradesh, a state in India, the document may contain text words in…

Computer Vision and Pattern Recognition · Computer Science 2012-05-11 Ankit Kumar, Tushar Patnaik, Vivek Kr Verma

Text Categorization is traditionally done by using the term frequency and inverse document frequency. This type of method is not very good because some words which are not so important may appear in the document. The term frequency of…

Information Retrieval · Computer Science 2016-11-25 Srikanth Bethu, G Charless Babu, J Vinoda, E Priyadarshini, M Raghavendra Rao

Text Document Clustering is one of the fastest growing research areas because of the availability of a huge amount of information in electronic form. A number of techniques have been introduced for clustering documents in such a way that…

Information Retrieval · Computer Science 2014-01-13 R. Jensi, Dr. G. Wiselin Jiji

Text clustering holds significant value across various domains due to its ability to identify patterns and group related information. Current approaches which rely heavily on a computed similarity measure between documents are often limited…

Information Retrieval · Computer Science 2025-04-09 Laurence Hirsch, Robin Hirsch, Bayode Ogunleye

Traditional clustering methods aim to group unlabeled data points based on their similarity to each other. However, clustering, in the absence of additional information, is an ill-posed problem as there may be many different, yet equally…

Computer Vision and Pattern Recognition · Computer Science 2025-09-10 Bingchen Zhao, Oisin Mac Aodha