Related papers: Automatic Page Segmentation Without Decompressing …

200 papers

From the literature, it is demonstrated that performing text-line segmentation directly in the run-length compressed handwritten document images significantly reduces the computational time and memory space. In this paper, we investigate…

Computer Vision and Pattern Recognition · Computer Science · 2019-09-12 · Amarnath R, P. Nagabhushan, Mohammed Javed

With the rapid increase in the volume of Big data of this digital era, fax documents, invoices, receipts, etc. are traditionally subjected to compression for the efficiency of data storage and transfer. However, in order to process these…

Computer Vision and Pattern Recognition · Computer Science · 2014-10-15 · Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri

Segmentation of a text-document into lines, words and characters, which is considered to be the crucial pre-processing stage in Optical Character Recognition (OCR) is traditionally carried out on uncompressed documents, although most of the…

Computer Vision and Pattern Recognition · Computer Science · 2014-04-01 · Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri

Line separators are used to segregate text-lines from one another in document image analysis. Finding the separator points at every line terminal in a document image would enable text-line segmentation. In particular, identifying the…

Computer Vision and Pattern Recognition · Computer Science · 2017-08-21 · Amarnath R, P. Nagabhushan

There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such…

Computer Vision and Pattern Recognition · Computer Science · 2007-05-23 · Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet

JPEG is one of the popular image compression algorithms that provide efficient storage and transmission capabilities in consumer electronics, and hence it is the most preferred image format over the internet world. In the present digital…

Computer Vision and Pattern Recognition · Computer Science · 2019-07-30 · Bulla Rajesh, Mohammed Javed, P Nagabhushan

This paper presents a semi-automatic method used to segment color documents into different uniform color plans. The practical application is dedicated to administrative documents segmentation. In these documents, like in many other cases,…

Computer Vision and Pattern Recognition · Computer Science · 2016-09-28 · Stéphane Bres, Véronique Eglin, Vincent Poulain

In this research work, we perform text line segmentation directly in compressed representation of an unconstrained handwritten document image. In this relation, we make use of text line terminal points which is the current state-of-the-art.…

Computer Vision and Pattern Recognition · Computer Science · 2019-02-01 · Amarnath R, P Nagabhushan

Text segmentation, the task of dividing a document into sections, is often a prerequisite for performing additional natural language processing tasks. Existing text segmentation methods have typically been developed and tested using clean,…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-21 · Carol Anderson, Phil Crone

Extracting a block of interest, referred to as segmenting a specified block in an image, and studying its characteristics is of general research interest, and could be challenging if such a segmentation task has to be carried out directly…

Computer Vision and Pattern Recognition · Computer Science · 2014-02-19 · Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri

Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding. Previous work on text segmentation focused on unsupervised methods such as…

Computation and Language · Computer Science · 2018-03-28 · Omri Koshorek, Adir Cohen, Noam Mor, Michael Rotman, Jonathan Berant

Automatic detection of font size finds many applications in the area of intelligent OCRing and document image analysis, which has been traditionally practiced over uncompressed documents, although in real life the documents exist in…

Computer Vision and Pattern Recognition · Computer Science · 2014-02-19 · Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri

Document segmentation is a method of rending the document into distinct regions. A document is an assortment of information and a standard mode of conveying information to others. Pursuance of data from documents involves ton of human…

Computer Vision and Pattern Recognition · Computer Science · 2013-03-05 · N. Priyadharshini, M. S. Vijaya

Page segmentation is a web page analysis process that divides a page into cohesive segments, such as sidebars, headers, and footers. Current page segmentation approaches use either the DOM, textual content, or rendering style information of…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-23 · Mohammad Bajammal, Ali Mesbah

Programs for extracting structured information from text, namely information extractors, often operate separately on document segments obtained from a generic splitting operation such as sentences, paragraphs, k-grams, HTTP requests, and so…

Databases · Computer Science · 2021-05-21 · Johannes Doleschal, Benny Kimelfeld, Wim Martens, Frank Neven, Matthias Niewerth

Linear Text Segmentation is the task of automatically tagging text documents with topic shifts, i.e. the places in the text where the topics change. A well-established area of research in Natural Language Processing, drawing from…

Computation and Language · Computer Science · 2024-11-26 · Iacopo Ghinassi, Lin Wang, Chris Newell, Matthew Purver

Text segmentation is important for signaling a document's structure. Without segmenting a long document into topically coherent sections, it is difficult for readers to comprehend the text, let alone find important information. The problem…

Computation and Language · Computer Science · 2022-11-01 · Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Fei Liu, Dong Yu

In this paper, we exploit the innate document segment structure for improving the extractive summarization task. We build two text segmentation models and find the most optimal strategy to introduce their output predictions in an extractive…

Computation and Language · Computer Science · 2023-01-24 · Lesly Miculicich, Benjamin Han

Text semantic segmentation involves partitioning a document into multiple paragraphs with continuous semantics based on the subject matter, contextual information, and document structure. Traditional approaches have typically relied on…

Computation and Language · Computer Science · 2025-04-03 · Tongke Ni, Yang Fan, Junru Zhou, Xiangping Wu, Qingcai Chen

Easy Read text is one of the main forms of access to information for people with reading difficulties. One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments, to…

Computation and Language · Computer Science · 2025-07-21 · Jesús Calleja, Thierry Etchegoyhen, David Ponce