Related papers: S2Doc -- Spatial-Semantic Document Format

200 papers

We propose SelfDoc, a task-agnostic pre-training framework for document image understanding. Because documents are multimodal and are intended for sequential reading, our framework exploits the positional, textual, and visual information of…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Peizhao Li , Jiuxiang Gu , Jason Kuen , Vlad I. Morariu , Handong Zhao , Rajiv Jain , Varun Manjunatha , Hongfu Liu

Document similarity is the problem of estimating the degree to which a given pair of documents has similar semantic content. An accurate document similarity measure can improve several enterprise relevant tasks such as document clustering,…

Computation and Language · Computer Science 2017-11-15 Gaurav Maheshwari , Priyansh Trivedi , Harshita Sahijwani , Kunal Jha , Sourish Dasgupta , Jens Lehmann

We introduce SmolDocling, an ultra-compact vision-language model targeting end-to-end document conversion. Our model comprehensively processes entire pages by generating DocTags, a new universal markup format that captures all page elements…

Documents are core carriers of information and knowledge, with broad applications in finance, healthcare, and scientific research. Tables, as the main medium for structured data, encapsulate key information and are among the most critical…

Computation and Language · Computer Science 2025-08-15 Xuan Li , Jialiang Dong , Raymond Wong

Designing adaptive documents that are visually appealing across various devices and for diverse viewers is a challenging task. This is due to the wide variety of devices and different viewer requirements and preferences. Alterations to a…

Human-Computer Interaction · Computer Science 2024-10-22 Yue Jiang , Christof Lutteroth , Rajiv Jain , Christopher Tensmeyer , Varun Manjunatha , Wolfgang Stuerzlinger , Vlad Morariu

Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities. The visual cues offered by their complex layouts play a…

Computation and Language · Computer Science 2024-01-03 Dongsheng Wang , Natraj Raman , Mathieu Sibue , Zhiqiang Ma , Petr Babkin , Simerjot Kaur , Yulong Pei , Armineh Nourbakhsh , Xiaomo Liu

Document layout understanding is a field of study that analyzes the spatial arrangement of information in a document hoping to understand its structure and layout. Models such as LayoutLM (and its subsequent iterations) can understand…

Computation and Language · Computer Science 2025-01-13 Pablo Melendez , Clemens Havas

The ability to understand and answer questions over documents can be useful in many business and practical applications. However, documents often contain lengthy and diverse multimodal contents such as texts, figures, and tables, which are…

Computation and Language · Computer Science 2024-11-12 Yew Ken Chia , Liying Cheng , Hou Pong Chan , Chaoqun Liu , Maojia Song , Sharifah Mahani Aljunied , Soujanya Poria , Lidong Bing

In this paper, we present the SimDoc system, a simplification model considering simplicity, readability, and discourse aspects, such as coherence. In the past decade, the progress of the Text Simplification (TS) field has been mostly shown…

Computation and Language · Computer Science 2024-12-30 Laura Vásquez-Rodríguez , Nhung T. H. Nguyen , Piotr Przybyła , Matthew Shardlow , Sophia Ananiadou

Document chunking is a critical task in natural language processing (NLP) that involves dividing a document into meaningful segments. Traditional methods often rely solely on semantic analysis, ignoring the spatial layout of elements, which…

Computation and Language · Computer Science 2025-01-13 Prashant Verma

Documents serve as a crucial and indispensable medium for everyday workplace tasks. However, understanding, interacting and creating such documents on today's planar interfaces without any intelligent support are challenging due to our…

Human-Computer Interaction · Computer Science 2024-11-19 Chen Chen

Data warehouses store and provide access to large volumes of historical data supporting the strategic decisions of organisations. A data warehouse is based on a multidimensional model which allows expressing users' needs for supporting the…

Databases · Computer Science 2012-08-02 Saida Aissi , Mohamed Salah Gouider

Developing document understanding models at enterprise scale requires large, diverse, and well-annotated datasets spanning a wide range of document types. However, collecting such data is prohibitively expensive due to privacy constraints,…

This paper represents an approach to creating global knowledge systems, using new philosophy and infrastructure of global distributed semantic network (frame knowledge representation system) based on the space-time database construction.…

Information Theory · Computer Science 2007-07-16 A. A. Prikhod'ko , N. A. Prikhod'ko

Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and…

Information Retrieval · Computer Science 2021-07-30 Pavlos Fafalios , Vaibhav Kasturia , Wolfgang Nejdl

We present the sTeX+ system, a user-driven advancement of sTeX - a semantic extension of LaTeX that allows for producing high-quality PDF documents for (proof)reading and printing, as well as semantic XML/OMDoc documents for the Web or…

Software Engineering · Computer Science 2010-06-24 Andrea Kohlhase , Michael Kohlhase , Christoph Lange

We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents. Most previous works treat document-grounded dialogue modeling as a machine reading comprehension task based on a single…

Computation and Language · Computer Science 2022-05-04 Song Feng , Siva Sankalp Patel , Hui Wan , Sachindra Joshi

We present a framework to analyze color documents of complex layout. In addition, no assumption is made on the layout. Our framework combines in a content-driven bottom-up approach two different sources of information: textual and spatial.…

Computation and Language · Computer Science 2007-05-23 Marco Aiello , Christof Monz , Leon Todoran

Visual document understanding (VDU) has rapidly advanced with the development of powerful multi-modal language models. However, these models typically require extensive document pre-training data to learn intermediate representations and…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Souhail Bakkali , Sanket Biswas , Zuheng Ming , Mickaël Coustaty , Marçal Rusiñol , Oriol Ramos Terrades , Josep Lladós