Related papers: READ: Recursive Autoencoders for Document Layout G…

200 papers

While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge. This paper delves into this advanced domain, proposing a novel…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-13 · Sanket Biswas, Rajiv Jain, Vlad I. Morariu, Jiuxiang Gu, Puneet Mathur, Curtis Wigington, Tong Sun, Josep Lladós

Document parsing from scanned images into structured formats remains a significant challenge due to its complexly intertwined elements such as text paragraphs, figures, formulas, and tables. Existing supervised fine-tuning methods often…

Computation and Language · Computer Science · 2025-10-21 · Baode Wang, Biao Wu, Weizhen Li, Meng Fang, Zuming Huang, Jun Huang, Haozhe Wang, Yanjie Liang, Ling Chen, Wei Chu, Yuan Qi

We introduce a novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures. Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which…

Graphics · Computer Science · 2017-05-16 · Jun Li, Kai Xu, Siddhartha Chaudhuri, Ersin Yumer, Hao Zhang, Leonidas Guibas

Analyzing the layout of a document to identify headers, sections, tables, figures etc. is critical to understanding its content. Deep learning based approaches for detecting the layout structure of document images have been promising.…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-26 · Natraj Raman, Sameena Shah, Manuela Veloso

Reading order detection is the cornerstone to understanding visually-rich documents (e.g., receipts and forms). Unfortunately, no existing work took advantage of advanced deep learning models because it is too laborious to annotate a large…

Computation and Language · Computer Science · 2021-08-30 · Zilong Wang, Yiheng Xu, Lei Cui, Jingbo Shang, Furu Wei

Document reconstruction constitutes a significant facet of document analysis and recognition, a field that has been progressively accruing interest within the scholarly community. A multitude of these researchers employ an array of document…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-09 · Xin Li, Mingming Gong, Yunfei Wu, Jianxin Dai, Antai Guo, Xinghua Jiang, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun

Automated parsing of scanned documents into richly structured, machine-readable formats remains a critical bottleneck in Document AI, as traditional multi-stage pipelines suffer from error propagation and limited adaptability to diverse…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-22 · Baode Wang, Biao Wu, Weizhen Li, Meng Fang, Zuming Huang, Jun Huang, Haozhe Wang, Yanjie Liang, Ling Chen, Wei Chu, Yuan Qi

Document layout analysis involves understanding the arrangement of elements within a document. This paper navigates the complexities of understanding various elements within document images, such as text, images, tables, and headings. The…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-02 · Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

We present an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection (SLLD), a shared subtask of several information extraction problems. Scientific…

Computer Vision and Pattern Recognition · Computer Science · 2020-10-23 · Huichen Yang, William H. Hsu

In recent years, the use of multi-modal pre-trained Transformers has led to significant advancements in visually-rich document understanding. However, existing models have mainly focused on features such as text and vision while neglecting…

Computation and Language · Computer Science · 2023-08-16 · Qiwei Li, Zuchao Li, Xiantao Cai, Bo Du, Hai Zhao

Controllable layout generation aims to create plausible visual arrangements of element bounding boxes within a graphic design according to certain optional constraints, such as the type or position of a specific component. While recent…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-04 · Yuxuan Wu, Le Wang, Sanping Zhou, Mengnan Liu, Gang Hua, Haoxiang Li

Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to…

Computer Vision and Pattern Recognition · Computer Science · 2021-07-07 · Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal

While large language models (LLMs) demonstrate impressive capabilities, their reliance on parametric knowledge often leads to factual inaccuracies. Retrieval-Augmented Generation (RAG) mitigates this by leveraging external documents, yet…

Computation and Language · Computer Science · 2025-10-07 · Lingnan Xu, Chong Feng, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng

The number of published PDF documents has increased exponentially in recent decades. There is a growing need to make their rich content discoverable to information retrieval tools. In this paper, we present a novel approach to document…

Question answering over visually rich documents (VRDs) requires reasoning not only over isolated content but also over documents' structural organization and cross-page dependencies. However, conventional retrieval-augmented generation…

Computation and Language · Computer Science · 2026-03-03 · Zhivar Sourati, Zheng Wang, Marianne Menglin Liu, Yazhe Hu, Mengqing Guo, Sujeeth Bharadwaj, Kyu Han, Tao Sheng, Sujith Ravi, Morteza Dehghani, Dan Roth

Document layout understanding is a field of study that analyzes the spatial arrangement of information in a document hoping to understand its structure and layout. Models such as LayoutLM (and its subsequent iterations) can understand…

Computation and Language · Computer Science · 2025-01-13 · Pablo Melendez, Clemens Havas

Deep generative models have been used in recent years to learn coherent latent representations in order to synthesize high-quality images. In this work, we propose a neural network to learn a generative model for sampling consistent indoor…

Computer Vision and Pattern Recognition · Computer Science · 2020-08-24 · Pulak Purkait, Christopher Zach, Ian Reid

Different layouts can characterize different aspects of the same graph. Finding a "good" layout of a graph is thus an important task for graph visualization. In practice, users often visualize a graph in multiple layouts by using different…

Social and Information Networks · Computer Science · 2019-10-16 · Oh-Hyun Kwon, Kwan-Liu Ma

Designers craft and edit graphic designs in a layer representation, but layer-based editing becomes impossible once composited into a raster image. In this work, we propose LayerD, a method to decompose raster graphic designs into layers…

Graphics · Computer Science · 2025-09-30 · Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue, Kota Yamaguchi

Book covers are intentionally designed and provide an introduction to a book. However, they typically require professional skills to design and produce the cover images. Thus, we propose a generative neural network that can produce book…

Computer Vision and Pattern Recognition · Computer Science · 2021-06-16 · Wensheng Zhang, Yan Zheng, Taiga Miyazono, Seiichi Uchida, Brian Kenji Iwana