Related papers: GVdoc: Graph-based Visual Document Classification

200 papers

The ability of a document classifier to handle inputs that are drawn from a distribution different from the training distribution is crucial for robust deployment and generalizability. The RVL-CDIP corpus is the de facto standard benchmark…

Computer Vision and Pattern Recognition · Computer Science 2023-01-19 Stefan Larson, Gordon Lim, Yutong Ai, David Kuang, Kevin Leach

Visual document understanding (VDU) has rapidly advanced with the development of powerful multi-modal language models. However, these models typically require extensive document pre-training data to learn intermediate representations and…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickaël Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós

Graph embedding provides a feasible methodology to conduct pattern classification for graph-structured data by mapping each data into the vectorial space. Various pioneering works are essentially coding method that concentrates on a…

Machine Learning · Computer Science 2022-10-04 Xue Liu, Dan Sun, Xiaobo Cao, Hao Ye, Wei Wei

In reliable decision-making systems based on machine learning, models have to be robust to distributional shifts or provide the uncertainty of their predictions. In node-level problems of graph learning, distributional shifts can be…

Machine Learning · Computer Science 2023-11-02 Gleb Bazhenov, Denis Kuznedelev, Andrey Malinin, Artem Babenko, Liudmila Prokhorenkova

Learning universal graph representations across heterogeneous domains is difficult because graph datasets differ in topology, node-attribute semantics, feature dimensions, and even attribute availability. We propose GraphVec, a…

Machine Learning · Computer Science 2026-05-08 Qi Feng, Jicong Fan

Learning to generate graphs is challenging as a graph is a set of pairwise connected, unordered nodes encoding complex combinatorial structures. Recently, several works have proposed graph generative models based on normalizing flows or…

Machine Learning · Computer Science 2023-06-21 Xiaohui Chen, Yukun Li, Aonan Zhang, Li-Ping Liu

Document intelligence as a relatively new research topic supports many business applications. Its main task is to automatically read, understand, and analyze documents. However, due to the diversity of formats (invoices, reports, forms,…

Computer Vision and Pattern Recognition · Computer Science 2022-10-25 Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, Jianshu Zhang

Document image classification remains a popular research area because it can be commercialized in many enterprise applications across different industries. Recent advancements in large pre-trained computer vision and language models and…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Jaya Krishna Mandivarapu, Eric Bunch, Qian You, Glenn Fung

Graph embedding is a transformation of nodes of a graph into a set of vectors. A good embedding should capture the graph topology, node-to-node relationship, and other relevant information about the graph, its subgraphs, and nodes. If these…

Social and Information Networks · Computer Science 2022-06-22 Arash Dehghan-Kooshkghazi, Bogumił Kamiński, Łukasz Kraiński, Paweł Prałat, François Théberge

Numerous pre-training techniques for visual document understanding (VDU) have recently shown substantial improvements in performance across a wide range of document tasks. However, these pre-trained VDU models cannot guarantee continued…

Computer Vision and Pattern Recognition · Computer Science 2023-06-06 Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen

Graph learning has been crucial to many real-world tasks, but they are often studied with a closed-world assumption, with all possible labels of data known a priori. To enable effective graph learning in an open and noisy environment, it is…

Machine Learning · Computer Science 2025-08-04 Weijie Guan, Haohui Wang, Jian Kang, Lihui Liu, Dawei Zhou

Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Shah Rukh Qasim, Hassan Mahmood, Faisal Shafait

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification.…

Computation and Language · Computer Science 2018-11-14 Liang Yao, Chengsheng Mao, Yuan Luo

Most existing named entity recognition (NER) approaches are based on sequence labeling models, which focus on capturing the local context dependencies. However, the way of taking one sentence as input prevents the modeling of non-sequential…

Computation and Language · Computer Science 2021-06-03 Zanbo Wang, Wei Wei, Xianling Mao, Shanshan Feng, Pan Zhou, Zhiyong He, Sheng Jiang

Lifelong graph learning deals with the problem of continually adapting graph neural network (GNN) models to changes in evolving graphs. We address two critical challenges of lifelong graph learning in this work: dealing with new classes and…

Machine Learning · Computer Science 2023-05-10 Lukas Galke, Iacopo Vagliano, Benedikt Franke, Tobias Zielke, Marcel Hoffmann, Ansgar Scherp

Graph Convolutional Networks (GCNs) have shown strong performance in learning text representations for various tasks such as text classification, due to its expressive power in modeling graph structure data (e.g., a literature citation…

Computation and Language · Computer Science 2023-05-12 Zhibin Lu, Qianqian Xie, Benyou Wang, Jian-yun Nie

Accurate classification of multi-modal financial documents, containing text, tables, charts, and images, is crucial but challenging. Traditional text-based approaches often fail to capture the complex multi-modal nature of these documents.…

Information Retrieval · Computer Science 2024-06-05 Anjanava Biswas, Wrick Talukdar

In light of the recent success of Graph Neural Networks (GNNs) and their ability to perform inference on complex data structures, many studies apply GNNs to the task of text classification. In most previous methods, a heterogeneous graph,…

Machine Learning · Computer Science 2024-10-29 Yassine Abbahaddou, Johannes F. Lutzeyer, Michalis Vazirgiannis

Given the success of Graph Neural Networks (GNNs) for structure-aware machine learning, many studies have explored their use for text classification, but mostly in specific domains with limited data characteristics. Moreover, some…

Computation and Language · Computer Science 2024-01-23 Margarita Bugueño, Gerard de Melo

In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and…

Machine Learning · Computer Science 2025-10-31 Shuo Jiang, Jie Hu, Christopher L. Magee, Jianxi Luo