Related papers: A Multi-Modal Multilingual Benchmark for Document …

Multilingual and cross-lingual document classification: A meta-learning approach

The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource setting…

Computation and Language · Computer Science 2021-04-27 Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova

Document AI: Benchmarks, Models and Applications

Document AI, or Document Intelligence, is a relatively new research topic that refers to the techniques for automatically reading, understanding, and analyzing business documents. It is an important research direction for natural language…

Computation and Language · Computer Science 2021-11-17 Lei Cui, Yiheng Xu, Tengchao Lv, Furu Wei

A Systematic Comparison of Architectures for Document-Level Sentiment Classification

Documents are composed of smaller pieces - paragraphs, sentences, and tokens - that have complex relationships between one another. Sentiment classification models that take into account the structure inherent in these documents have a…

Computation and Language · Computer Science 2022-02-03 Jeremy Barnes, Vinit Ravishankar, Lilja Øvrelid, Erik Velldal

Multilingual Hierarchical Attention Networks for Document Classification

Hierarchical attention networks have recently achieved remarkable performance for document classification in a given language. However, when multilingual document collections are considered, training such models separately for each language…

Computation and Language · Computer Science 2017-09-18 Nikolaos Pappas, Andrei Popescu-Belis

Beyond Document Page Classification: Design, Datasets, and Challenges

This paper highlights the need to bring document classification benchmarking closer to real-world applications, both in the nature of data tested ($X$: multi-channel, multi-paged, multi-industry; $Y$: class distributions and label set…

Computer Vision and Pattern Recognition · Computer Science 2023-11-01 Jordy Van Landeghem, Sanket Biswas, Matthew B. Blaschko, Marie-Francine Moens

A Comparison of Approaches to Document-level Machine Translation

Document-level machine translation conditions on surrounding sentences to produce coherent translations. There has been much recent work in this area with the introduction of custom model architectures and decoding algorithms. This paper…

Computation and Language · Computer Science 2021-01-28 Zhiyi Ma, Sergey Edunov, Michael Auli

Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification

We consider zero-shot cross-lingual transfer in legal topic classification using the recent MultiEURLEX dataset. Since the original dataset contains parallel documents, which is unrealistic for zero-shot cross-lingual transfer, we develop a…

Computation and Language · Computer Science 2022-06-09 Stratos Xenouleas, Alexia Tsoukara, Giannis Panagiotakis, Ilias Chalkidis, Ion Androutsopoulos

DLUE: Benchmarking Document Language Understanding

Understanding documents is central to many real-world tasks but remains a challenging topic. Unfortunately, there is no well-established consensus on how to comprehensively evaluate document understanding abilities, which significantly…

Computation and Language · Computer Science 2023-05-17 Ruoxi Xu, Hongyu Lin, Xinyan Guan, Xianpei Han, Yingfei Sun, Le Sun

A Corpus for Multilingual Document Classification in Eight Languages

Cross-lingual document classification aims at training a document classifier on resources in one language and transferring it to a different language without any additional resources. Several approaches have been proposed in the literature…

Computation and Language · Computer Science 2018-05-28 Holger Schwenk, Xian Li

DocSplit: A Comprehensive Benchmark Dataset and Evaluation Approach for Document Packet Recognition and Splitting

Document understanding in real-world applications often requires processing heterogeneous, multi-page document packets containing multiple documents stitched together. Despite recent advances in visual document understanding, the…

Computation and Language · Computer Science 2026-02-19 Md Mofijul Islam, Md Sirajus Salekin, Nivedha Balakrishnan, Vincil C. Bishop, Niharika Jain, Spencer Romo, Bob Strahan, Boyi Xie, Diego A. Socolinsky

VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification

Multimodal learning from document data has achieved great success lately as it allows to pre-train semantically meaningful features as a prior into a learnable downstream task. In this paper, we approach the document classification problem…

Computer Vision and Pattern Recognition · Computer Science 2023-05-12 Souhail Bakkali, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades

Document classification methods

Information on different fields which are collected by users requires appropriate management and organization to be structured in a standard way and retrieved fast and more easily. Document classification is a conventional method to…

Information Retrieval · Computer Science 2019-09-18 Madjid Khalilian, Shiva Hassanzadeh

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

Document AI aims to automatically analyze documents by leveraging natural language processing and computer vision techniques. One of the major tasks of Document AI is document layout analysis, which structures document pages by interpreting…

Computation and Language · Computer Science 2023-08-31 Sotirios Kastanas, Shaomu Tan, Yi He

M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation

Document translation poses a challenge for Neural Machine Translation (NMT) systems. Most document-level NMT systems rely on meticulously curated sentence-level parallel data, assuming flawless extraction of text from documents along with…

Computation and Language · Computer Science 2024-06-13 Benjamin Hsu, Xiaoyu Liu, Huayang Li, Yoshinari Fujinuma, Maria Nadejde, Xing Niu, Yair Kittenplon, Ron Litman, Raghavendra Pappagari

Comparative Study of Long Document Classification

The amount of information stored in the form of documents on the internet has been increasing rapidly. Thus it has become a necessity to organize and maintain these documents in an optimum manner. Text classification algorithms study the…

Computation and Language · Computer Science 2022-02-22 Vedangi Wagh, Snehal Khandve, Isha Joshi, Apurva Wani, Geetanjali Kale, Raviraj Joshi

Image search using multilingual texts: a cross-modal learning approach between image and text

Multilingual (or cross-lingual) embeddings represent several languages in a unique vector space. Using a common embedding space enables for a shared semantic between words from different languages. In this paper, we propose to embed images…

Computer Vision and Pattern Recognition · Computer Science 2019-05-15 Maxime Portaz, Hicham Randrianarivo, Adrien Nivaggioli, Estelle Maudet, Christophe Servan, Sylvain Peyronnet

MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer

We introduce MULTI-EURLEX, a new multilingual dataset for topic classification of legal documents. The dataset comprises 65k European Union (EU) laws, officially translated in 23 languages, annotated with multiple labels from the EUROVOC…

Computation and Language · Computer Science 2021-09-08 Ilias Chalkidis, Manos Fergadiotis, Ion Androutsopoulos

GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification

Visual document understanding (VDU) has rapidly advanced with the development of powerful multi-modal language models. However, these models typically require extensive document pre-training data to learn intermediate representations and…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickaël Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós

Efficient Classification of Long Documents Using Transformers

Several methods have been proposed for classifying long textual documents using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches. In this paper, we provide a…

Computation and Language · Computer Science 2022-03-23 Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

We investigate a critical yet under-explored question in Large Vision-Language Models (LVLMs): Do LVLMs genuinely comprehend interleaved image-text in the document? Existing document understanding benchmarks often assess LVLMs using…

Computer Vision and Pattern Recognition · Computer Science 2025-03-31 Haolong Yan, Kaijun Tan, Yeqing Shen, Xin Huang, Zheng Ge, Xiangyu Zhang, Si Li, Daxin Jiang