
Related papers: CDLM: Cross-Document Language Modeling

200 papers

In this paper, we introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training with shared architectures and objectives. Our approach is motivated by a key…

Computation and Language · Computer Science 2023-06-13 Yan Zeng, Wangchunshu Zhou, Ao Luo, Ziming Cheng, Xinsong Zhang

Recently, diffusion models have excelled in image generation tasks and have also been applied to natural language processing (NLP) for controllable text generation. However, the application of diffusion models in a cross-lingual setting is…

Computation and Language · Computer Science 2023-08-01 Linyao Chen, Aosong Feng, Boming Yang, Zihui Li

The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks. In this work, we propose extending this idea by pre-training a generic multi-document…

Computation and Language · Computer Science 2023-05-25 Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan

Currently, large language models (LLMs) predominantly focus on the text modality. To enable more natural human-AI interaction, speech LLMs are emerging, but building effective end-to-end speech LLMs remains challenging due to limited data…

Computation and Language · Computer Science 2026-04-14 Yan Zhou, Qingkai Fang, Yun Hong, Yang Feng

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better…

Computation and Language · Computer Science 2021-04-08 Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou

Masked language modeling (MLM) has been widely used for pre-training effective bidirectional representations, but incurs substantial training costs. In this paper, we propose a novel concept-based curriculum masking (CCM) method to…

Computation and Language · Computer Science 2022-12-16 Mingyu Lee, Jun-Hyung Park, Junho Kim, Kang-Min Kim, SangKeun Lee

Diffusion Language Models (DLMs) offer a promising parallel generation paradigm but suffer from slow inference due to numerous refinement steps and the inability to use standard KV caching. We introduce CDLM (Consistency Diffusion Language…

Machine Learning · Computer Science 2026-02-23 Minseo Kim, Chenfeng Xu, Coleman Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami

The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task.…

Computation and Language · Computer Science 2021-09-14 Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei

Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities. The visual cues offered by their complex layouts play a…

Computation and Language · Computer Science 2024-01-03 Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu

This paper proposes LayoutLLM, a more flexible document analysis method for understanding imaged documents. Visually Rich Document Understanding tasks, such as document image classification and information extraction, have gained…

Computation and Language · Computer Science 2024-03-22 Masato Fujitake

Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining…

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure. Existing language models generally fail to account for discourse structure, but…

Computation and Language · Computer Science 2016-02-23 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence language model pretrained with large scale parallel documents. While previous approaches have focused on leveraging sentence-level parallel data, we try to build a…

Computation and Language · Computer Science 2022-05-06 Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson

The development of Long-Context Large Language Models (LLMs) has markedly advanced natural language processing by facilitating the process of textual data across long documents and multiple corpora. However, Long-Context LLMs still face two…

Computation and Language · Computer Science 2024-10-10 Jingyang Deng, Zhengyang Shen, Boyang Wang, Lixin Su, Suqi Cheng, Ying Nie, Junfeng Wang, Dawei Yin, Jinwen Ma

Masked language modeling (MLM) is one of the key sub-tasks in vision-language pretraining. In the cross-modal setting, tokens in the sentence are masked at random, and the model predicts the masked tokens given the image and the text. In…

Computation and Language · Computer Science 2021-09-07 Yonatan Bitton, Gabriel Stanovsky, Michael Elhadad, Roy Schwartz

Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially the fixed-layout documents such as scanned document images. While, there are still a large number…

Computation and Language · Computer Science 2022-03-14 Junlong Li, Yiheng Xu, Lei Cui, Furu Wei

In this paper, we propose CLMSM, a domain-specific, continual pre-training framework, that learns from a large set of procedural recipes. CLMSM uses a Multi-Task Learning Framework to optimize two objectives - a) Contrastive Learning using…

Computation and Language · Computer Science 2023-10-24 Abhilash Nandy, Manav Nitin Kapadnis, Pawan Goyal, Niloy Ganguly

Text-rich document understanding (TDU) requires comprehensive analysis of documents containing substantial textual content and complex layouts. While Multimodal Large Language Models (MLLMs) have achieved fast progress in this domain,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Wenhui Liao, Jiapeng Wang, Hongliang Li, Chengyu Wang, Jun Huang, Lianwen Jin

Multimodal learning from document data has achieved great success lately as it allows to pre-train semantically meaningful features as a prior into a learnable downstream task. In this paper, we approach the document classification problem…

Computer Vision and Pattern Recognition · Computer Science 2023-05-12 Souhail Bakkali, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades

Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation, but scaling to document-level translation remains challenging, particularly in modeling long-range dependencies and discourse phenomena…

Computation and Language · Computer Science 2025-08-29 Miguel Moura Ramos, Patrick Fernandes, Sweta Agrawal, André F. T. Martins