Related papers: DocReLM: Mastering Document Retrieval with Languag…

200 papers

Information retrieval systems are crucial for enabling effective access to large document collections. Recent approaches have leveraged Large Language Models (LLMs) to enhance retrieval performance through query augmentation, but often rely…

Information Retrieval · Computer Science 2025-04-15 Pengcheng Jiang, Jiacheng Lin, Lang Cao, Runchu Tian, SeongKu Kang, Zifeng Wang, Jimeng Sun, Jiawei Han

Scientific paper retrieval is essential for supporting literature discovery and research. While dense retrieval methods demonstrate effectiveness in general-purpose tasks, they often fail to capture fine-grained scientific concepts that are…

Information Retrieval · Computer Science 2025-10-07 Yunyi Zhang, Ruozhen Yang, Siqi Jiao, SeongKu Kang, Jiawei Han

Large Language Models (LLMs) have shown strong capabilities in document re-ranking, a key component in modern Information Retrieval (IR) systems. However, existing LLM-based approaches face notable limitations, including ranking…

Information Retrieval · Computer Science 2025-10-03 Pinhuan Wang, Zhiqiu Xia, Chunhua Liao, Feiyi Wang, Hang Liu

Despite the dramatic progress in Large Language Model (LLM) development, LLMs often provide seemingly plausible but not factual information, often referred to as hallucinations. Retrieval-augmented LLMs provide a non-parametric approach to…

Computation and Language · Computer Science 2023-11-09 Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

The exponential growth of scientific literature in PDF format necessitates advanced tools for efficient and accurate document understanding, summarization, and content optimization. Traditional methods fall short in handling complex layouts…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Kun Qian, Wenjie Li, Tianyu Sun, Wenhong Wang, Wenhan Luo

Large language models (LLMs) have gained significant attention in various fields but are prone to hallucination, especially in knowledge-intensive (KI) tasks. To address this, retrieval-augmented generation (RAG) has emerged as a popular…

Computation and Language · Computer Science 2024-04-23 Xiaoxi Li, Zhicheng Dou, Yujia Zhou, Fangchao Liu

Document retrieval is an important task for search and Retrieval-Augmented Generation (RAG) applications. Large Language Models (LLMs) have contributed to improving the accuracy of text-based document retrieval. However, documents with…

Information Retrieval · Computer Science 2025-05-22 Radek Osmulski, Gabriel de Souza P. Moreira, Ronay Ak, Mengyao Xu, Benedikt Schifferer, Even Oldridge

Recently, there has been a growing interest among large language model (LLM) developers in LLM-based document reading systems, which enable users to upload their own documents and pose questions related to the document contents, going…

Computation and Language · Computer Science 2024-07-16 Anni Zou, Wenhao Yu, Hongming Zhang, Kaixin Ma, Deng Cai, Zhuosheng Zhang, Hai Zhao, Dong Yu

Statutory law retrieval is a typical problem in legal language processing that has various practical applications in law engineering. Modern deep learning-based retrieval methods have achieved significant results for this problem. However,…

Computation and Language · Computer Science 2024-10-17 Hai-Long Nguyen, Tan-Minh Nguyen, Duc-Minh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen, Xuan-Hieu Phan

The Retrieval-Augmented Language Model (RALM) has shown remarkable performance on knowledge-intensive tasks by incorporating external knowledge during inference, which mitigates the factual hallucinations inherent in large language models…

Computation and Language · Computer Science 2024-12-20 Yuan Xia, Jingbo Zhou, Zhenhui Shi, Jun Chen, Haifeng Huang

Large language models (LLMs) inherently display hallucinations since the precision of generated texts cannot be guaranteed purely by the parametric knowledge they include. Although retrieval-augmented generation (RAG) systems enhance the…

Artificial Intelligence · Computer Science 2025-02-18 Bingyu Wan, Fuxi Zhang, Zhongpeng Qi, Jiayi Ding, Jijun Li, Baoshi Fan, Yijia Zhang, Jun Zhang

Recent regulatory initiatives like the European AI Act and relevant voices in the Machine Learning (ML) community stress the need to describe datasets along several key dimensions for trustworthy AI, such as the provenance processes and…

Digital Libraries · Computer Science 2024-05-27 Joan Giner-Miguelez, Abel Gómez, Jordi Cabot

Scientific progress depends on researchers' ability to synthesize the growing body of literature. Can large language models (LMs) assist scientists in this task? We introduce OpenScholar, a specialized retrieval-augmented LM that answers…

Large language models (LLMs) are incredible and versatile tools for text-based tasks that have enabled countless, previously unimaginable, applications. Retrieval models, in contrast, have not yet seen such capable general-purpose models…

Information Retrieval · Computer Science 2025-09-10 Julian Killingback, Hamed Zamani

This paper presents a procedure to retrieve subsets of relevant documents from large text collections for Content Analysis, e.g. in social sciences. Document retrieval for this purpose needs to take account of the fact that analysts often…

Information Retrieval · Computer Science 2017-07-12 Gregor Wiedemann, Andreas Niekler

The rapid advancement of Large Language Models (LLMs) has led to a multitude of application opportunities. One traditional task for Information Retrieval systems is the summarization and classification of texts, both of which are important…

Computation and Language · Computer Science 2025-02-25 Gautam Kishore Shahi, Oliver Hummel

Utilizing large language models (LLMs) for document reranking has been a popular and promising research direction in recent years, and many studies are dedicated to improving the performance and efficiency of using LLMs for reranking. Besides,…

Information Retrieval · Computer Science 2025-04-11 Qi Liu, Haozhe Duan, Yiqun Chen, Quanfeng Lu, Weiwei Sun, Jiaxin Mao

Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring…

Computation and Language · Computer Science 2020-02-21 Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang

Large Language Model (LLM) pre-training exhausts an ever growing compute budget, yet recent research has demonstrated that careful document selection enables comparable model quality with only a fraction of the FLOPs. Inspired by efforts…

Computation and Language · Computer Science 2024-06-10 Xiang Kong, Tom Gunter, Ruoming Pang

Text retrieval is the task of retrieving documents similar to a search query, and it is important to improve retrieval accuracy while maintaining a certain level of retrieval speed. Existing studies have reported accuracy improvements…

Information Retrieval · Computer Science 2023-11-15 Yuichi Sasazawa, Kenichi Yokote, Osamu Imaichi, Yasuhiro Sogawa