Related papers: Enhancing Content-And-Structure Information Retrie…

Hybrid XML Retrieval: Combining Information Retrieval and a Native XML Database

This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and…

Information Retrieval · Computer Science 2007-05-23 Jovan Pehcevski , James A. Thom , Anne-Marie Vercoustre

Interprétation vague des contraintes structurelles pour la RI dans des corpus de documents XML - Évaluation d'une méthode approchée de RI structurée

We propose specific data structures designed for the indexing and retrieval of information elements in heterogeneous XML databases. The indexing scheme is well suited to the management of various contextual searches, expressed either at a…

Information Retrieval · Computer Science 2008-12-18 Eugen Popovici , Gilbas Ménier , Pierre-François Marteau

XML Information Retrieval Systems: A Survey

The continuous growth in XML information repositories has been matched by increasing efforts in the development of XML retrieval systems, in large part aiming at supporting content-oriented XML retrieval. These systems exploit the…

Information Retrieval · Computer Science 2011-11-29 Awny Sayed

Element Retrieval using Namespace Based on keyword search over XML Documents

Querying over XML elements using keyword search is steadily gaining popularity. The traditional similarity measure is widely employed in order to effectively retrieve various XML documents. A number of authors have already proposed…

Information Retrieval · Computer Science 2010-12-20 Yang Wang , Zhikui Chen , Xiaodi Huang

Ontology-driven personalized information retrieval for XML documents

This paper addresses the challenge of improving information retrieval from semi-structured eXtensible Markup Language (XML) documents. Traditional information retrieval systems (IRS) often overlook user-specific needs and return identical…

Information Retrieval · Computer Science 2026-03-24 Ounnaci Iddir , Ahmed-ouamer Rachid , Tai Dinh

Mining Semi-structured Data

The need for discovering knowledge from XML documents according to both structure and content features has become challenging, due to the increase in application contexts for which handling both structure and content information in XML data…

Databases · Computer Science 2015-04-17 Olfa Arfaoui , Minyar Sassi Hidri

XML Information Retrieval: An overview

Locating and distilling valuable, relevant information continue to be major challenges for Information Retrieval (IR) systems owing to the explosive growth of online web information. These challenges can be considered the XML…

Information Retrieval · Computer Science 2014-10-29 Suma D. , U. Dinesh Acharya , Geetha M. , Raviraja Holla M

A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems

In enterprise settings, efficiently retrieving relevant information from large and complex knowledge bases is essential for operational productivity and informed decision-making. This research presents a systematic empirical framework for…

Information Retrieval · Computer Science 2026-04-01 Pranav Pushkar Mishra , Kranti Prakash Yeole , Ramyashree Keshavamurthy , Mokshit Bharat Surana , Fatemeh Sarayloo

Holistic evaluation of XML queries with structural preferences on an annotated strong dataguide

With the emergence of XML as the de facto format for storing and exchanging information over the Internet, the search for ever more innovative and effective techniques for querying XML data is a major and current concern of the XML database…

Databases · Computer Science 2019-06-20 Maurice Tchoupé Tchendji , Adolphe Gaius Nkuefone , Thomas Tébougang Tchendji

REXEL: An End-to-end Model for Document-Level Relation Extraction and Entity Linking

Extracting structured information from unstructured text is critical for many downstream NLP applications and is traditionally achieved by closed information extraction (cIE). However, existing approaches for cIE suffer from two…

Computation and Language · Computer Science 2024-04-22 Nacime Bouziani , Shubhi Tyagi , Joseph Fisher , Jens Lehmann , Andrea Pierleoni

Expériences de classification d'une collection de documents XML de structure homogène

This paper presents some experiments in clustering homogeneous XML documents to validate an existing classification or more generally an organisational structure. Our approach integrates techniques for extracting knowledge from documents with…

Information Retrieval · Computer Science 2007-05-23 Thierry Despeyroux , Yves Lechevallier , Brigitte Trousse , Anne-Marie Vercoustre

Experiments in Clustering Homogeneous XML Documents to Validate an Existing Typology

This paper presents some experiments in clustering homogeneous XML documents to validate an existing classification or more generally an organisational structure. Our approach integrates techniques for extracting knowledge from documents with…

Information Retrieval · Computer Science 2007-05-23 Thierry Despeyroux , Yves Lechevallier , Brigitte Trousse , Anne-Marie Vercoustre

Towards an All-Purpose Content-Based Multimedia Information Retrieval System

The growth of multimedia collections - in terms of size, heterogeneity, and variety of media types - necessitates systems that are able to conjointly deal with several forms of media, especially when it comes to searching for particular…

Multimedia · Computer Science 2019-02-12 Ralph Gasser , Luca Rossetto , Heiko Schuldt

A Flexible Structured-based Representation for XML Document Mining

This paper reports on the INRIA group's approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allows taking into account the structure only or both the structure…

Information Retrieval · Computer Science 2007-05-23 Anne-Marie Vercoustre , Mounir Fegas , Saba Gul , Yves Lechevallier

Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy

Document retrieval has greatly benefited from the advancements of large-scale pre-trained language models (PLMs). However, their effectiveness is often limited in theme-specific applications for specialized areas or industries, due to…

Information Retrieval · Computer Science 2024-03-08 SeongKu Kang , Shivam Agarwal , Bowen Jin , Dongha Lee , Hwanjo Yu , Jiawei Han

Users and Assessors in the Context of INEX: Are Relevance Dimensions Relevant?

The main aspects of XML retrieval are identified by analysing and comparing the following two behaviours: the behaviour of the assessor when judging the relevance of returned document components; and the behaviour of users when interacting…

Information Retrieval · Computer Science 2019-05-01 Jovan Pehcevski , James A. Thom , Anne-Marie Vercoustre

TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents

Recently, automatically extracting information from visually rich documents (e.g., tickets and resumes) has become a hot and vital research topic due to its widespread commercial value. Most existing methods divide this task into two…

Computer Vision and Pattern Recognition · Computer Science 2022-07-15 Zhanzhan Cheng , Peng Zhang , Can Li , Qiao Liang , Yunlu Xu , Pengfei Li , Shiliang Pu , Yi Niu , Fei Wu

Structural Consistency: Enabling XML Keyword Search to Eliminate Spurious Results Consistently

XML keyword search is a user-friendly way to query XML data using only keywords. In XML keyword search, to achieve high precision without sacrificing recall, it is important to remove spurious results not intended by the user. Efforts to…

Databases · Computer Science 2009-11-24 Ki-Hoon Lee , Kyu-Young Whang , Wook-Shin Han , Min-Soo Kim

XML Data Integrity Based on Concatenated Hash Function

Data integrity is fundamental to data authentication. A major problem for XML data authentication is that signed XML data can be copied to another document but still keep the signature valid. This is caused by XML data integrity…

Software Engineering · Computer Science 2009-06-23 Baolong Liu , Joan Lu , Jim Yip

A Feature Analysis for Multimodal News Retrieval

Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the…

Computation and Language · Computer Science 2020-10-02 Golsa Tahmasebzadeh , Sherzod Hakimov , Eric Müller-Budack , Ralph Ewerth