Related papers: DescribeX: A Framework for Exploring and Querying …

Generating Concise and Readable Summaries of XML Documents

XML has become the de-facto standard for data representation and exchange, resulting in large scale repositories and warehouses of XML data. In order for users to understand and explore these large collections, a summarized, bird's eye view…

Information Retrieval · Computer Science 2009-10-14 Maya Ramanath, Kondreddi Sarath Kumar, Georgiana Ifrim

Fast and Tiny Structural Self-Indexes for XML

XML document markup is highly repetitive and therefore well compressible using dictionary-based methods such as DAGs or grammars. In the context of selectivity estimation, grammar-compressed trees were used before as synopsis for structural…

Databases · Computer Science 2010-12-30 Sebastian Maneth, Tom Sebastian

XTQ: A Declarative Functional XML Query Language

Various query languages have been proposed to extract and restructure information in XML documents. These languages, usually claiming to be declarative, mainly consider the conjunctive relationships among data elements. In order to present…

Programming Languages · Computer Science 2014-06-06 Xuhui Li, Mengchi Liu, Shanfeng Zhu, Arif Ghafoor

EquiX--A Search and Query Language for XML

EquiX is a search language for XML that combines the power of querying with the simplicity of searching. Requirements for such languages are discussed and it is shown that EquiX meets the necessary criteria. Both a graph-based abstract…

Databases · Computer Science 2007-05-23 Sara Cohen, Yaron Kanza, Yakov Kogan, Werner Nutt, Yehoshua Sagiv, Alexander Serebrenik

Processing XML for Domain Specific Languages

XML is a standard and universal language for representing information. XML processing is supported by two key frameworks: DOM and SAX. SAX is efficient, but leaves the developer to encode much of the processing. This paper introduces a…

Formal Languages and Automata Theory · Computer Science 2015-06-11 Tony Clark

EquiX---A Search and Query Language for XML

EquiX is a search language for XML that combines the power of querying with the simplicity of searching. Requirements for such languages are discussed and it is shown that EquiX meets the necessary criteria. Both a graphical abstract syntax…

Databases · Computer Science 2007-05-23 Sara Cohen, Yaron Kanza, Yakov Kogan, Werner Nutt, Yehoshua Sagiv, Alexander Serebrenik

Explaining Documents' Relevance to Search Queries

We present GenEx, a generative model to explain search results to users beyond just showing matches between query and document words. Adding GenEx explanations to search results greatly impacts user satisfaction and search performance.…

Information Retrieval · Computer Science 2021-11-03 Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan

Path Summaries and Path Partitioning in Modern XML Databases

We study the applicability of XML path summaries in the context of current-day XML databases. We find that summaries provide an excellent basis for optimizing data access methods, which furthermore mixes very well with path-partitioned…

Databases · Computer Science 2007-05-23 Andrei Arion, Angela Bonifati, Ioana Manolescu, Andrea Pugliese

SurveyX: Academic Survey Automation via Large Language Models

Large Language Models (LLMs) have demonstrated exceptional comprehension capabilities and a vast knowledge base, suggesting that LLMs can serve as efficient tools for automated survey generation. However, recent research related to…

Computation and Language · Computer Science 2025-02-28 Xun Liang, Jiawei Yang, Yezhaohui Wang, Chen Tang, Zifan Zheng, Shichao Song, Zehao Lin, Yebin Yang, Simin Niu, Hanyu Wang, Bo Tang, Feiyu Xiong, Keming Mao, Zhiyu Li

Finding XPath Bugs in XML Document Processors via Differential Testing

Extensible Markup Language (XML) is a widely used file format for data storage and transmission. Many XML processors support XPath, a query language that enables the extraction of elements from XML documents. These systems can be affected…

Software Engineering · Computer Science 2024-01-11 Shuxin Li, Manuel Rigger

A Flexible Structured-based Representation for XML Document Mining

This paper reports on the INRIA group's approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allows taking into account the structure only or both the structure…

Information Retrieval · Computer Science 2007-05-23 Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yves Lechevallier

Automated Text Summarization Base on Lexicales Chain and graph Using of WordNet and Wikipedia Knowledge Base

The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Nowadays, document summarization plays an important role in information retrieval. With a large volume of…

Information Retrieval · Computer Science 2012-04-10 Mohsen Pourvali, Mohammad Saniee Abadeh

A Divide-and-Conquer Approach to the Summarization of Long Documents

We present a novel divide-and-conquer method for the neural summarization of long documents. Our method exploits the discourse structure of the document and uses sentence similarity to split the problem into an ensemble of smaller…

Computation and Language · Computer Science 2020-09-24 Alexios Gidiotis, Grigorios Tsoumakas

Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles

Multi-document summarization is a challenging task for which there exists little large-scale datasets. We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Multi-XScience introduces…

Computation and Language · Computer Science 2020-10-28 Yao Lu, Yue Dong, Laurent Charlin

Fast In-Memory XPath Search over Compressed Text and Tree Indexes

A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed…

Databases · Computer Science 2011-10-06 A. Arroyuelo, F. Claude, S. Maneth, V. Mäkinen, G. Navarro, K. Nguyen, J. Siren, N. Välimäki

Unfolding the Structure of a Document using Deep Learning

Understanding and extracting of information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Such large documents may be…

Computation and Language · Computer Science 2019-10-10 Muhammad Mahbubur Rahman, Tim Finin

Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization

In an era where digital text is proliferating at an unprecedented rate, efficient summarization tools are becoming indispensable. While Large Language Models (LLMs) have been successfully applied in various NLP tasks, their role in…

Computation and Language · Computer Science 2024-08-29 Léo Hemamou, Mehdi Debiane

On Generating Extended Summaries of Long Documents

Prior work in document summarization has mainly focused on generating short summaries of a document. While this type of summary helps get a high-level view of a given document, it is desirable in some cases to know more detailed information…

Computation and Language · Computer Science 2020-12-29 Sajad Sotudeh, Arman Cohan, Nazli Goharian

Secure Querying of Recursive XML Views: A Standard XPath-based Technique

Most state-of-the-art approaches for securing XML documents allow users to access data only through authorized views defined by annotating an XML grammar (e.g. DTD) with a collection of XPath expressions. To prevent improper disclosure of…

Cryptography and Security · Computer Science 2011-12-13 Houari Mahfoud, Abdessamad Imine

StructSum: Summarization via Structured Representations

Abstractive text summarization aims at compressing the information of a long source document into a rephrased, condensed summary. Despite advances in modeling techniques, abstractive summarization models still suffer from several key…

Computation and Language · Computer Science 2021-02-17 Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov