Related papers: Infer XPath

ExPath: Targeted Pathway Inference for Biological Knowledge Bases via Graph Learning and Explanation

Retrieving targeted pathways in biological knowledge bases, particularly when incorporating wet-lab experimental data, remains a challenging task and often requires downstream analyses and specialized expertise. In this paper, we frame this…

Machine Learning · Computer Science 2026-04-14 Rikuto Kotoge, Ziwei Yang, Zheng Chen, Yushun Dong, Yasuko Matsubara, Jimeng Sun, Yasushi Sakurai

Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model

Information visualizations such as bar charts and line charts are very popular for exploring data and communicating insights. Interpreting and making sense of such visualizations can be challenging for some people, such as those who are…

Computation and Language · Computer Science 2020-12-01 Jason Obeid, Enamul Hoque

Identifying Web Tables - Supporting a Neglected Type of Content on the Web

The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a…

Information Retrieval · Computer Science 2016-08-08 Mikhail Galkin, Dmitry Mouromtsev, Sören Auer

FoundWright: A System to Help People Re-find Pages from Their Web-history

Re-finding information is an essential activity; however, it can be difficult when people struggle to express what they are looking for. Through a need-finding survey, we first seek opportunities for improving re-finding experiences, and…

Human-Computer Interaction · Computer Science 2023-05-16 Haekyu Park, Gonzalo Ramos, Jina Suh, Christopher Meek, Rachel Ng, Mary Czerwinski

Multi-Field Adaptive Retrieval

Document retrieval for tasks such as search and retrieval-augmented generation typically involves datasets that are unstructured: free-form text without explicit internal structure in each document. However, documents can have a structured…

Information Retrieval · Computer Science 2025-04-18 Millicent Li, Tongfei Chen, Benjamin Van Durme, Patrick Xia

Dynamic Discovery of Type Classes and Relations in Semantic Web Data

The continuing development of Semantic Web technologies and increasing user adoption in recent years have accelerated progress in incorporating explicit semantics with data on the Web. With the rapidly growing RDF (Resource…

Databases · Computer Science 2019-03-12 Serkan Ayvaz, Mehmet Aydar

End-to-End Goal-Driven Web Navigation

We propose goal-driven web navigation as a benchmark task for evaluating an agent's ability to understand natural language and plan in partially observed environments. In this challenging task, an agent navigates through a website,…

Artificial Intelligence · Computer Science 2016-05-23 Rodrigo Nogueira, Kyunghyun Cho

Simple and Effective Relation-Based Approaches To XPath and XSLT Type Checking (Technical Report, Bad Honnef 2015)

XPath is a language for addressing parts of an XML document. We give an abstract interpretation of XPath expressions in terms of relations on document node types. Node-set-related XPath language constructs are mapped straightforwardly onto…

Programming Languages · Computer Science 2019-05-20 Baltasar Trancón y Widemann, Markus Lepper

Schemaless Queries over Document Tables with Dependencies

Unstructured enterprise data such as reports, manuals and guidelines often contain tables. The traditional way of integrating data from these tables is through a two-step process of table detection/extraction and mapping the table layouts…

Databases · Computer Science 2019-11-22 Mustafa Canim, Cristina Cornelio, Arun Iyengar, Ryan Musa, Mariano Rodrigez Muro

WebMap -- Large Language Model-assisted Semantic Link Induction in the Web

Carrying out research tasks is only inadequately supported, if not hindered, by current web search engines. This paper therefore proposes functional extensions of WebMap, a semantically induced overlay linking structure on the web to…

Information Retrieval · Computer Science 2025-04-15 Shiraj Pokharel, Georg P. Roßrucker, Mario M. Kubek

XTreePath: A generalization of XPath to handle real world structural variation

We discuss a key problem in information extraction which deals with wrapper failures due to changing content templates. A good proportion of wrapper failures are due to HTML templates changing to cause wrappers to become incompatible after…

Information Retrieval · Computer Science 2017-12-29 Joseph Paul Cohen, Wei Ding, Abraham Bagherjeiran

Towards Semantically Enhanced Data Understanding

In the field of machine learning, data understanding is the practice of getting initial insights in unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning…

Databases · Computer Science 2018-06-14 Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel

Automatic Text Document Summarization using Semantic-based Analysis

Since the advent of the web, the amount of data on the web has increased several million-fold. In recent years, the web data generated exceeds the data stored for years. One important data format is text. To answer user queries over the…

Information Retrieval · Computer Science 2018-11-19 Chandra Shekhar Yadav

Reagent: Converting Ordinary Webpages into Interactive Software Agents

We introduce Reagent, a technology that readily converts ordinary webpages containing structured data into software agents with which one can interact naturally, via a combination of speech and pointing. Previous efforts to make webpage…

Human-Computer Interaction · Computer Science 2018-10-30 Mathew Peveler, Jeffery Kephart, Hui Su

CHARTER: heatmap-based multi-type chart data extraction

The digital conversion of information stored in documents is a great source of knowledge. In contrast to the documents' text, the conversion of the documents' embedded graphics, such as charts and plots, has been much less explored. We…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf Arbelle, Leonid Karlinsky

StruBERT: Structure-aware BERT for Table Search and Matching

A large amount of information is stored in data tables. Users can search for data tables using a keyword-based query. A table is composed primarily of data values that are organized in rows and columns providing implicit structural…

Information Retrieval · Computer Science 2022-03-29 Mohamed Trabelsi, Zhiyu Chen, Shuo Zhang, Brian D. Davison, Jeff Heflin

Conceptual Analysis of Hypertext

In this chapter tools and techniques from the mathematical theory of formal concept analysis are applied to hypertext systems in general, and the World Wide Web in particular. Various processes for the conceptual structuring of hypertext…

Artificial Intelligence · Computer Science 2018-10-18 Robert E. Kent, Christian Neuss

A Graph Representation of Semi-structured Data for Web Question Answering

The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA). Different from plain text passages in Web documents, Web tables and…

Computation and Language · Computer Science 2020-10-15 Xingyao Zhang, Linjun Shou, Jian Pei, Ming Gong, Lijie Wen, Daxin Jiang

Fast In-Memory XPath Search over Compressed Text and Tree Indexes

A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed…

Databases · Computer Science 2011-10-06 A. Arroyuelo, F. Claude, S. Maneth, V. Mäkinen, G. Navarro, K. Nguyen, J. Siren, N. Välimäki

Towards a Natural Language Query Processing System

Tackling the information retrieval gap between non-technical database end-users and those with the knowledge of formal query languages has been an interesting area of data management and analytics research. The use of natural language…

Information Retrieval · Computer Science 2020-09-29 Chantal Montgomery, Haruna Isah, Farhana Zulkernine