Related papers: Content-Based Table Retrieval for Web Queries

Semantic Table Retrieval using Keyword and Table Queries

Tables on the Web contain a vast amount of knowledge in a structured form. To tap into this valuable resource, we address the problem of table retrieval: answering an information need with a ranked list of tables. We investigate this…

Information Retrieval · Computer Science 2021-05-14 Shuo Zhang, Krisztian Balog

Scientific Table Search Using Keyword Queries

Tables are common and important in scientific documents, yet most text-based document search systems do not capture structures and semantics specific to tables. How to bridge different types of mismatch between keyword queries and…

Information Retrieval · Computer Science 2017-07-13 Kyle Yingkai Gao, Jamie Callan

Ad Hoc Table Retrieval using Semantic Similarity

We introduce and address the problem of ad hoc table retrieval: answering a keyword query with a ranked list of tables. This task is not only interesting on its own account, but is also being used as a core component in many other…

Information Retrieval · Computer Science 2018-03-09 Shuo Zhang, Krisztian Balog

Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval

Retrieving relevant tables containing the necessary information to accurately answer a given question over tables is critical to open-domain question-answering (QA) systems. Previous methods assume the answer to such a question can be found…

Information Retrieval · Computer Science 2025-01-13 Peter Baile Chen, Yi Zhang, Dan Roth

Answering Table Queries on the Web using Column Keywords

We present the design of a structured search engine which returns a multi-column table in response to a query consisting of keywords describing each of its columns. We answer such queries by exploiting the millions of tables on the Web…

Databases · Computer Science 2017-07-07 Rakesh Pimplikar, Sunita Sarawagi

NLCTables: A Dataset for Marrying Natural Language Conditions with Table Discovery

With the growing abundance of repositories containing tabular data, discovering relevant tables for in-depth analysis remains a challenging task. Existing table discovery methods primarily retrieve desired tables based on a query table or…

Information Retrieval · Computer Science 2025-04-23 Lingxi Cui, Huan Li, Ke Chen, Lidan Shou, Gang Chen

A Social Search Model for Large Scale Social Networks

With the rise of social networks, information on the internet is no longer solely organized by web pages. Rather, content is generated and shared among users and organized around their social relations on social networks. This presents new…

Information Retrieval · Computer Science 2020-05-12 Yunzhong He, Wenyuan Li, Liang-Wei Chen, Gabriel Forgues, Xunlong Gui, Sui Liang, Bo Hou

Identifying Web Tables - Supporting a Neglected Type of Content on the Web

The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a…

Information Retrieval · Computer Science 2016-08-08 Mikhail Galkin, Dmitry Mouromtsev, Sören Auer

TARGET: Benchmarking Table Retrieval for Generative Tasks

The data landscape is rich with structured data, often of high value to organizations, driving important applications in data analysis and machine learning. Recent progress in representation learning and generative models for such data has…

Information Retrieval · Computer Science 2025-05-20 Xingyu Ji, Parker Glenn, Aditya G. Parameswaran, Madelon Hulsebos

A Graph Representation of Semi-structured Data for Web Question Answering

The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA). Different from plain text passages in Web documents, Web tables and…

Computation and Language · Computer Science 2020-10-15 Xingyao Zhang, Linjun Shou, Jian Pei, Ming Gong, Lijie Wen, Daxin Jiang

An Annotated Corpus of Webtables for Information Extraction Tasks

Information Extraction is a well-researched area of Natural Language Processing with applications in web search and question answering concerned with identifying entities and relationships between them as expressed in a given context,…

Information Retrieval · Computer Science 2020-11-17 Erin Macdonald, Denilson Barbosa

ModelTables: A Corpus of Tables about Models

We present ModelTables, a benchmark of tables in Model Lakes that captures the structured semantics of performance and configuration tables often overlooked by text only retrieval. The corpus is built from Hugging Face model cards, GitHub…

Databases · Computer Science 2025-12-19 Zhengyuan Dong, Victor Zhong, Renée J. Miller

PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries

The rapid growth of tabular datasets in data lakes, data spaces, and open data portals makes effective dataset search essential for reuse and analysis. Existing search systems rely mainly on metadata, which is often incomplete or low…

Information Retrieval · Computer Science 2026-05-19 Riccardo Terrenzi, Matteo Falconi, Serkan Ayvaz, Pierluigi Plebani

Latent table discovery by semantic relationship extraction between unrelated sets of entity sets of structured data sources

Querying is one of the basic functionalities expected from a database system. Query efficiency is adversely affected by an increase in the number of participating tables. Also, querying based on syntax largely limits the gamut of queries a…

Databases · Computer Science 2011-04-08 Gowri Shankar Ramaswamy, F Sagayaraj Francis

Representations for Question Answering from Documents with Tables and Text

Tables in Web documents are pervasive and can be directly used to answer many of the queries searched on the Web, motivating their integration in question answering. Very often information presented in tables is succinct and hard to…

Computation and Language · Computer Science 2021-01-27 Vicky Zayats, Kristina Toutanova, Mari Ostendorf

Text Assisted Insight Ranking Using Context-Aware Memory Network

Extracting valuable facts or informative summaries from multi-dimensional tables, i.e. insight mining, is an important task in data analysis and business intelligence. However, ranking the importance of insights remains a challenging and…

Computation and Language · Computer Science 2018-11-15 Qi Zeng, Liangchen Luo, Wenhao Huang, Yang Tang

Recommending Related Tables

Tables are an extremely powerful visual and interactive tool for structuring and manipulating data, making spreadsheet programs one of the most popular computer applications. In this paper we introduce and address the task of recommending…

Information Retrieval · Computer Science 2019-07-26 Shuo Zhang, Krisztian Balog

TableBank: A Benchmark Dataset for Table Detection and Recognition

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet. Existing research for image-based table detection and recognition usually…

Computer Vision and Pattern Recognition · Computer Science 2020-07-07 Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

A semantic association page rank algorithm for web search engines

The majority of Semantic Web search engines retrieve information by focusing on the use of concepts and relations restricted to the query provided by the user. By trying to guess the implicit meaning between these concepts and relations,…

Information Retrieval · Computer Science 2012-11-28 Manuel Rojas

Open Domain Question Answering over Tables via Dense Retrieval

Recent advances in open-domain QA have led to strong models based on dense retrieval, but only focused on retrieving textual passages. In this work, we tackle open-domain QA over tables for the first time, and show that retrieval can be…

Computation and Language · Computer Science 2021-06-10 Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos