Boosting Search Performance Using Query Variations

Information Retrieval 2020-11-11 v2

Abstract

Rank fusion is a powerful technique that allows multiple sources of information to be combined into a single result set. However, to date fusion has not been regarded as cost-effective in cases where strict per-query efficiency guarantees are required, such as in web search. In this work we propose a novel solution to rank fusion that splits the computation into two parts: an offline phase that generates pre-computed centroid answers for queries with broadly similar information needs, and an online phase that uses the corresponding topic centroid to compute a result page for each query. We explore efficiency improvements to classic fusion algorithms whose costs can be amortized as a pre-processing step, and which can then be combined with re-ranking approaches to dramatically improve effectiveness in multi-stage retrieval systems with little efficiency overhead at query time. Experimental results using the ClueWeb12B collection and the UQV100 query variations demonstrate that centroid-based approaches improve retrieval effectiveness at little or no loss in query throughput or latency, and with reasonable pre-processing requirements. We additionally show that queries that do not match any of the pre-computed clusters can be accurately identified and efficiently processed in our proposed ranking pipeline.
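The two-phase idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes reciprocal rank fusion (RRF) as the fusion method, an exact-match dictionary lookup as the query-to-cluster step, and a hypothetical `retrieve` function standing in for a first-stage ranker. The names `rrf_fuse`, `build_centroids`, `answer`, the toy queries, and the document ids are all made up for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def build_centroids(clusters, retrieve, depth=10):
    """Offline phase: fuse the runs of all query variations in each cluster."""
    centroids = {}
    for topic, variations in clusters.items():
        runs = [retrieve(q)[:depth] for q in variations]
        centroids[topic] = rrf_fuse(runs)[:depth]
    return centroids


def answer(query, query_to_topic, centroids, retrieve, depth=10):
    """Online phase: serve the cluster centroid, or fall back to plain retrieval."""
    topic = query_to_topic.get(query)  # assumption: exact-match lookup
    if topic is None:
        return retrieve(query)[:depth]
    return centroids[topic][:depth]


# Toy data (hypothetical): three variations of one information need.
toy_runs = {
    "raspberry pi uses": ["d1", "d2", "d3"],
    "what can i do with a raspberry pi": ["d2", "d1", "d4"],
    "raspberry pi projects": ["d2", "d4", "d1"],
}
retrieve = lambda q: toy_runs.get(q, [])
clusters = {"raspberry_pi": list(toy_runs)}
query_to_topic = {q: "raspberry_pi" for q in toy_runs}

centroids = build_centroids(clusters, retrieve)
print(answer("raspberry pi projects", query_to_topic, centroids, retrieve))
# prints ['d2', 'd1', 'd4', 'd3']
```

In this sketch the fusion cost is paid once per cluster offline, so the online path is a lookup; a query that matches no cluster simply takes the ordinary retrieval path. A fuller pipeline would presumably re-rank the centroid for the specific query rather than return it unchanged.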

Cite

@article{arxiv.1811.06147,
  title  = {Boosting Search Performance Using Query Variations},
  author = {Rodger Benham and Joel Mackenzie and Alistair Moffat and J. Shane Culpepper},
  journal= {arXiv preprint arXiv:1811.06147},
  year   = {2020}
}

Comments

Published in ACM TOIS, 2019