Effective Contrastive Weighting for Dense Query Expansion

Xiao Wang, Sean MacAvaney, Craig Macdonald, Iadh Ounis

Main Conference: Information Retrieval and Text Mining (Poster Paper)

Poster Session 7: Information Retrieval and Text Mining (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 12, 11:00-12:30 EDT (America/Toronto)
Global Time: July 12, Poster Session 7 (15:00-16:30 UTC)
Keywords: passage retrieval, dense retrieval
Abstract: Verbatim queries submitted to search engines often do not sufficiently describe the user's search intent. Pseudo-relevance feedback (PRF) techniques, which modify a query's representation using the top-ranked documents, have been shown to overcome such inadequacies and improve retrieval effectiveness for both lexical methods (e.g., BM25) and dense methods (e.g., ANCE, ColBERT). For instance, the recent ColBERT-PRF approach heuristically chooses new embeddings to add to the query representation using the inverse document frequency (IDF) of the underlying tokens. However, this heuristic potentially ignores the valuable context encoded by the embeddings. In this work, we present a contrastive solution that learns to select the most useful embeddings for expansion. More specifically, a deep language model-based contrastive weighting model, called CWPRF, is trained to discriminate between relevant and non-relevant documents for semantic search. Our experimental results show that our contrastive weighting model helps select useful expansion embeddings and outperforms various baselines. In particular, CWPRF improves nDCG@10 by up to 4.1% over an existing PRF approach for ColBERT while maintaining its efficiency.
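To make the two selection strategies concrete, the sketch below contrasts the IDF heuristic used by ColBERT-PRF with a learned contrastive weighting in the spirit of CWPRF. This is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the function names, tensor shapes, and the pairwise softmax cross-entropy objective are assumptions; only the MaxSim scoring operator is standard ColBERT.

import torch
import torch.nn.functional as F

def select_by_idf(feedback_embs, idf, k=10):
    # ColBERT-PRF-style heuristic (sketch): keep the k candidate feedback
    # embeddings whose underlying tokens have the highest IDF.
    top = torch.topk(idf, k).indices
    return feedback_embs[top]

def contrastive_weighting_loss(weights, q_embs, exp_embs, pos_doc, neg_doc):
    # Hypothetical CWPRF-style objective (sketch): learn a weight for each
    # expansion embedding so that the expanded query scores a relevant
    # passage above a non-relevant one.
    def maxsim(query, doc):
        # ColBERT scoring: each query embedding takes its maximum
        # similarity over the document embeddings; the maxima are summed.
        return (query @ doc.T).max(dim=1).values.sum()
    # Append the expansion embeddings, scaled by their learned weights.
    expanded = torch.cat([q_embs, weights.unsqueeze(1) * exp_embs], dim=0)
    scores = torch.stack([maxsim(expanded, pos_doc), maxsim(expanded, neg_doc)])
    # Pairwise softmax cross-entropy: the relevant passage is class 0.
    return F.cross_entropy(scores.unsqueeze(0), torch.tensor([0]))

Because the weights enter the score differentiably, gradients from relevant/non-relevant passage pairs flow directly into the per-embedding weights, which is what lets a learned scheme use the context an IDF lookup ignores.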