CORE: Cooperative Training of Retriever-Reranker for Effective Dialogue Response Selection

Chongyang Tao; Jiazhan Feng; Tao Shen; Chang Liu; Juntao Li; Xiubo Geng; Daxin Jiang

Main: Dialogue and Interactive Systems Main-poster Paper

Session 1: Dialogue and Interactive Systems (Virtual Poster)

Conference Room: Pier 7&8

Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 10, Session 1 (15:00-16:30 UTC)

Keywords: retrieval

Abstract: Retrieval-based dialogue systems that select appropriate responses from a pre-built index have gained increasing attention. A common recent practice is to construct a two-stage pipeline: a fast retriever (e.g., a bi-encoder) for first-stage recall, followed by a smart response reranker (e.g., a cross-encoder) for precise ranking. However, existing studies either optimize the retriever and the reranker independently or distill knowledge from a pre-trained reranker into the retriever asynchronously, which leads to sub-optimal performance of both modules. How to train the two modules so that they combine the strengths of both remains an open question. To this end, we present a cooperative training scheme for the response retriever and the reranker, in which the parameters of each module are dynamically optimized by the ground-truth labels as well as by list-wise supervision signals from the other module. As a result, the two modules learn from each other and evolve together throughout training. Experimental results on two benchmarks demonstrate the superiority of our method.
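The abstract does not spell out the training objective, so the following is only a minimal sketch of what mutual list-wise supervision could look like for a single dialogue context. It assumes that each module scores the same candidate list, that the ground-truth signal is a softmax cross-entropy over that list, and that the list-wise signal is a KL divergence toward the other module's score distribution. The function names, the `alpha` weight, and the choice of KL are all illustrative assumptions, not the authors' formulation.

```python
import math


def softmax(scores):
    """Turn a list of candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, gold_index):
    """Ground-truth term: negative log-probability of the gold response."""
    return -math.log(softmax(scores)[gold_index])


def kl_divergence(teacher_scores, student_scores):
    """List-wise term: KL(teacher || student) over the candidate list."""
    p = softmax(teacher_scores)
    q = softmax(student_scores)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cooperative_losses(retriever_scores, reranker_scores, gold_index, alpha=1.0):
    """Return (retriever_loss, reranker_loss) for one candidate list.

    Each module is supervised by the gold label and by the other module's
    distribution. `alpha` is a hypothetical weight on the list-wise term.
    In a gradient-based framework the teacher scores in each KL term would
    normally be treated as constants (stop-gradient); that is an assumption
    of this sketch, not something stated in the abstract.
    """
    retriever_loss = cross_entropy(retriever_scores, gold_index) \
        + alpha * kl_divergence(reranker_scores, retriever_scores)
    reranker_loss = cross_entropy(reranker_scores, gold_index) \
        + alpha * kl_divergence(retriever_scores, reranker_scores)
    return retriever_loss, reranker_loss


# Toy usage: three candidate responses, the first one is the gold response.
bi_encoder_scores = [2.0, 1.0, 0.0]     # hypothetical retriever scores
cross_encoder_scores = [0.5, 3.0, 0.0]  # hypothetical reranker scores
losses = cooperative_losses(bi_encoder_scores, cross_encoder_scores, gold_index=0)
```

When the two modules already agree on the candidate distribution, the KL terms vanish and each loss reduces to its ground-truth cross-entropy, so the list-wise signal only contributes where the modules disagree.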