Towards Better Entity Linking with Multi-View Enhanced Distillation

Yi Liu; Yuan Tian; Jianxun Lian; Xinlong Wang; Yanan Cao; Fang Fang; Wen Zhang; Haizhen Huang; Weiwei Deng; Qi Zhang

Main: Information Extraction Main-poster Paper

Session 1: Information Extraction (Virtual Poster)

Conference Room: Pier 7&8

Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 10, Session 1 (15:00-16:30 UTC)

Keywords: entity linking/disambiguation

Abstract: Dense retrieval is widely used in entity linking to retrieve entities from large-scale knowledge bases. Mainstream techniques are based on a dual-encoder framework, which encodes mentions and entities independently and computes their relevance with coarse interaction metrics. This makes it difficult to explicitly model the multiple mention-relevant parts within an entity that are needed to match divergent mentions. To learn entity representations that can match divergent mentions, this paper proposes a Multi-View Enhanced Distillation (MVD) framework, which transfers knowledge of multiple fine-grained, mention-relevant parts within entities from cross-encoders to dual-encoders. Each entity is split into multiple views so that irrelevant information is not over-squashed into the mention-relevant view. We further design cross-alignment and self-alignment mechanisms to facilitate fine-grained knowledge distillation from the teacher model to the student model. In addition, we retain a global view that embeds the entity as a whole to prevent dispersal of uniform information. Experiments show that our method achieves state-of-the-art performance on several entity linking benchmarks.
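
The abstract does not say how views are built, how a mention is scored against them, or which distillation loss is used. The sketch below is therefore only one plausible reading, not the authors' implementation. It assumes fixed-length token chunks as views, a dot product as the dual-encoder score, a max over local views plus the global view as the entity score, and a KL divergence between teacher and student score distributions over views. All function names (`split_into_views`, `multi_view_score`, `distillation_loss`) are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_into_views(tokens, view_len):
    """Assumption: a view is a fixed-length chunk of the entity description."""
    return [tokens[i:i + view_len] for i in range(0, len(tokens), view_len)]


def multi_view_score(mention_vec, view_vecs, global_vec):
    """Assumption: the entity score is the best match over all local views
    and the global view, each compared to the mention by dot product."""
    return max(dot(mention_vec, v) for v in view_vecs + [global_vec])


def distillation_loss(teacher_scores, student_scores):
    """Assumption: KL(teacher || student) over per-view score distributions,
    with the cross-encoder as teacher and the dual-encoder as student."""
    p = softmax(teacher_scores)
    q = softmax(student_scores)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Toy usage with hand-written 2-d embeddings (illustrative values only).
mention = [1.0, 0.0]
views = [[0.2, 0.9], [0.8, 0.1]]
global_view = [0.5, 0.5]
score = multi_view_score(mention, views, global_view)
loss = distillation_loss([2.0, 0.5], [1.0, 1.0])
```

In this toy setup, taking the max over views lets the single view most relevant to the mention determine the score, while the global view acts as a fallback when no local view matches well.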