Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation

Xiaohang Tang; Yi Zhou; Danushka Bollegala

Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation

Xiaohang Tang, Yi Zhou, Danushka Bollegala

📝 Paper

Anthology

Underline 🪧 Poster 📺 Watch Video on Underline Add to Favorites

Main: Information Extraction Main-oral Paper

Session 5: Information Extraction (Oral)

Conference Room: Metropolitan Centre

Conference Time: July 11, 16:15-17:45 (EDT) (America/Toronto)

Global Time: July 11, Session 5 (20:15-21:45 UTC)

Keywords: lexical semantic change

TLDR: Dynamic contextualised word embeddings (DCWEs) represent the temporal semantic variations of words. We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. Given two snapshots $C_1$ and $C_2$ of a corpus taken respectively at t...

You can open the #paper-P2130 channel in a separate window.

Abstract: Dynamic contextualised word embeddings (DCWEs) represent the temporal semantic variations of words. We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. Given two snapshots $C_1$ and $C_2$ of a corpus taken respectively at two distinct timestamps $T_1$ and $T_2$, we first propose an unsupervised method to select (a) \emph{pivot} terms related to both $C_1$ and $C_2$, and (b) \emph{anchor} terms that are associated with a specific pivot term in each individual snapshot. We then generate prompts by filling manually compiled templates using the extracted pivot and anchor terms. Moreover, we propose an automatic method to learn time-sensitive templates from $C_1$ and $C_2$, without requiring any human supervision. Next, we use the generated prompts to adapt a pretrained MLM to $T_2$ by fine-tuning using those prompts. Multiple experiments show that our proposed method significantly reduces the perplexity of test sentences in $C_2$, outperforming the current state-of-the-art.