Predicting Numerals in Text Using Nearest Neighbor Language Models
Taku Sakamoto, Akiko Aizawa
Findings Paper: Semantics: Sentence-level Semantics, Textual Inference, and Other Areas
Session 1: Semantics: Sentence-level Semantics, Textual Inference, and Other Areas (Virtual Poster)
Conference Room: Pier 7&8
Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 10, Session 1 (15:00-16:30 UTC)
Keywords:
reasoning
TLDR:
Commonsense about quantitative properties is essential for a deep understanding of texts containing numerals.
However, naive language models (LMs) treat numerals as string tokens and therefore lack an understanding of their magnitudes, which makes it difficult for them to acquire this commonsense.
Abstract:
Commonsense about quantitative properties is essential for a deep understanding of texts containing numerals.
However, naive language models (LMs) treat numerals as string tokens and therefore lack an understanding of their magnitudes, which makes it difficult for them to acquire this commonsense.
In this study, we apply the $k$-nearest neighbor LM ($k$NN-LM) to the masked numeral prediction (MNP) task, which measures the quantitative commonsense of LMs.
$k$NN-LM extends pre-trained neural LMs with the $k$-nearest neighbor ($k$NN) search.
Since it can exploit patterns that appear in the datastore at prediction time, we expect it to improve numeral prediction accuracy, a task characterized by a high rate of out-of-vocabulary (OOV) words.
Through experiments, we verified that the retrieval-based method is effective for fine-grained prediction of numerals from context, especially for OOV numerals.
We also compared two context spans for the context representations, aiming to improve the accuracy of $k$NN search by using only words closely related to the masked numeral: the mask with its surrounding words, and the mask with its subsequent words.
Our results reveal that using only the embeddings of the mask tokens for the numerals as $k$NN search keys is the most effective approach for the MNP task.
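To make the abstract's core mechanism concrete, the sketch below illustrates, under our own simplifying assumptions, the standard $k$NN-LM interpolation applied to masked numeral prediction: the masked LM's distribution over numeral tokens is mixed with a distribution built from the $k$ nearest datastore entries, where each datastore key is the [MASK] embedding at a numeral slot. The function name, signature, and demo data are hypothetical illustrations, not the authors' released code.

```python
# Minimal sketch of kNN-LM interpolation for masked numeral prediction (MNP),
# assuming: a BERT-style masked LM, a datastore of ([MASK] embedding, numeral
# token id) pairs collected from training data, and brute-force L2 search
# (a real system would use an approximate-nearest-neighbor index).
import numpy as np

def knn_numeral_distribution(query_key, p_lm, datastore_keys, datastore_values,
                             vocab_size, k=8, temperature=1.0, lam=0.5):
    """Interpolate the masked LM's distribution with a kNN distribution.

    query_key:        (d,) [MASK] embedding at the numeral slot to predict
    p_lm:             (vocab_size,) softmax distribution from the masked LM
    datastore_keys:   (N, d) [MASK] embeddings stored from the training set
    datastore_values: (N,) token ids of the gold numerals at those slots
    """
    # Squared L2 distance from the query to every stored key.
    dists = np.sum((datastore_keys - query_key) ** 2, axis=1)

    # Retrieve the k nearest neighbors.
    nn_idx = np.argsort(dists)[:k]

    # Softmax over negative distances gives each neighbor a weight.
    logits = -dists[nn_idx] / temperature
    logits -= logits.max()                      # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()

    # Scatter neighbor weights onto their numeral tokens. Rare numerals that
    # the parametric LM underweights can still receive probability mass here.
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, datastore_values[nn_idx], weights)

    # Standard kNN-LM interpolation of the two distributions.
    return lam * p_knn + (1.0 - lam) * p_lm

if __name__ == "__main__":
    # Toy demo with random data: a query near a stored key should recover
    # the numeral token stored for that key.
    rng = np.random.default_rng(0)
    vocab_size, dim, n_entries = 100, 16, 1000
    keys = rng.normal(size=(n_entries, dim))
    values = rng.integers(0, vocab_size, size=n_entries)
    query = keys[42] + 0.01 * rng.normal(size=dim)
    p_lm = np.full(vocab_size, 1.0 / vocab_size)   # uninformative parametric LM
    p = knn_numeral_distribution(query, p_lm, keys, values, vocab_size)
    print(int(p.argmax()) == int(values[42]))      # likely True
```

Using only the [MASK] embedding as the query key mirrors the paper's finding that the mask token's representation alone is the most effective search key; a variant could instead pool embeddings over the mask and its surrounding or subsequent words.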