Sakura at SemEval-2023 Task 2: Data Augmentation via Translation

Alberto Poncelas; Maksim Tkachenko; Ohnmar Htun

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 2: MultiCoNER II Multilingual Complex Named Entity Recognition Paper

TLDR: We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using translations. The named entity spans from the original sentences are transferred to translations via word alignment and then filtered with the baseline recognizer.

Abstract: We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using translations. The named entity spans from the original sentences are transferred to translations via word alignment and then filtered with the baseline recognizer. The proposed approach outperforms the baseline XLM-RoBERTa on the multilingual dataset.
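
The two steps named in the abstract, span transfer via word alignment and filtering with the baseline recognizer, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`project_spans`, `spans_to_bio`, `augment`), the representation of an alignment as (source index, target index) pairs, the toy `LOC` label, and in particular the filtering rule, which keeps a translated sentence only when a hypothetical `baseline_tagger` callable predicts exactly the projected tags. The abstract does not state the actual filtering criterion.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int, str]        # (start, end_exclusive, label) over token indices
Alignment = List[Tuple[int, int]]  # (source_token_index, target_token_index) pairs


def project_spans(spans: List[Span], alignment: Alignment) -> List[Optional[Span]]:
    """Map each source span to the smallest contiguous target span that
    covers all target tokens aligned to it; None if nothing is aligned."""
    projected: List[Optional[Span]] = []
    for start, end, label in spans:
        targets = [t for s, t in alignment if start <= s < end]
        if targets:
            projected.append((min(targets), max(targets) + 1, label))
        else:
            projected.append(None)
    return projected


def spans_to_bio(spans: List[Span], length: int) -> List[str]:
    """Convert non-overlapping spans to a BIO tag sequence of the given length."""
    tags = ["O"] * length
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def augment(
    source_spans: List[Span],
    alignment: Alignment,
    target_tokens: List[str],
    baseline_tagger: Callable[[List[str]], List[str]],
) -> Optional[List[str]]:
    """Return projected BIO tags for the translation, or None if it is rejected.

    Assumed filter: reject when a span has no aligned tokens, when projected
    spans overlap, or when the baseline tagger disagrees with the projection.
    """
    projected = project_spans(source_spans, alignment)
    if any(span is None for span in projected):
        return None
    ordered = sorted(span for span in projected if span is not None)
    if any(a[1] > b[0] for a, b in zip(ordered, ordered[1:])):
        return None
    tags = spans_to_bio(ordered, len(target_tokens))
    if baseline_tagger(target_tokens) != tags:
        return None
    return tags


# Toy example: "Paris is the capital of France" translated into Spanish.
source_spans = [(0, 1, "LOC"), (5, 6, "LOC")]
target_tokens = ["París", "es", "la", "capital", "de", "Francia"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]


def toy_tagger(tokens: List[str]) -> List[str]:
    # Stand-in for a trained baseline recognizer (hypothetical).
    return ["B-LOC" if tok in {"París", "Francia"} else "O" for tok in tokens]


augmented_tags = augment(source_spans, alignment, target_tokens, toy_tagger)
```

In this sketch the agreement check is deliberately strict. A softer rule, such as thresholding the baseline's confidence on the projected spans, would retain more translated sentences at the cost of noisier labels.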