Sakura at SemEval-2023 Task 2: Data Augmentation via Translation
Alberto Poncelas, Maksim Tkachenko, Ohnmar Htun
The 17th International Workshop on Semantic Evaluation (SemEval-2023), Task 2: MultiCoNER II (Multilingual Complex Named Entity Recognition), Paper
TLDR:
We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using translations. The named entity spans from the original sentences are transferred to translations via word alignment and then filtered with the baseline recognizer.
Abstract:
We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using translations. The named entity spans from the original sentences are transferred to their translations via word alignment and then filtered with the baseline recognizer. The proposed approach outperforms the XLM-RoBERTa baseline on the multilingual dataset.
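The projection-and-filter step described in the abstract can be illustrated with a minimal sketch. It assumes BIO-tagged source tokens, a word alignment given as (source index, target index) pairs from some external aligner, and a filter that keeps a translated sentence only when the projected spans exactly match the baseline recognizer's predictions. The exact-match criterion and all function names are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a BIO tag sequence into (start, end, type) spans."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def project_spans(
    src_spans: List[Span], alignment: List[Tuple[int, int]]
) -> Optional[List[Span]]:
    """Map source spans onto target tokens through the word alignment.

    Each projected span covers the smallest contiguous target range that
    contains every target token aligned to the source span. Returns None
    when some entity has no aligned target token (an assumed policy).
    """
    projected: List[Span] = []
    for start, end, etype in src_spans:
        targets = [t for s, t in alignment if start <= s < end]
        if not targets:
            return None
        projected.append((min(targets), max(targets) + 1, etype))
    return sorted(projected)


def agrees_with_baseline(
    projected: Optional[List[Span]], predicted: List[Span]
) -> bool:
    """Assumed filter: keep the sentence only on exact span agreement."""
    return projected is not None and set(projected) == set(predicted)


# "the White House" -> "la Casa Blanca": the alignment swaps the last two words.
src_spans = bio_to_spans(["O", "B-LOC", "I-LOC"])
alignment = [(0, 0), (1, 2), (2, 1)]
projected = project_spans(src_spans, alignment)
baseline_prediction = [(1, 3, "LOC")]  # hypothetical baseline output
keep = agrees_with_baseline(projected, baseline_prediction)
```

In this example the reordered alignment still yields a contiguous target span over "Casa Blanca", and the sentence is kept because the hypothetical baseline prediction agrees with the projection.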