Towards automatically extracting morphosyntactical error patterns from L1-L2 parallel dependency treebanks
Arianna Masciolini, Elena Volodina, Dana Dannélls
18th Workshop on Innovative Use of NLP for Building Educational Applications Paper
Abstract:
L1-L2 parallel dependency treebanks are UD-annotated corpora of learner sentences paired with correction hypotheses. Automatic morphosyntactical annotation has the potential to remove the need for explicit manual error tagging and improve interoperability, but makes it more challenging to locate grammatical errors in the resulting datasets. We therefore propose a novel method for automatically extracting morphosyntactical error patterns and perform a preliminary bilingual evaluation of its first implementation through a similar example retrieval task. The resulting pipeline is also available as a prototype CALL application.
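To make the data format concrete, the following is a minimal sketch of what one L1-L2 sentence pair might look like in CoNLL-U and how a morphosyntactic discrepancy could be surfaced by comparing annotations. It is not the extraction method proposed in the paper, which the abstract does not detail. The example sentences, their annotations, the helper names `parse_conllu` and `diff_tokens`, and the one-to-one positional token alignment are all assumptions made for illustration.

```python
# Hypothetical illustration only: NOT the paper's error pattern extraction method.
# Assumes the learner (L2) sentence and its correction (L1) have the same number
# of tokens, so tokens can be aligned by position.

def parse_conllu(block):
    """Parse one CoNLL-U sentence into a list of token dicts."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword token ranges and empty nodes
            continue
        feats = {} if cols[5] == "_" else dict(f.split("=", 1) for f in cols[5].split("|"))
        tokens.append({"form": cols[1], "lemma": cols[2], "upos": cols[3],
                       "feats": feats, "head": int(cols[6]), "deprel": cols[7]})
    return tokens

def diff_tokens(l2_tokens, l1_tokens):
    """List aligned token pairs whose UPOS, FEATS or DEPREL differ."""
    diffs = []
    for learner, target in zip(l2_tokens, l1_tokens):
        keys = sorted(set(learner["feats"]) | set(target["feats"]))
        changed = {k: (learner["feats"].get(k), target["feats"].get(k))
                   for k in keys
                   if learner["feats"].get(k) != target["feats"].get(k)}
        if changed or learner["upos"] != target["upos"] or learner["deprel"] != target["deprel"]:
            diffs.append((learner["form"], target["form"], changed))
    return diffs

def to_conllu(rows):
    """Join column tuples into tab-separated CoNLL-U lines."""
    return "\n".join("\t".join(row) for row in rows)

# Invented example pair: learner sentence "She go home" and correction "She goes home".
learner = to_conllu([
    ("1", "She", "she", "PRON", "_", "Case=Nom|Number=Sing|Person=3", "2", "nsubj", "_", "_"),
    ("2", "go", "go", "VERB", "_", "Mood=Ind|Tense=Pres|VerbForm=Fin", "0", "root", "_", "_"),
    ("3", "home", "home", "ADV", "_", "_", "2", "advmod", "_", "_"),
])
correction = to_conllu([
    ("1", "She", "she", "PRON", "_", "Case=Nom|Number=Sing|Person=3", "2", "nsubj", "_", "_"),
    ("2", "goes", "go", "VERB", "_", "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "0", "root", "_", "_"),
    ("3", "home", "home", "ADV", "_", "_", "2", "advmod", "_", "_"),
])

diffs = diff_tokens(parse_conllu(learner), parse_conllu(correction))
print(diffs)
```

In this toy pair, the only discrepancy is on the verb, where the correction carries `Number` and `Person` features that the learner form lacks, which is the kind of signal a subject-verb agreement pattern could be built from. Real learner data would also involve insertions, deletions and reorderings, so positional alignment is a simplification.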