DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue

William Held, Christopher Hidey, Fei Liu, Eric Y Zhu, Rahul Goel, Diyi Yang, Rushin Shah

Main: Multilingualism and Cross-Lingual NLP Main-poster Paper

Poster Session 2: Multilingualism and Cross-Lingual NLP (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 10, 14:00-15:30 (EDT) (America/Toronto)
Global Time: July 10, Poster Session 2 (18:00-19:30 UTC)
Keywords: code-switching, multilingualism, cross-lingual transfer, multilingual representations, multilingual evaluation
Languages: Spanish-English codemixing, Hindi-English codemixing, Spanish, French, German, Hindi, Thai
Abstract: Modern virtual assistants use internal semantic parsing engines to convert user utterances to actionable commands. However, prior work has demonstrated that multilingual models are less robust for semantic parsing compared to other tasks. In global markets such as India and Latin America, robust multilingual semantic parsing is critical as codeswitching between languages is prevalent for bilingual users. In this work we dramatically improve the zero-shot performance of a multilingual and codeswitched semantic parsing system using two stages of multilingual alignment. First, we show that contrastive alignment pretraining improves both English performance and transfer efficiency. We then introduce a constrained optimization approach for hyperparameter-free adversarial alignment during finetuning. Our Doubly Aligned Multilingual Parser (DAMP) improves mBERT transfer performance by 3x, 6x, and 81x on the Spanglish, Hinglish, and Multilingual Task Oriented Parsing benchmarks respectively and outperforms XLM-R and mT5-Large using 3.2x fewer parameters.
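For intuition, below is a minimal sketch of the general kind of contrastive alignment objective the abstract refers to: an InfoNCE-style loss that pulls encoder embeddings of parallel utterances (e.g., an English command and its translated or codeswitched counterpart) together while pushing apart non-parallel pairs. PyTorch, the function name contrastive_alignment_loss, and the temperature value are assumptions for illustration only; this is not the paper's exact formulation, and the adversarial finetuning stage with constrained optimization is not shown.

```python
# Illustrative sketch of a contrastive alignment objective over parallel
# sentence pairs. Names and hyperparameters are hypothetical, not taken
# from the DAMP paper.
import torch
import torch.nn.functional as F


def contrastive_alignment_loss(src_emb: torch.Tensor,
                               tgt_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss aligning embeddings of parallel utterances.

    src_emb, tgt_emb: (batch, dim) sentence embeddings from the encoder,
    where row i of src_emb and row i of tgt_emb are parallel (translations
    or codeswitched variants of the same command).
    """
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature          # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric cross-entropy: each source should match its own target and vice versa.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))


if __name__ == "__main__":
    # Random tensors stand in for encoder outputs (e.g., mBERT [CLS] embeddings).
    en = torch.randn(8, 768)   # English utterances
    xx = torch.randn(8, 768)   # their translated / codeswitched counterparts
    print(contrastive_alignment_loss(en, xx).item())
```

In a pretraining setup of this kind, the loss would typically be minimized over large batches of bitext so that the encoder maps semantically equivalent utterances from different languages to nearby points, which is the property the zero-shot transfer results rely on.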