oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes

Daniel Campos, Alexandre Marques, Mark Kurtz, Cheng Xiang Zhai

The Fourth Workshop on Simple and Efficient Natural Language Processing (Long Paper)

TLDR:
Abstract: