Lessons on Parameter Sharing across Layers in Transformers

Sho Takase, Shun Kiyono

The Fourth Workshop on Simple and Efficient Natural Language Processing, Long Paper

TLDR:
Abstract: