Lessons on Parameter Sharing across Layers in Transformers
Sho Takase, Shun Kiyono
The Fourth Workshop on Simple and Efficient Natural Language Processing, Long Paper
TLDR:
Abstract: