Unsupervised Subtitle Segmentation with Masked Language Models

David Ponce, Thierry Etchegoyhen, Victor Ruiz

Main: NLP Applications Main-poster Paper

Poster Session 4: NLP Applications (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 11, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 11, Poster Session 4 (15:00-16:30 UTC)
Languages: spanish, french, german, italian, dutch, portuguese, romanian
TLDR: We describe a novel unsupervised approach to subtitle segmentation, based on pretrained masked language models, where line endings and subtitle breaks are predicted according to the likelihood of punctuation to occur at candidate segmentation points. Our approach obtained competitive results in term...
You can open the #paper-P3273 channel in a separate window.
Abstract: We describe a novel unsupervised approach to subtitle segmentation, based on pretrained masked language models, where line endings and subtitle breaks are predicted according to the likelihood of punctuation to occur at candidate segmentation points. Our approach obtained competitive results in terms of segmentation accuracy across metrics, while also fully preserving the original text and complying with length constraints. Although supervised models trained on in-domain data and with access to source audio information can provide better segmentation accuracy, our approach is highly portable across languages and domains and may constitute a robust off-the-shelf solution for subtitle segmentation.