Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk

Jianquan Li; XiangBo Wu; Xiaokang Liu; Qianqian Xie; Prayag Tiwari; Benyou Wang

Main: Resources and Evaluation Main-poster Paper

Session 7: Resources and Evaluation (Virtual Poster)

Conference Room: Pier 7&8

Conference Time: July 12, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 12, Session 7 (15:00-16:30 UTC)

Keywords: corpus creation

Languages: Chinese

Abstract: Language is the principal tool for human communication, and humor is one of its most attractive parts. Producing human-like natural language with computers, a.k.a. Natural Language Generation (NLG), has been widely used in dialogue systems, chatbots, and machine translation, as well as in computer-aided creation, e.g., idea generation and scriptwriting. However, the humor aspect of natural language is relatively under-investigated, especially in the age of pre-trained language models (PLMs). In this work, we aim to preliminarily test *whether NLG can generate humor as humans do*. We build the largest dataset of **C**hinese **C**omical **C**rosstalk scripts (called **C**³ for short), covering a popular Chinese performing art called 'Xiangsheng' or '相声' that has been performed since the 1800s. We benchmark various generation approaches, including Seq2seq models trained from scratch, fine-tuned medium-scale PLMs, and large-scale PLMs (with and without fine-tuning). Moreover, we conduct a human assessment, showing that 1) *large-scale pretraining largely improves crosstalk generation quality*; and 2) *even the scripts generated by the best PLM are far from what we expect*. We conclude that humor generation could be largely improved using large-scale PLMs, but it is still in its infancy. The data and benchmarking code are publicly available at [https://github.com/anonNo2/crosstalk-generation](https://github.com/anonNo2/crosstalk-generation).