Narrative Understanding

Organizers: Nader Akoury, Faeze Brahman, Khyathi Chandu, Snigdha Chaturvedi, Elizabeth Clark, Mohit Iyyer

This is the 5th iteration of the Narrative Understanding Workshop, which brings together an interdisciplinary group of researchers from AI, ML, NLP, Computer Vision and other related fields, as well as scholars from the humanities to discuss methods to improve automatic narrative understanding capabilities. The workshop will consist of talks from invited speakers, a panel of researchers and writers, and talks and posters from accepted papers.

Workshop Papers

Evaluation Metrics for Depth and Flow of Knowledge in Non-fiction Narrative Texts
Authors: Sachin Pawar, Girish Palshikar, Ankita Jain, Mahesh Singh, Mahesh Rangarajan, Aman Agarwal, Vishal Kumar, Karan Singh

In this paper, we describe the problem of automatically evaluating the quality of knowledge expressed in a non-fiction narrative text. We focus on a specific type of document, in which each document describes a technical problem and its solution. The goal is not only to evaluate the quality of knowledge in such a document, but also to automatically suggest possible improvements to the writer so that a better, knowledge-rich document is produced. We propose new evaluation metrics to evaluate the quality of knowledge content as well as the flow of different types of sentences. The suggestions for improvement are generated based on these metrics. The proposed metrics are completely unsupervised and are derived from a set of simple corpus statistics. We demonstrate the effectiveness of the proposed metrics as compared to existing baseline metrics in our experiments.

Modeling Readers' Appreciation of Literary Narratives Through Sentiment Arcs and Semantic Profiles
Authors: Pascale Moreira, Yuri Bizzoni, Kristoffer Nielbo, Ida Marie Lassen, Mads Thomsen

Predicting literary quality and reader appreciation of narrative texts are highly complex challenges in quantitative and computational literary studies due to the fluid definitions of quality and the vast feature space that can be considered when modeling a literary work. This paper investigates the potential of sentiment arcs combined with topical-semantic profiling of literary narratives as indicators for their literary quality. Our experiments focus on a large corpus of 19th- and 20th-century English-language literary fiction, using GoodReads' ratings as an imperfect approximation of the diverse range of reader evaluations and preferences. By leveraging a stacked ensemble of regression models, we achieve promising performance in predicting average readers' scores, indicating the potential of our approach in modeling literary quality.

Word Category Arcs in Literature Across Languages and Genres
Authors: Winston Wu, Lu Wang, Rada Mihalcea

Word category arcs measure the progression of word usage across a story. Previous work on arcs has explored structural and psycholinguistic arcs through the course of narratives, but so far it has been limited to English narratives and a narrow set of word categories covering binary emotions and cognitive processes. In this paper, we expand on previous work by (1) introducing a novel, general approach to quantitatively analyze word usage arcs for any word category through a combination of clustering and filtering; and (2) exploring narrative arcs in literature in eight different languages across multiple genres. Through multiple experiments and analyses, we quantify the nature of narratives across languages, corroborating existing work on monolingual narrative arcs as well as drawing new insights about the interpretation of arcs through correlation analyses.

The Candide model: How narratives emerge where observations meet beliefs
Authors: Paul Van Eecke, Lara Verheyen, Tom Willaert, Katrien Beuls

This paper presents the Candide model as a computational architecture for modelling human-like, narrative-based language understanding. The model starts from the idea that narratives emerge through the process of interpreting novel linguistic observations, such as utterances, paragraphs and texts, with respect to previously acquired knowledge and beliefs. Narratives are personal, as they are rooted in past experiences, and constitute perspectives on the world that might motivate different interpretations of the same observations. Concretely, the Candide model operationalises this idea by dynamically modelling the belief systems and background knowledge of individual agents, updating these as new linguistic observations come in, and exposing them to a logic reasoning engine that reveals the possible sources of divergent interpretations. Apart from introducing the foundational ideas, we also present a proof-of-concept implementation that demonstrates the approach through a number of illustrative examples.

What is Wrong with Language Models that Can Not Tell a Story?
Authors: Ivan Yamshchikov, Alexey Tikhonov

In this position paper, we contend that advancing our understanding of narrative and the effective generation of longer, subjectively engaging texts is crucial for progress in modern Natural Language Processing (NLP) and potentially the broader field of Artificial Intelligence. We highlight the current lack of appropriate datasets, evaluation methods, and operational concepts necessary for initiating work on narrative processing.

Transferring Procedural Knowledge across Commonsense Tasks
Authors: Yifan Jiang, Filip Ilievski, Kaixin Ma

Stories about everyday situations are an essential part of human communication, motivating the need to develop AI agents that can reliably understand these stories. Despite the long list of supervised methods for story completion and procedural understanding, current AI has no mechanisms to automatically track and explain procedures in unseen stories. To bridge this gap, we study the ability of AI models to transfer procedural knowledge to novel narrative tasks in a transparent manner. We design LEAP: a comprehensive framework that integrates state-of-the-art modeling architectures, training regimes, and augmentation strategies based on both natural and synthetic stories. To address the lack of densely annotated training data, we devise a robust automatic labeler based on few-shot prompting to enhance the augmented data. Our experiments with in- and out-of-domain tasks reveal insights into the interplay of different architectures, training regimes, and augmentation strategies. LEAP's labeler has a clear positive impact on out-of-domain datasets, while the resulting dense annotation provides native explainability.

Unsupervised Task Graph Generation from Instructional Video Transcripts
Authors: Lajanugen Logeswaran, Sungryull Sohn, Yunseok Jang, Moontae Lee, Honglak Lee

This work explores the problem of generating task graphs of real-world activities. Different from prior formulations, we consider a setting where text transcripts of instructional videos performing a real-world activity (e.g., making coffee) are provided and the goal is to identify the key steps relevant to the task as well as the dependency relationships between these key steps. We propose a novel task graph generation approach that combines the reasoning capabilities of instruction-tuned language models with clustering and ranking components to generate accurate task graphs in a completely unsupervised manner. We show that the proposed approach generates more accurate task graphs compared to a supervised learning approach on tasks from the ProceL and CrossTask datasets.

Novalign: Neural Cross-Lingual Sentence Alignment for Novels
Authors: Francesco Molfese, Andrei Stefan Bejgu, Simone Tedeschi, Roberto Navigli

Sentence alignment -- establishing links between corresponding sentences in two related documents -- is as important in paraphrase generation as it is in machine translation. Despite its applicability, its benefits are often overlooked in the context of narrative understanding. For instance, it can be leveraged in cross-lingual story analysis and cultural analytics. This includes identifying similarities and differences between narratives across languages, or understanding narrative structures in a more comprehensive way. To bridge this gap, we introduce a novel methodology for sentence alignment designed specifically for novels. In particular, we propose Novalign, an end-to-end, fully-neural architecture that maps source and target sentences based on their contextualized sentence embeddings. We extensively evaluate Novalign on a new, multilingual dataset derived from the Opus project consisting of 20 language pairs, and demonstrate that our model achieves state-of-the-art performance. To ensure reproducibility, we release our code and model checkpoints at omitted.link.

Story Settings: A Dataset
Authors: Kaley Rittichier

Understanding the settings of a given story has long been viewed as an essential component of understanding the story at large. This significance is underscored not only in academic literary analysis but also in kindergarten education. Despite this significance, setting has received relatively little attention in computational analyses of stories. This paper presents a dataset of 2,302 works labeled with time-period settings and 6,991 works labeled with location settings. This dataset aims to help with Cultural Analytics of literary works but may also aid in time-period-related questions within literary Q&A systems.

An Analysis of Reader Engagement in Literary Fiction through Eye Tracking and Linguistic Features
Authors: Rose Neis, Karin De Langis, Zae Myung Kim, Dongyeop Kang

Capturing readers' engagement in fiction is a challenging but important aspect of narrative understanding. In this study, we collected 23 readers' reactions to 2 short stories through eye tracking, sentence-level annotations, and an overall engagement scale survey. We analyzed the significance of various qualities of the text in predicting how engaging a reader is likely to find it. As enjoyment of fiction is highly contextual, we also investigated individual differences in our data. Furthering our understanding of what captivates readers in fiction will help better inform models used in creative narrative generation and collaborative writing tools.

Identifying Visual Depictions of Animate Entities in Narrative Comics: An Annotation Study
Authors: Lauren Edlin, Joshua Reiss

Animate entities in narrative comics stories are expressed through a number of visual representations across panels. Identifying these entities is necessary for recognizing characters, analysing narrative affordances unique to comics, and integrating these with linguistic reference annotation; however, an annotation process for animate entity identification has not received adequate attention. This research explores methods for identifying animate entities visually in comics using annotation experiments. Two rounds of inter-annotator agreement experiments are run: the first asks annotators to outline areas on comic pages using a Polygon segmentation tool, and the second prompts annotators to assign each outlined entity's animacy type to derive a quantitative measure of agreement. The results of the first experiment show that Polygon-based outlines successfully produce a qualitative measure of agreement; the second experiment supports the view that animacy status is best conceptualised as a graded, rather than binary, concept.

Mrs. Dalloway Said She Would Segment the Chapters Herself
Authors: Peiqi Sui, Lin Wang, Sil Hamilton, Thorsten Ries, Kelvin Wong, Stephen Wong

This paper proposes a sentiment-centric pipeline to perform unsupervised plot extraction on non-linear novels like Virginia Woolf's Mrs. Dalloway, a novel widely considered to be "plotless." Combining transformer-based sentiment analysis models with statistical testing, we model sentiment's rate-of-change and correspondingly segment the novel into emotionally self-contained units qualitatively evaluated to be meaningful surrogate pseudo-chapters. We validate our findings by evaluating our pipeline as a fully unsupervised text segmentation model, achieving an F-1 score of 0.643 (regional) and 0.214 (exact) in chapter break prediction on a validation set of linear novels with existing chapter structures. In addition, we observe notable differences between the distributions of predicted chapter lengths in linear and non-linear fictional narratives, with the latter exhibiting significantly greater variability. Our results hold significance for narrative researchers appraising methods for extracting plots from non-linear novels.

Composition and Deformance: Measuring Imageability with a Text-to-Image Model
Authors: Si Wu, David Smith

Although psycholinguists and psychologists have long studied the tendency of linguistic strings to evoke mental images in hearers or readers, most computational studies have applied this concept of imageability only to isolated words. Using recent developments in text-to-image generation models, such as DALLE mini, we propose computational methods that use generated images to measure the imageability of both single English words and connected text. We sample text prompts for image generation from three corpora: human-generated image captions, news article sentences, and poem lines. We subject these prompts to different deformances to examine the model's ability to detect changes in imageability caused by compositional change. We find high correlation between the proposed computational measures of imageability and human judgments of individual words. We also find the proposed measures more consistently respond to changes in compositionality than baseline approaches. We discuss possible effects of model training and implications for the study of compositionality in text-to-image models.

Dramatic Conversation Disentanglement
Authors: Kent Chang, Danica Chen, David Bamman

We present a new dataset for studying conversation disentanglement in movies and TV series. While previous work has focused on conversation disentanglement in IRC chatroom dialogues, movies and TV shows provide a space for studying complex pragmatic patterns of floor and topic change in face-to-face multi-party interactions. In this work, we draw on theoretical research in sociolinguistics, sociology, and film studies to operationalize a conversational thread (including the notion of a floor change) in dramatic texts, and use that definition to annotate a dataset of 10,033 dialogue turns (comprising 2,209 threads) from 831 movies. We compare the performance of several disentanglement models on this dramatic dataset, and apply the best-performing model to disentangle 808 movies. We see that, contrary to expectation, average thread lengths do not decrease significantly over the past 40 years, and characters portrayed by actors who are women, while underrepresented, initiate more new conversational threads relative to their speaking time.

Echoes from Alexandria: A Large Resource for Multilingual Book Summarization
Authors: Alessandro Scirè, Simone Conia, Simone Ciciliano, Roberto Navigli

In recent years, research in text summarization has mainly focused on the news domain, where texts are typically short and have strong layout features. The task of full-book summarization presents additional challenges which are hard to tackle with current resources, due to their limited size and availability in English only. To overcome these limitations, we present "Echoes from Alexandria", or in shortened form, "Echoes", a large resource for multilingual book summarization. Echoes features three novel datasets: i) Echo-Wiki, for multilingual book summarization, ii) Echo-XSum, for extremely-compressive multilingual book summarization, and iii) Echo-FairySum, for extractive book summarization. To the best of our knowledge, Echoes, with its thousands of books and summaries, is the largest resource, and the first to be multilingual, featuring 5 languages and 25 language pairs. In addition to Echoes, we also introduce a new extractive-then-abstractive baseline, and, supported by our experimental results and manual analysis of the summaries generated, we argue that this baseline is more suitable for book summarization than purely-abstractive approaches. We release our resource and software at https://github.com/Babelscape/echoes-from-alexandria in the hope of fostering innovative research in multilingual book summarization.

Narrative Cloze as a Training Objective: Towards Modeling Stories Using Narrative Chain Embeddings
Authors: Hans Ole Hatzel, Chris Biemann

We present a novel approach to modeling narratives using narrative chain embeddings. A new dataset of narrative chains extracted from German news texts is presented. With neural methods, we produce models for both German and English that achieve state-of-the-art performance on the Multiple Choice Narrative Cloze task. Subsequently, we perform an extrinsic evaluation of the embeddings our models produce and show that they perform rather poorly in identifying narratively similar texts. We explore some of the reasons for this underperformance and discuss the upsides of our approach. We provide an outlook on alternative ways to model narratives, as well as techniques for evaluating such models.

What's New? Identifying the Unfolding of New Events in a Narrative
Authors: Seyed Mahed Mousavi, Shohei Tanaka, Gabriel Roccabruna, Koichiro Yoshino, Satoshi Nakamura, Giuseppe Riccardi

Narratives include a rich source of events unfolding over time and context. Automatic understanding of these events provides a summarised comprehension of the narrative for further computation (such as reasoning). In this paper, we study the Information Status (IS) of the events and propose a novel challenging task: the automatic identification of new events in a narrative. We define an event as a triplet of subject, predicate, and object. The event is categorized as new with respect to the discourse context and whether it can be inferred through commonsense reasoning. We annotated a publicly available corpus of narratives with the new events at sentence level using human annotators. We present the annotation protocol and study the quality of the annotation and the difficulty of the task. We publish the annotated dataset, annotation materials, and machine learning baseline models for the task of new event extraction for narrative understanding.

Emotion and Modifier in Henry Rider Haggard's Novels
Authors: Salim Sazzed

In recent years, there has been a growing scholarly interest in employing quantitative methods to analyze literary texts, as they offer unique insights, theories, and interpretations. In light of this, the current study employs quantitative analysis to examine the fiction written by the renowned British adventure novelist, Sir Henry Rider Haggard. Specifically, the study aims to investigate the affective content and prevalence of distinctive linguistic features in six of Haggard's most distinguished works. We evaluate dominant emotional states at the sentence level and investigate the deployment of specific linguistic features such as modifiers, deontic modals, and collocated terms. Through sentence-level emotion analysis, the findings reveal a notable prevalence of "joy"-related emotions across the novels. Furthermore, the study observes that intensifiers are employed more commonly than mitigators as modifiers, and that the collocated terms of modifiers exhibit high similarity across the novels. By integrating quantitative analyses with qualitative assessments, this study presents a novel perspective on the patterns of emotion and specialized grammatical features in some of Haggard's most celebrated literary works.

Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding
Authors: Yidan Sun, Qin Chao, Yangfeng Ji, Boyang Li

Despite recent advances in AI, story understanding remains an open and under-investigated problem. We collect, preprocess, and publicly release a video-language story dataset, Synopses of Movie Narratives (SyMoN), containing 5,193 video summaries of popular movies and TV series with a total length of 869 hours. SyMoN captures naturalistic storytelling videos made by human creators and intended for a human audience. As a prototypical and naturalistic story dataset, SyMoN features high coverage of multimodal story events. Its use of storytelling techniques causes cross-domain semantic gaps that provide appropriate challenges to existing models. We establish benchmarks on video-text retrieval and zero-shot alignment on movie summary videos, which showcase the importance of in-domain data and long-term memory in story understanding. With SyMoN, we hope to lay the groundwork for progress in multimodal story understanding.

ACL 2023

© 2023 Association for Computational Linguistics