Learning Action Conditions from Instructional Manuals for Instruction Understanding

Te-Lin Wu, Caiqi Zhang, Qingyuan Hu, Alexander Spangher, Nanyun Peng

Main Track: Semantics: Sentence-level Semantics, Textual Inference, and Other Areas (Main-poster Paper)

Poster Session 1: Semantics: Sentence-level Semantics, Textual Inference, and Other Areas (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 10, Poster Session 1 (15:00-16:30 UTC)
Keywords: reasoning
TLDR: Inferring the pre- and postconditions of an action is vital for comprehending complex instructions and for applications such as autonomous instruction-guided agents and assistive AI. We propose action condition inference, a task that extracts mentions of pre- and postconditions of actions in instructional manuals, together with a weakly supervised approach and a densely human-annotated evaluation dataset.
Abstract: The ability to infer pre- and postconditions of an action is vital for comprehending complex instructions, and is essential for applications such as autonomous instruction-guided agents and assistive AI that supports humans in performing physical tasks. In this work, we propose a task dubbed action condition inference, which extracts mentions of preconditions and postconditions of actions in instructional manuals. We propose a weakly supervised approach that utilizes automatically constructed large-scale training instances from online instructions, and curate a densely human-annotated and validated dataset to study how well current NLP models perform on the proposed task. We design two types of models that differ in whether contextualized and global information is leveraged, as well as various combinations of heuristics for constructing the weak supervision. Our experiments show a >20% F1-score improvement from considering the entire instruction context and a further >6% F1-score gain from the proposed heuristics. However, the best-performing model still falls well behind human performance.
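
To make the task concrete, below is a minimal, hypothetical sketch in Python of what an action condition inference instance could look like. The dataclass fields, the span format, and the "until"-clause heuristic are illustrative assumptions for exposition only, not the paper's actual annotation schema or weak-supervision rules.

```python
# A minimal, hypothetical sketch of an "action condition inference" instance.
# Field names, span format, and the heuristic below are illustrative
# assumptions, not the paper's actual schema or rules.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConditionMention:
    text: str              # surface form of the condition mention
    kind: str              # "precondition" or "postcondition"
    span: Tuple[int, int]  # character offsets into the instruction context


@dataclass
class ActionConditionInstance:
    context: str           # the instruction text surrounding the action
    action: str            # the target action mention
    conditions: List[ConditionMention] = field(default_factory=list)


# Toy instruction step: infer the conditions of "whisk the eggs".
step = ("Once the butter has melted, whisk the eggs in the pan "
        "until they are fully scrambled.")


def span_of(text: str, mention: str) -> Tuple[int, int]:
    """Character offsets of the first occurrence of `mention` in `text`."""
    start = text.index(mention)
    return start, start + len(mention)


instance = ActionConditionInstance(
    context=step,
    action="whisk the eggs",
    conditions=[
        ConditionMention("the butter has melted", "precondition",
                         span_of(step, "the butter has melted")),
        ConditionMention("they are fully scrambled", "postcondition",
                         span_of(step, "they are fully scrambled")),
    ],
)


# One plausible (purely illustrative) weak-supervision heuristic: a clause
# introduced by "until" after the action often describes a postcondition.
def until_clause(text: str) -> Optional[str]:
    marker = " until "
    return text.split(marker, 1)[1].rstrip(".") if marker in text else None


print(until_clause(step))  # -> "they are fully scrambled"
```

In this toy example both conditions happen to appear in the same sentence as the action; in general, condition mentions may be scattered across other steps of a manual, which is presumably why the context-aware models described in the abstract fare substantially better than those restricted to local information.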