Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, Benjamin Van Durme
Main: Language Grounding to Vision, Robotics, and Beyond Main-oral Paper
Session 2: Language Grounding to Vision, Robotics, and Beyond (Oral)
Conference Room: Pier 4&5
Conference Time: July 10, 14:00-15:30 (EDT) (America/Toronto)
Global Time: July 10, Session 2 (18:00-19:30 UTC)
Keywords:
vision question answering
Abstract:
Natural language is ambiguous. Resolving ambiguous questions is key to successfully answering them.
Focusing on questions about images, we create a dataset of ambiguous examples. We annotate these, grouping answers by the underlying question they address and rephrasing the question for each group to reduce ambiguity.
Our analysis reveals a linguistically-aligned ontology of reasons for ambiguity in visual questions.
We then develop an English question-generation model which, as we demonstrate via automatic and human evaluation, produces less ambiguous questions.
We further show that the question generation objective we use allows the model to integrate answer group information without any direct supervision.