ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems
Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng
Main: Dialogue and Interactive Systems Main-poster Paper
Poster Session 6: Dialogue and Interactive Systems (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 12, 09:00-10:30 (EDT) (America/Toronto)
Global Time: July 12, Poster Session 6 (13:00-14:30 UTC)
Keywords:
evaluation and metrics
TLDR:
Commonsense reasoning is omnipresent in human communications and thus is an important feature for open-domain dialogue systems.
However, evaluating commonsense in dialogue systems is still an open challenge. We take the first step by focusing on event commonsense that considers events and their relations, and is crucial in both dialogues and general commonsense reasoning.
Abstract:
Commonsense reasoning is omnipresent in human communications and thus is an important feature for open-domain dialogue systems.
However, evaluating commonsense in dialogue systems is still an open challenge. We take the first step by focusing on event commonsense that considers events and their relations, and is crucial in both dialogues and general commonsense reasoning.
We propose ACCENT, an event commonsense evaluation metric empowered by commonsense knowledge bases (CSKBs). ACCENT first extracts event-relation tuples from a dialogue, and then evaluates the response by scoring the tuples in terms of their compatibility with the CSKB.
To evaluate ACCENT, we construct the first public event commonsense evaluation dataset for open-domain dialogues.
Our experiments show that ACCENT is an efficient metric for event commonsense evaluation, which achieves higher correlations with human judgments than existing baselines.
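The abstract describes a two-step pipeline: extract event-relation tuples from the dialogue, then score each tuple's compatibility with a CSKB and aggregate the scores into a response-level rating. The sketch below illustrates only that flow; it is not the ACCENT implementation. The tuple extractor, the exact-match compatibility scorer, the toy CSKB, and all identifiers are hypothetical stand-ins (the relation names are merely ATOMIC-style examples), whereas the paper uses learned models for both steps.

```python
# Illustrative toy sketch of a two-step "extract tuples, then score against a
# CSKB" pipeline. Everything here is a simplified stand-in, not ACCENT itself.
from dataclasses import dataclass
from typing import List, Set, Tuple

# A commonsense tuple: (head event, relation, tail event).
EventTuple = Tuple[str, str, str]

# Toy CSKB: a few hand-written facts with ATOMIC-style relation names.
TOY_CSKB: Set[EventTuple] = {
    ("person x loses their job", "xReact", "sad"),
    ("person x loses their job", "xWant", "to find a new job"),
    ("person x wins the lottery", "xReact", "happy"),
}


@dataclass
class DialogueTurn:
    speaker: str
    text: str


def extract_event_tuples(history: List[DialogueTurn], response: str) -> List[EventTuple]:
    """Hypothetical stand-in for a tuple extractor.

    ACCENT extracts event-relation tuples with a trained model; this toy
    version returns a hard-coded example so the pipeline runs end to end.
    """
    return [("person x loses their job", "xReact", "sad")]


def tuple_compatibility(t: EventTuple, cskb: Set[EventTuple]) -> float:
    """Toy compatibility score: 1.0 if the tuple appears in the CSKB, else 0.0.

    ACCENT instead scores compatibility with the CSKB via a learned scorer;
    exact-match lookup is only for illustration.
    """
    return 1.0 if t in cskb else 0.0


def event_commonsense_score(history: List[DialogueTurn], response: str) -> float:
    """Average per-tuple compatibility scores into a response-level score."""
    tuples = extract_event_tuples(history, response)
    if not tuples:
        return 0.0
    return sum(tuple_compatibility(t, TOY_CSKB) for t in tuples) / len(tuples)


if __name__ == "__main__":
    history = [DialogueTurn("A", "I just lost my job.")]
    response = "Oh no, you must be really sad."
    print(f"event commonsense score: {event_commonsense_score(history, response):.2f}")
```

In this toy run the single extracted tuple matches the CSKB, so the response scores 1.0; a response whose tuples conflict with (or are absent from) the CSKB would score lower.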