[SRW] EvoGrad: An Online Platform for an Evolving Winograd Schema Challenge using Adversarial Human Perturbations

Jing Han Sun, Ali Emami

Student Research Workshop (SRW) Paper

Session 7: Student Research Workshop (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 12, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 12, Session 7 (15:00-16:30 UTC)
TLDR: Transformer-based language models have recently shown impressive performance on a number of common-sense reasoning tasks, such as the Winograd Schema Challenge (WSC), while continuing to struggle on task instances that are slightly re-worded or perturbed. In this paper, we...
Abstract: Transformer-based language models have recently shown impressive performance on a number of common-sense reasoning tasks, such as the Winograd Schema Challenge (WSC), while continuing to struggle on task instances that are slightly re-worded or perturbed. In this paper, we address these issues by re-framing the WSC as a never-ending learning, human-in-the-loop scenario devised specifically for perturbed pronoun co-reference resolution problems. We introduce EvoGrad, an open-source, user-friendly platform for the continual evaluation and development of models, based on iterations of human-adversarial perturbations. Given that common-sense knowledge varies across cultures and over time, our platform allows for communal contribution towards an evolving task that is both inclusive and accessible to wider societies. In addition, we propose a novel mechanism for developing new task instances and define a new metric, the Minimum Error Depth, to measure model stability on such dynamic tasks. We show that models fine-tuned on a small iteration of EvoGrad improve their performance on WSC-based tasks, indicating a promising synergy between the acquisition of common sense and the never-ending learning paradigm.