Labels are not necessary: Assessing peer-review helpfulness using domain adaptation based on self-training

Chengyuan Liu, Divyang Doshi, Muskaan Bhargava, Ruixuan Shang, Jialin Cui, Dongkuan Xu, Edward Gehringer

18th Workshop on Innovative Use of NLP for Building Educational Applications Paper

TLDR: A peer-assessment system allows students to provide feedback on each other's work. An effective peer assessment system urgently requires helpful reviews to facilitate students to make improvements and progress. Automated evaluation of review helpfulness, with the help of deep learning models and nat
You can open the #paper-BEA_24 channel in a separate window.
Abstract: A peer-assessment system allows students to provide feedback on each other's work. An effective peer assessment system urgently requires helpful reviews to facilitate students to make improvements and progress. Automated evaluation of review helpfulness, with the help of deep learning models and natural language processing techniques, gains much interest in the field of peer assessment. However, collecting labeled data with the "helpfulness" tag to build these prediction models remains challenging. A straightforward solution would be using a supervised learning algorithm to train a prediction model on a similar domain and apply it to our peer review domain for inference. But naively doing so can degrade the model performance in the presence of the distributional gap between domains. Such a distributional gap can be effectively addressed by Domain Adaptation (DA). Self-training has recently been shown as a powerful branch of DA to address the distributional gap. The first goal of this study is to evaluate the performance of self-training-based DA in predicting the helpfulness of peer reviews as well as the ability to overcome the distributional gap. Our second goal is to propose an advanced self-training framework to overcome the weakness of the existing self-training by tailoring knowledge distillation and noise injection, to further improve the model performance and better address the distributional gap.