KnowComp at SemEval-2023 Task 7: Fine-tuning Pre-trained Language Models for Clinical Trial Entailment Identification

Weiqi Wang, Baixuan Xu, Tianqing Fang, Lirong Zhang, Yangqiu Song

The 17th International Workshop on Semantic Evaluation (SemEval-2023), Task 7: Multi-Evidence Natural Language Inference for Clinical Trial Data

TLDR: In this paper, we present our system for the textual entailment identification task as a subtask of the SemEval-2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data. The entailment identification task aims to determine whether a medical statement affirms a valid entailment given a clinical trial premise or forms a contradiction with it.
Abstract: In this paper, we present our system for the textual entailment identification task as a subtask of the SemEval-2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data. The entailment identification task aims to determine whether a medical statement affirms a valid entailment given a clinical trial premise or forms a contradiction with it. Since the task is inherently a text classification task, we propose a system that performs binary classification given a statement and its associated clinical trial. Our proposed system leverages a human-defined prompt to aggregate the information contained in the statement, the section name, and the clinical trials. Pre-trained language models are then fine-tuned on the prompted input sentences to learn to discriminate the inference relation between the statement and the clinical trial. To validate our system, we conduct extensive experiments with a wide variety of pre-trained language models. Our best system is built on DeBERTa-v3-large, which achieves an F1 score of 0.764 and ranks fifth on the official leaderboard. Further analysis indicates that leveraging our designed prompt is effective, and that our model suffers from low recall. Our code and pre-trained models are available at [https://github.com/HKUST-KnowComp/NLI4CT](https://github.com/HKUST-KnowComp/NLI4CT).
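To make the pipeline concrete, below is a minimal sketch of the prompted-input construction and fine-tuning setup the abstract describes, assuming a HuggingFace Transformers workflow. The prompt template, the `build_prompt` helper, the toy example, and all hyperparameters are illustrative assumptions, not the authors' released code (see their repository for the actual implementation); only the DeBERTa-v3-large backbone and the binary entailment/contradiction objective come from the abstract.

```python
# Illustrative sketch of the system described above: aggregate the statement,
# section name, and clinical trial into one prompted input, then fine-tune a
# pre-trained language model as a binary classifier. The prompt template and
# hyperparameters below are hypothetical, not the paper's exact choices.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "microsoft/deberta-v3-large"  # best-performing backbone per the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # binary: entailment vs. contradiction
)

def build_prompt(statement: str, section_name: str, trial_text: str) -> str:
    """Hypothetical prompt joining the three input fields into one sentence."""
    return (
        f"Statement: {statement} "
        f"Section: {section_name} "
        f"Clinical trial: {trial_text}"
    )

# Toy example: a single (statement, trial) pair labeled as entailment (1).
text = build_prompt(
    "Patients in cohort 1 received a higher dose than patients in cohort 2.",
    "Intervention",
    "Cohort 1: 20 mg of the study drug daily. Cohort 2: 10 mg daily.",
)
inputs = tokenizer(text, truncation=True, return_tensors="pt")
labels = torch.tensor([1])

# One fine-tuning step; in practice this loops over the full NLI4CT training set.
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```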