BpHigh at SemEval-2023 Task 7: Can Fine-tuned Cross-encoders Outperform GPT-3.5 in NLI Tasks on Clinical Trial Data?

Bhavish Pahwa; Bhavika Pahwa

BpHigh at SemEval-2023 Task 7: Can Fine-tuned Cross-encoders Outperform GPT-3.5 in NLI Tasks on Clinical Trial Data?

Bhavish Pahwa, Bhavika Pahwa

Add to Favorites

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 7: multi-evidence natural language inference for clinical trial data Paper

TLDR: Many nations and organizations have begun collecting and storing clinical trial records for storage and analytical purposes so that medical and clinical practitioners can refer to them on a centralized database over the internet and stay updated with the current clinical information. The amount of c

RocketChat
Abstract

You can open the #paper-SemEval_292 channel in a separate window.

Abstract: Many nations and organizations have begun collecting and storing clinical trial records for storage and analytical purposes so that medical and clinical practitioners can refer to them on a centralized database over the internet and stay updated with the current clinical information. The amount of clinical trial records have gone through the roof, making it difficult for many medical and clinical practitioners to stay updated with the latest information. To help and support medical and clinical practitioners, there is a need to build intelligent systems that can update them with the latest information in a byte-sized condensed format and, at the same time, leverage their understanding capabilities to help them make decisions. This paper describes our contribution to SemEval 2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT). Our results show that there is still a need to build domain-specific models as smaller transformer-based models can be finetuned on that data and outperform foundational large language models like GPT-3.5. We also demonstrate how the performance of GPT-3.5 can be increased using few-shot prompting by leveraging the semantic similarity of the text samples and the few-shot train snippets. We will also release our code and our models on open source hosting platforms, GitHub and HuggingFace.