Large Language Models as Instructors: A Study on Multilingual Clinical Entity Extraction
Simon Meoni, Eric De la Clergerie, Theo Ryffel
BioNLP and BioNLP-ST 2023, Short paper
TLDR:
In clinical and other specialized domains, data are scarce due to their confidential nature. This lack of data is a major problem when fine-tuning language models. Nevertheless, very large language models (LLMs) are promising for the medical domain but cannot be used directly in healthcare facilities due to data confidentiality issues.
Abstract:
In clinical and other specialized domains, data are scarce due to their confidential nature. This lack of data is a major problem when fine-tuning language models. Nevertheless, very large language models (LLMs) are promising for the medical domain but cannot be used directly in healthcare facilities due to data confidentiality issues. We explore an approach in which LLMs annotate training data that is then used to train smaller models better suited to our problem. We show that this method yields promising results for information extraction tasks.