Biomedical Document Classification with Literature Graph Representations of Bibliographies and Entities
Ryuki Ida, Makoto Miwa, Yutaka Sasaki
BioNLP and BioNLP-ST 2023 Long paper
TLDR:
This paper proposes a new document classification method that incorporates the representations of a literature graph created from bibliographic and entity information.
Abstract:
This paper proposes a new document classification method that incorporates the representations of a literature graph created from bibliographic and entity information. Recently, document classification performance has been significantly improved with large pre-trained language models; however, some documents remain difficult to classify. External information, such as bibliographic information, citation links, descriptions of entities, and medical taxonomies, has been considered one of the keys to dealing with such documents. Although several document classification methods using external information have been proposed, they consider only limited relationships, e.g., word co-occurrence and citation relationships, even though multiple types of external information are available. To overcome this limitation of the conventional use of external information, we propose a document classification model that simultaneously considers bibliographic and entity information to deeply model the relationships among documents using the representations of the literature graph. The experimental results show that, with the help of the literature graph, our proposed method outperforms existing methods on two document classification datasets in the biomedical domain.