Named entity tagging in the health domain
DOI:
https://doi.org/10.26876/ikergazte.v.03.12
Keywords:
Named Entity Recognition, language models, Wikidata, medicine
Abstract
This work has two objectives: first, it identifies named entities using transformer-based language models; second, it links the identified clinical entities to the diseases and symptoms in the Wikidata knowledge base. To identify the entities, experiments were performed on the MedMentions biomedical corpus with one general pre-trained BERT language model (BERT small) and two specialised BERT models (BiomedNLP-PubMedBERT and BioBERT). When assessing whether a sequence of tokens constitutes a medical entity, an F1 score of 0.819 was obtained; when assessing the specific class to which the entity belongs, the F1 score was 0.62. In addition, a recall close to 50% was achieved in a first attempt to link the identified entities to Wikidata using the Levenshtein distance.
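As an illustration of the linking step, the following is a minimal sketch of Levenshtein-based matching between a recognised mention and a set of candidate labels. It is not the authors' implementation: the function names, the `max_distance` threshold, the lower-casing step, and the placeholder identifiers (`ID_1`, `ID_2`) are all assumptions made for this example, and a real system would retrieve candidate labels from Wikidata rather than from a hard-coded dictionary.

```python
from typing import Dict, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def link_entity(
    mention: str,
    candidates: Dict[str, str],
    max_distance: int = 2,  # hypothetical threshold, not taken from the paper
) -> Optional[Tuple[str, str]]:
    """Return the (identifier, label) pair closest to the mention, or None."""
    best = None
    best_distance = max_distance + 1
    for identifier, label in candidates.items():
        distance = levenshtein(mention.lower(), label.lower())
        if distance < best_distance:
            best, best_distance = (identifier, label), distance
    return best


# Placeholder identifiers; these are not real Wikidata item IDs.
example_candidates = {"ID_1": "asthma", "ID_2": "anemia"}
match = link_entity("astma", example_candidates)
```

In this sketch a mention is left unlinked when no label falls within the threshold, which is one simple way such a matcher trades recall for precision.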
License
Copyright (c) 2023 IkerGazte. Nazioarteko ikerketa euskaraz

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
