Named entity tagging in the health domain (Osasun-arloko entitate izendunen etiketatzea)

Paula Ontalvilla; Aitziber Atutxa; Maite Oronoz

doi:10.26876/ikergazte.v.03.12

Authors

Paula Ontalvilla University of the Basque Country, UPV/EHU
Aitziber Atutxa University of the Basque Country, UPV/EHU
Maite Oronoz University of the Basque Country, UPV/EHU

DOI:

https://doi.org/10.26876/ikergazte.v.03.12

Keywords:

Named Entity Recognition, language models, Wikidata, medicine

Abstract

This work has two objectives: first, to identify named entities using transformer-based language models and, second, to link the identified clinical entities to the diseases and symptoms in the Wikidata knowledge base. To identify the entities, experiments were performed on the MedMentions biomedical corpus with a general pre-trained BERT language model (BERT small) and two specialised BERT models (BiomedNLP-PubMedBERT and BioBERT). When assessing whether a sequence of tokens constitutes a medical entity, an F1 score of 0.819 was obtained; when assessing the specific class to which the entity belongs, an F1 score of 0.62 was obtained. In addition, a recall close to 50% was achieved in a first attempt to link the recognised entities to Wikidata using the Levenshtein distance.
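The linking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each recognised mention is compared against a table of knowledge-base labels and linked to the closest one within a distance threshold. The function names, the `max_distance` threshold, the lowercasing step and the toy label table with placeholder identifiers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def link_mention(mention: str, labels: Dict[str, str],
                 max_distance: int = 2) -> Optional[str]:
    """Return the id of the closest label, or None if none is close enough.

    `max_distance` is an illustrative threshold, not a value from the paper.
    """
    best_id, best_dist = None, max_distance + 1
    for item_id, label in labels.items():
        dist = levenshtein(mention.lower(), label.lower())
        if dist < best_dist:
            best_id, best_dist = item_id, dist
    return best_id


# Toy label table with placeholder ids (not real Wikidata identifiers).
toy_labels = {"item-1": "asthma", "item-2": "migraine", "item-3": "fever"}
linked = link_mention("Asthmma", toy_labels)
unlinked = link_mention("hypertension", toy_labels)
```

In this sketch a misspelled mention such as "Asthmma" is still linked to the "asthma" entry, while a mention with no sufficiently close label stays unlinked. Unlinked mentions of this kind lower recall.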

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
