The First Automatic C1 Evaluator for Basque (Euskarazko lehen C1 ebaluatzaile automatikoa)

Authors

  • Ekhi Azurmendi, HiTZ Center -- Ixa, University of the Basque Country (UPV/EHU)
  • Xabier Arregi, HiTZ Center -- Ixa, University of the Basque Country (UPV/EHU)
  • Oier Lopez de Lacalle, HiTZ Center -- Ixa, University of the Basque Country (UPV/EHU)

DOI:

https://doi.org/10.26876/ikergazte.vi.03.15

Keywords:

artificial intelligence, machine learning, language models, automatic essay evaluation

Abstract

This article presents an automatic evaluator that determines whether texts written in Basque meet the C1 level. The system was trained on 10,000 transcribed essays obtained through an agreement between HABE and HiTZ. To analyze the potential impact of essay topics, we set up training in two ways: with texts from a single exam period and with texts from two exam periods. As baselines, we trained two language models for Basque, RoBERTa and Latxa. We then explored several techniques to address data scarcity, prevent overfitting, and improve performance: EDA, SCL, and regularization. Finally, we analyzed the behavior of the resulting systems to measure model calibration and the impact of artifacts.
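
The abstract does not say which calibration measure was used. As an illustration only, the sketch below computes expected calibration error (ECE), a common choice for binary classifiers such as a C1 / not-C1 evaluator. The function name, the equal-width binning, and the default of 10 bins are assumptions made for this example and are not taken from the article.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE with equal-width bins (hypothetical helper).

    confidences: model confidence in its predicted label, each in [0, 1].
    correct: whether each prediction matched the gold label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Per-bin accumulators: number of items, summed confidence, summed hits.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0.0] * n_bins

    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += 1.0 if hit else 0.0

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, hit_sum in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = hit_sum / count
        # Weight each bin's confidence/accuracy gap by its share of the data.
        ece += (count / total) * abs(avg_conf - accuracy)
    return ece
```

A value near zero means the model's confidence tracks its accuracy closely; larger values indicate over- or under-confidence.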

Published

2025-05-30

How to Cite

Azurmendi, E., Arregi, X., & Lopez de Lacalle, O. (2025). Euskarazko lehen C1 ebaluatzaile automatikoa. IkerGazte. Nazioarteko Ikerketa Euskaraz, 3, 125–132. https://doi.org/10.26876/ikergazte.vi.03.15