Hizkuntzetarako neurona espezifikoak LLMetan? [Language-specific neurons in LLMs?]

Authors

  • Ixak Sarasua, Orai NLP Teknologiak
  • Xabier Saralegi, University of the Basque Country (UPV/EHU)

DOI:

https://doi.org/10.26876/ikergazte.vi.03.28

Keywords:

LLM, Basque, low-resource language, mechanistic interpretability, language-specific neurons

Abstract

Large Language Models (LLMs) are neural networks with billions of parameters that have transformed artificial intelligence. This study investigates language-specific neurons in LLMs, focusing on their adaptation to Basque. Using the Language Activation Probability Entropy (LAPE) metric, we identify neurons in Llama-3.1-8B and its Basque-adapted variant (Llama-eus-8B) that are specialized for Basque, French, Spanish, and English. Experiments reveal that language-specific neurons cluster predominantly in the final layers, with Basque showing the highest count. Perplexity analysis indicates that deactivating these neurons disproportionately degrades their target languages when those languages are not predominant in the model, validating their specificity. These findings highlight the interplay between language adaptation and neuron distribution, offering insights for optimizing LLMs for low-resource languages.
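The LAPE metric referenced in the abstract scores how concentrated a neuron's activity is across languages: for each neuron one estimates the probability that it activates on text in each language, normalizes those probabilities into a distribution, and takes its entropy, so a low score marks a language-specific neuron. The paper's exact formulation is not reproduced on this page, so the following is a minimal sketch of that idea; the function name `lape` and the example probabilities are illustrative assumptions, not values from the study.

```python
import numpy as np

def lape(activation_probs):
    """Language Activation Probability Entropy (LAPE) for one neuron.

    activation_probs: per-language activation probabilities, i.e.
    p_l = P(neuron fires on text in language l).
    Low entropy  -> the neuron fires mostly for one language (specific).
    High entropy -> the neuron fires similarly across languages (shared).
    """
    p = np.asarray(activation_probs, dtype=float)
    q = p / p.sum()              # normalize into a distribution over languages
    q = q[q > 0]                 # use the convention 0 * log 0 = 0
    return float(-(q * np.log(q)).sum())

# A neuron active almost only on one of four languages scores low ...
specific = lape([0.90, 0.02, 0.03, 0.02])
# ... while one equally active in all four scores high (ln 4 ≈ 1.386).
shared = lape([0.5, 0.5, 0.5, 0.5])
assert specific < shared
```

Ranking all neurons by this score and keeping the lowest-entropy ones per language would yield candidate language-specific neurons, which the study then probes by deactivating them and measuring the perplexity impact per language.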

Published

2025-05-30

How to Cite

Sarasua, I., & Saralegi, X. (2025). Hizkuntzetarako neurona espezifikoak LLMetan?. IkerGazte. Nazioarteko Ikerketa Euskaraz, 3, 227–233. https://doi.org/10.26876/ikergazte.vi.03.28