Molekulen propietateen iragarpena sare neuronalen bitartez datu-urritasun egoeretan (Molecular property prediction with neural networks in data-scarcity scenarios)
DOI:
https://doi.org/10.26876/ikergazte.vi.05.12
Keywords:
Recurrent Neural Network, SMILES, data scarcity
Abstract
We present a Recurrent Neural Network (RNN) that predicts molecular properties based solely on molecular structure. The model takes the SMILES representation of each molecular structure as input. In general, Artificial Neural Networks work well when plenty of data is available, but they perform poorly under data scarcity. In this work, we focus specifically on the data scarcity problem and analyze different approaches to tackle it. Our hypothesis is that training the model with similar data will improve the results. We analyze two distinct kinds of similarity: string similarity between the SMILES encodings, and similarity between the feature vectors.
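As an illustration of the two kinds of similarity mentioned in the abstract, the following minimal sketch ranks candidate molecules by how similar they are to a target. It is not the authors' implementation: the abstract does not name the metrics used, so the choice of `difflib.SequenceMatcher` for string similarity and cosine similarity for feature vectors is an assumption, and the function names and example molecules are hypothetical.

```python
from difflib import SequenceMatcher
from math import sqrt


def string_similarity(smiles_a, smiles_b):
    """Similarity of two SMILES strings in [0, 1] (assumed metric: difflib ratio)."""
    return SequenceMatcher(None, smiles_a, smiles_b).ratio()


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(target, candidates, similarity, k):
    """Return the k candidates most similar to the target, best first."""
    ranked = sorted(candidates, key=lambda c: similarity(target, c), reverse=True)
    return ranked[:k]


# Hypothetical example: pick extra training molecules close to ethanol ("CCO").
pool = ["CCCO", "c1ccccc1", "CCN", "CC(=O)O"]
selected = most_similar("CCO", pool, string_similarity, k=2)
print(selected)
```

The same `most_similar` helper works with feature vectors by passing `cosine_similarity` and vector-valued candidates. Under the stated hypothesis, the selected molecules would be added to the training set of the property being predicted.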
License
Copyright (c) 2025 IkerGazte. Nazioarteko ikerketa euskaraz

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
