Transfer learning for classifying Spanish and English text by clinical specialties

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

2 Scopus citations

Abstract

Transfer learning has demonstrated its potential in natural language processing tasks, where models are pre-trained on large corpora and then tuned to specific tasks. We applied pre-trained transfer learning models to a Spanish biomedical document classification task. The main goal is to analyze the performance of text classification by clinical specialties using state-of-the-art language models for Spanish, and to compare the results with those obtained using corresponding models in English and with the most important pre-trained model for the biomedical domain. The outcomes present interesting perspectives on the performance of language models that are pre-trained for a particular domain. In particular, we found that BioBERT achieved better results on Spanish texts translated into English than the general-domain model in Spanish and the state-of-the-art multilingual model.

Original language: English
Title of host publication: Public Health and Informatics
Subtitle of host publication: Proceedings of MIE 2021
Publisher: IOS Press
Pages: 377-381
Number of pages: 5
ISBN (Electronic): 9781643681856
ISBN (Print): 9781643681849
DOIs
State: Published - 01 Jul 2021

Keywords

  • Classification
  • Natural Language Processing
  • Spanish
  • Transfer learning
