Abstract
Extracting valuable knowledge from Electronic Health Records (EHR) is challenging because they combine structured and unstructured data, including codified fields, images and test results. Narrative text in particular comprises notes that vary widely in language and level of detail and are full of ad hoc terminology, including acronyms and jargon. This is especially problematic for non-English EHR, for which annotated corpora and training sets are scarce. This paper proposes an approach to named entity recognition (NER) and concept attribute labeling for EHR that uses the words surrounding the entity of interest to determine its sense. The approach combines three different NER methods with an analysis of the context (neighboring words) performed by an ensemble classification model. This helps to disambiguate recognized entities and to label each concept as confirmed, negated, speculative, pending or antecedent. Results show improved recall with a limited impact on precision for the NER process.
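
The pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the chapter's implementation: the three toy recognizers (`dictionary_ner`, `suffix_ner`, `acronym_ner`), their word lists, the union used to compose them, and the cue-word voters that stand in for the ensemble classification model are all hypothetical. Only the five labels come from the abstract.

```python
from collections import Counter

# Hypothetical cue words per label; "confirmed" is the default when no cue is found.
CUES = {
    "negated": {"no", "denies", "without"},
    "speculative": {"possible", "probable", "suspected"},
    "pending": {"pending", "awaiting"},
    "antecedent": {"history", "previous", "prior"},
}

def dictionary_ner(tokens):
    """Toy recognizer 1: lookup in a small, made-up lexicon."""
    lexicon = {"pneumonia", "fever"}
    return {i for i, t in enumerate(tokens) if t.lower() in lexicon}

def suffix_ner(tokens):
    """Toy recognizer 2: made-up morphological rule on word endings."""
    return {i for i, t in enumerate(tokens) if t.lower().endswith(("itis", "emia"))}

def acronym_ner(tokens):
    """Toy recognizer 3: made-up acronym list (case-sensitive)."""
    return {i for i, t in enumerate(tokens) if t in {"COPD", "MI"}}

def compose(tokens):
    """Compose the three recognizers by union (an assumption), which favors recall."""
    return sorted(dictionary_ner(tokens) | suffix_ner(tokens) | acronym_ner(tokens))

def cue_label(context):
    """Return the first label whose cue words appear in the context, else 'confirmed'."""
    words = {w.lower() for w in context}
    for label, cues in CUES.items():
        if words & cues:
            return label
    return "confirmed"

def classify(tokens, i, window=3):
    """Majority vote of three voters: left context, right context, and both."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    votes = [cue_label(left), cue_label(right), cue_label(left + right)]
    return Counter(votes).most_common(1)[0][0]

def label_entities(text):
    """Tokenize on whitespace, recognize entities, and label each one from its context."""
    tokens = [t.strip(".,;:") for t in text.split()]
    return [(tokens[i], classify(tokens, i)) for i in compose(tokens)]

if __name__ == "__main__":
    print(label_entities("Patient denies fever"))
    print(label_entities("History of bronchitis and COPD"))
```

In this sketch the union lets any single recognizer contribute an entity, which is one simple way recall could rise at some cost to precision. A real system would replace the cue-word voters with trained classifiers over the neighboring words.
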
Original language | English |
---|---|
Title of host publication | Data Analytics in Medicine |
Subtitle of host publication | Concepts, Methodologies, Tools, and Applications |
Publisher | IGI Global |
Pages | 325-339 |
Number of pages | 15 |
Volume | 1 |
ISBN (electronic) | 9781799812050 |
ISBN (print) | 9781799812043 |
DOI | |
Publication status | Published - 6 Dec 2019 |