A novel corpus of children’s impaired speech

Research output: Chapter in book/report/conference proceeding, Conference contribution, peer-reviewed

Abstract

This paper presents the acquisition, evaluation and baseline Automatic Speech Recognition (ASR)
experiments of a novel corpus containing speech from a set of impaired and unimpaired young
speakers. A group of 14 speakers with different speech disorders uttered several sessions over a
57-word vocabulary in Spanish, yielding more than 3 hours of speech. In addition, a parallel
corpus of speech from unimpaired young speakers has been recorded, with more than 6 hours of
speech over the same vocabulary. The impaired speech corpus has been evaluated through manual
labeling to detect the mispronunciations made by the speakers, and the outcome of this work
shows that 17.31% of the phonemes were either mispronounced or deleted in an isolated word
task. A baseline evaluation of a state-of-the-art ASR system shows a Word Error Rate (WER) of
35.02% when using speaker-independent models based on adult speech. This WER is reduced to
27.60% using models based on children's speech and further reduced to 15.35% using
speaker-dependent models. Finally, experiments on connected speech show how ASR performance
degrades for 4 impaired speakers in the transition from isolated words to connected speech,
due to the language impairments of the speakers and the coarticulation in connected speech.
Original language: English
Title of host publication: Proceedings of the 2008 Workshop on Children, Computer and Interaction, Chania, Greece
Status: Published - 2008
Published externally
