A novel corpus of children’s impaired speech

  • University of Zaragoza

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents the acquisition, evaluation, and baseline Automatic Speech Recognition (ASR) experiments of a novel corpus containing speech from impaired and unimpaired young speakers. A group of 14 speakers with different speech disorders recorded several sessions over a 57-word Spanish vocabulary, yielding more than 3 hours of speech. In addition, a parallel corpus of more than 6 hours of speech from unimpaired young speakers has been recorded with the same vocabulary. The impaired speech corpus has been evaluated through manual labeling to detect the mispronunciations made by the speakers; the outcome shows that 17.31% of the phonemes were either mispronounced or deleted in an isolated word task. A baseline evaluation of a state-of-the-art ASR system shows a Word Error Rate (WER) of 35.02% when using speaker-independent models based on adult speech. This WER is reduced to 27.60% using models based on children's speech and further reduced to 15.35% using speaker-dependent models. Finally, experiments on connected speech show how ASR performance degrades for 4 impaired speakers in the transition from isolated words to connected speech, due to the language impairments of the speakers and the coarticulation in connected speech.
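
For readers unfamiliar with the metric, the following is a minimal sketch of how a Word Error Rate is conventionally computed (word-level edit distance divided by the number of reference words) and of the relative reductions implied by the figures quoted in the abstract. The function names `word_error_rate` and `relative_reduction` are illustrative assumptions; the abstract does not state which scoring tool the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: (substitutions + deletions + insertions) / reference words.

    Illustrative helper, not the authors' scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# WER figures quoted in the abstract (percent).
adult_si, child_si, speaker_dep = 35.02, 27.60, 15.35
print(f"adult -> children models: {relative_reduction(adult_si, child_si):.1%}")
print(f"children -> speaker dependent: {relative_reduction(child_si, speaker_dep):.1%}")
```

In an isolated-word task each utterance has a one-word reference, so this measure reduces to the fraction of misrecognized words when the recognizer outputs exactly one word per utterance.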

Original language: English
Title of host publication: Proceedings of the 2008 Workshop on Children, Computer and Interaction, Chania, Greece
State: Published - 2008
Externally published: Yes
