Performance of a SCFG-based language model with training data sets of increasing size

Joan Andreu Sánchez, José Miguel Benedí, Diego Linares

Research output: Contribution to a journal › Conference article › peer review

1 Citation (Scopus)

Abstract

In this paper, a hybrid language model which combines a word-based n-gram and a category-based Stochastic Context-Free Grammar (SCFG) is evaluated for training data sets of increasing size. Different estimation algorithms for learning SCFGs in General Format and in Chomsky Normal Form are considered. Experiments on the UPenn Treebank corpus are reported. These experiments have been carried out in terms of the test set perplexity and the word error rate in a speech recognition experiment.
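The abstract describes combining a word-based n-gram with a category-based SCFG and evaluating by test-set perplexity. As a hedged illustration (not the paper's actual method), a common way to combine two language models is linear interpolation of their per-word probabilities; the weight and the toy probabilities below are purely illustrative:

```python
import math

def interpolate(p_ngram, p_scfg, lam=0.6):
    """Combine two per-word probabilities by linear interpolation.

    lam is an illustrative mixing weight, not a value from the paper.
    """
    return lam * p_ngram + (1.0 - lam) * p_scfg

def perplexity(word_probs):
    """Test-set perplexity: exp of the average negative log-probability."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

# Toy per-word probabilities from each component model for a 4-word test string.
p_ngram = [0.20, 0.05, 0.10, 0.30]
p_scfg = [0.10, 0.15, 0.08, 0.25]

combined = [interpolate(a, b) for a, b in zip(p_ngram, p_scfg)]
print(perplexity(combined))  # perplexity of the interpolated model
```

A lower perplexity on held-out text indicates the combined model assigns higher probability to the test data than either component might alone; how the SCFG component's per-word probabilities are obtained depends on the estimation algorithm used.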

Original language: English
Pages (from-to): 586-594
Number of pages: 9
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3523
Issue: II
DOI
Status: Published - 2005
Event: Second Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2005 - Estoril, Portugal
Duration: 7 Jun 2005 - 9 Jun 2005
