We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end. The voicing feature front end parameters are optimized for recognition accuracy. The voicing features computed are the normalized autocorrelation peak and a newly proposed entropy of the high-order cepstrum. We explored several alternatives to integrate the voicing features into SRI's DECIPHER system. Promising early results were obtained in a simple system concatenating the voicing features with MFCC features and optimizing the voicing feature window duration. Best results overall came from a more complex system combining a multiframe voicing feature window with the MFCC plus third differential features using linear discriminant analysis and optimizing the number of voicing feature frames. The best integration approach from the single-pass system experiments was implemented in a multi-pass system for large vocabulary testing on the Switchboard database. An average WER reduction of 2% relative was obtained on the NIST Hub-5 dev2001 and eval2002 databases.

This dissertation is available online on the website of the university library (Hochschulbibliothek).

To my parents, Ildikó and Gábor

Acknowledgments

First of all I would like to thank my supervisor, Prof. Hermann Ney, head of the Chair of Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik VI, at RWTH Aachen University, for his support and his interest.
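The normalized autocorrelation peak named among the voicing features above can be illustrated with a short sketch. The text does not give the exact definition used in the experiments, so this follows one common formulation as an assumption: the autocorrelation of a mean-removed frame, divided by the frame energy, maximized over lags in a plausible pitch range. The function name, the 50 to 400 Hz search range, and the test signals are all hypothetical choices made for illustration.

```python
import math
import random


def normalized_autocorrelation_peak(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Return the maximum of r(lag) / r(0) over lags in the pitch range.

    Values near 1 suggest a periodic (voiced) frame; values near 0
    suggest an aperiodic (unvoiced) or silent frame.
    NOTE: an assumed, common definition, not necessarily the one
    used in the system described in the text.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)         # r(0)
    if energy == 0.0:
        return 0.0                         # silent frame: no voicing evidence

    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n - 1, int(sample_rate / f_min))

    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        best = max(best, r / energy)
    return best


# Illustrative frames: 50 ms at 8 kHz (hypothetical settings).
SAMPLE_RATE = 8000
FRAME_LEN = 400

# A 100 Hz sine stands in for a voiced frame.
voiced = [math.sin(2.0 * math.pi * 100.0 * i / SAMPLE_RATE) for i in range(FRAME_LEN)]

# Seeded Gaussian noise stands in for an unvoiced frame.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

voiced_peak = normalized_autocorrelation_peak(voiced, SAMPLE_RATE)
noise_peak = normalized_autocorrelation_peak(noise, SAMPLE_RATE)
silent_peak = normalized_autocorrelation_peak([0.0] * FRAME_LEN, SAMPLE_RATE)
```

In the simplest integration described above, a per-frame value like this would be appended to the MFCC vector for that frame. Because this sketch does not compensate for the shrinking overlap at larger lags, even a perfectly periodic frame scores somewhat below 1.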