Vom Sprachsignal zum Wort/From the Speech Signal to the Word

Henning Reetz

Mapping the continuous and highly variable speech signal onto discrete words in memory is not a trivial problem for either automatic or human speech perception. The course will cover methods used in today's automatic speech recognition systems, such as hidden Markov models (HMMs) and artificial neural networks (ANNs), which rely on stochastic models and can operate without phonetic or linguistic knowledge. In addition, the course will introduce alternative knowledge-based approaches that are partly oriented towards the human processes of speech perception and speech production.

The relevant material is introduced without assuming prior mathematical or technical knowledge, and the issues will be covered in non-mathematical language.

Literature

Stochastic methods:
Becchetti, C. and L. P. Ricotti (1999)
Speech Recognition - Theory and C++ Implementation. Chichester: John Wiley & Sons.

De Mori, R. (Ed.) (1998)
Spoken Dialogues with Computers. London: Academic Press.

Jelinek, F. (1997)
Statistical Methods for Speech Recognition. Cambridge: MIT Press.

O'Shaughnessy, D. (2000)
Speech Communications: Human and Machine. Piscataway: IEEE Press.


Knowledge-based approaches:
Ainsworth, W. A. (1997)
"Some approaches to automatic speech recognition." In W. J. Hardcastle and J. Laver (Eds.), The Handbook of Phonetic Sciences. Oxford: Blackwell: 721-743.

Fohr, D., J.-P. Haton, and Y. Laprie (1994)
"Knowledge-based techniques in acoustic-phonetic decoding of speech: interest and limitations." International Journal of Pattern Recognition and Artificial Intelligence 8: 133-153.

Glass, J. R. and V. W. Zue (1994)
"Speech recognition, automatic: knowledge-based methods." In R. E. Asher and J. M. Y. Simpson (Eds.), The Encyclopedia of Language and Linguistics. Oxford: Pergamon Press, Vol. 8: 4231-4241.