CSE542 will cover three main topics, each over a 4-week period:
- Introduction to the collection and analysis of speech data for speech processing:
Includes a brief introduction to corpus linguistics. Students will learn the range and types of spoken language collections and how to analyze speech data using the Praat tool.
- Introduction to speech recognition:
Students will learn basic technologies for speech recognition, using the Hidden Markov Model Toolkit (HTK).
- Introduction to concatenative text-to-speech synthesis:
Students will learn the basics of text-to-speech synthesis (TTS), as well as current technologies for concatenative TTS. The TTS system Festival (or its Java version, FreeTTS) will be used.
- The integration of speech recognition and TTS into other technologies (for example, by means of VoiceXML and/or the speech SDKs under development by Microsoft, Sun (Java), and IBM) will also be discussed.