Speech recognition systems do not normally make use of the information signalled by prosody, i.e. the segment duration and the fundamental frequency contour of the speech signal. Rather, in current statistical approaches to the speech recognition problem, the acoustic manifestations of prosody are more or less regarded as disturbances. In more advanced applications of speech recognition, such as speech-to-speech translation systems, it is clear that the information conveyed by prosody has to be detected in the source language, mapped onto the target language and then generated by the speech synthesizer of the target language. The linguistic information signalled by prosody comprises syntactic structure, semantic interpretation and sentence emphasis. Moreover, in languages with tonal accents, such as Swedish, there are word and phrase pairs that can be distinguished only by their intonation contours. In pure tone languages, the inclusion of prosody is crucial for speech recognition systems. Besides syntactic and semantic information, prosody also reflects paralinguistic properties such as the speaker's sex and attitude. Speech-to-speech translation systems that do not transfer this type of information will be of limited value for person-to-person communication.