Development of an Amharic Text-to-Speech System Using Cepstral Method

Publication Type: Conference Paper
Year of Publication: 2009
Authors: Anberbir, T., and Takara Tomio
Booktitle: Proceedings of the First Workshop on Language Technologies for African Languages (AfLaT 2009)
Publisher: Association for Computational Linguistics
Location: Athens, Greece
Editor: De Pauw, Guy, de Schryver Gilles-Maurice, and Levin Lori

This paper presents a speech synthesis system for the Amharic language and describes how the important prosodic features of the language were modeled in the system. The developed Amharic Text-to-Speech system (AmhTTS) is a parametric, rule-based system that employs a cepstral method. It uses a source-filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The intelligibility and naturalness of the system were evaluated by word and sentence listening tests, respectively: we achieved a 98% correct rate for words and an average Mean Opinion Score (MOS) of 3.2 (which is categorized as good) for sentences. The synthesized speech has high intelligibility and moderate naturalness. Compared with a previous similar study, our system produced speech of similar quality with fairly good prosody. In particular, our system is suitable for building systems for new languages with little modification.
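
The two evaluation figures reported above, the word correct rate and the Mean Opinion Score, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the romanized placeholder words, and the ratings are all hypothetical, and the sketch assumes exact string matching for word responses and the conventional 1 to 5 MOS rating scale.

```python
from statistics import mean


def word_correct_rate(references, responses):
    """Percentage of reference words that a listener reproduced exactly."""
    if not references or len(references) != len(responses):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(ref == resp for ref, resp in zip(references, responses))
    return 100.0 * correct / len(references)


def mean_opinion_score(ratings):
    """Average of listener ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be non-empty and lie between 1 and 5")
    return mean(ratings)


# Hypothetical listening-test data, NOT taken from the paper.
reference_words = ["selam", "bet", "wuha", "tena"]
listener_words = ["selam", "bet", "wuha", "tela"]  # one word misheard
sentence_ratings = [3, 4, 3, 3, 3]                 # one rating per listener

print(word_correct_rate(reference_words, listener_words))
print(mean_opinion_score(sentence_ratings))
```

In an actual listening test the scores would typically be averaged over many listeners and many stimuli, and word matching might need normalization of spelling variants; both are left out of this sketch.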

