Modeling Improved Syllabification Algorithm for Amharic

Publication Type: Conference Paper
Year of Publication: 2011
Authors: Hailu, Nirayo, and Hailemariam Sebsibe
Booktitle: AGIS11 - Action Week for Global Information Sharing (AfLaT2011 Breakout Session)
Location: Addis Ababa, Ethiopia

We report thesis work titled Modeling Improved Syllabification Algorithm. In this work, a rule-based automatic syllabification algorithm for the Amharic language is designed from linguistic notions, following the Maximal Onset and Sonority Hierarchy principles. Amharic, spoken in Ethiopia in eastern Africa, is one of the most dominant Semitic languages after Arabic. Its writing system is syllabic: every grapheme represents a consonant-vowel (CV) combination. However, when an Amharic text is read aloud, not all of the CV syllables are uttered as written, so the syllables of the spoken text do not match the CV sequence seen in the grapheme sequence. Epenthesis and gemination are major challenges in Amharic grapheme-to-phoneme conversion because Amharic orthography does not mark epenthetic vowels or geminated consonants. This limits the performance of many Amharic speech systems (such as Text-To-Speech and Automatic Speech Recognition) and other natural language applications. After a thorough study of the syllable structure, identification of linguistic syllabification rules, and a survey of the relevant literature, a set of rules was identified and used to design the syllabification algorithm. The system was implemented and tested on carefully selected Amharic words. It achieved an overall word accuracy of 98.1%, with very high sensitivity to the insertion of epenthetic vowels.
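The abstract does not list the rules themselves, so the following is only a minimal sketch of how the Maximal Onset and Sonority Hierarchy principles can be combined in general; it is not the authors' algorithm. The sonority scale, the romanized phoneme symbols, the `max_onset` limit of one consonant, and the function name `syllabify` are all assumptions made for illustration. Epenthesis and gemination, which the paper identifies as the hard part, are not handled here.

```python
# Illustrative sonority scale (an assumption, not taken from the paper):
# higher numbers are more sonorous.
SONORITY = {
    **dict.fromkeys("ptkbdgq", 1),  # stops
    **dict.fromkeys("fszh", 2),     # fricatives
    **dict.fromkeys("mn", 3),       # nasals
    **dict.fromkeys("lr", 4),       # liquids
    **dict.fromkeys("wy", 5),       # glides
    **dict.fromkeys("aeiou", 6),    # vowels
}
VOWELS = set("aeiou")


def syllabify(phonemes, max_onset=1):
    """Split a romanized phoneme string into syllables.

    Maximal Onset: consonants between two vowels are given to the
    following syllable as its onset, up to max_onset consonants.
    Sonority Hierarchy: a consonant joins the onset only if sonority
    keeps rising toward the vowel.
    Every symbol must appear in SONORITY (hypothetical inventory).
    """
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return ["".join(phonemes)]

    boundaries = []  # start index of every non-initial syllable
    for prev, nxt in zip(nuclei, nuclei[1:]):
        start = nxt  # grow the onset leftwards from the nucleus
        while (
            start - 1 > prev                       # stay right of previous vowel
            and nxt - (start - 1) <= max_onset     # onset size limit
            and SONORITY[phonemes[start - 1]] < SONORITY[phonemes[start]]
        ):
            start -= 1
        boundaries.append(start)

    edges = [0] + boundaries + [len(phonemes)]
    return ["".join(phonemes[a:b]) for a, b in zip(edges, edges[1:])]


# Toy romanized inputs, chosen only to exercise the rules.
print(syllabify("selam"))              # ['se', 'lam']
print(syllabify("mesker"))             # ['mes', 'ker']
print(syllabify("arba", max_onset=2))  # ['ar', 'ba']
```

With `max_onset=1` the sonority check is trivially satisfied, since any single consonant is less sonorous than a vowel; the third call shows the check at work, where `r` cannot precede `b` in an onset because sonority would fall toward the nucleus.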