Human Language Technologies for Ethiopian Languages: Challenges and Future Directions

Title: Human Language Technologies for Ethiopian Languages: Challenges and Future Directions
Publication Type: Conference Paper
Year of Publication: 2011
Authors: Abate, Solomon Teferra, Ephrem Binyam, Yifru Enchalew, Tilahun Kassa, Hagos Lemlem, Abubeker Mohammed-hussen, and Girma Taye
Booktitle: AGIS11 - Action Week for Global Information Sharing (AfLaT2011 Breakout Session)
Location: Addis Ababa, Ethiopia

Research on human language technologies (HLT) for Ethiopian languages started recently, in the late 1990s, and most of the work has been carried out in partial fulfillment of the requirements for a degree. Most of it has not been published in international venues. The attempts are nevertheless encouraging and valuable for the development of Ethiopian HLT, covering many HLT areas such as Optical Character Recognition (OCR), Text-to-Speech, Automatic Speech Recognition, Spell Checker, POS Tagging, Stemmer, Morphology, Parser, Thesaurus, Machine Translation, Summarization, Categorization, and Extraction. However, they are very limited in language coverage and research depth. Out of more than 80 Ethiopian languages, only 3-5 have been considered. With regard to research depth, most of the attempts are prototype systems, and none has reached the level of benchmark evaluation.

A closer look at the research attempts made locally (by Master's students) and abroad (published in various proceedings and journals) reveals the challenges that have hindered the development of HLT for Ethiopian languages. To mention a few: lack of language resources; insufficient knowledge for either technological or linguistic analysis; lack of integration (research results in one area, e.g. POS tagging, are not applied in research in another area, e.g. parsing, which could itself be applied in other areas such as Machine Translation); lack of consolidation and continuity of the research attempts; absence of a national research plan to serve as an HLT road map; and lack of sustainable and coordinated research funding.

In this talk, we will present a review of previous research attempts and the experiences of different countries (where HLT is well developed and has started to contribute to development), and propose strategies for the advancement of HLT research in Ethiopian languages as well as its contribution to the development of the country.