A Number to Yorùbá Text Transcription System

Submitted by Guy on Fri, 2011-12-09 08:03

Title	A Number to Yorùbá Text Transcription System
Publication Type	Conference Paper
Year of Publication	2011
Authors	Olawale, Akinade Olugbenga, and Odejobi Odetunji Ajadi
Booktitle	AGIS11 - Action Week for Global Information Sharing (AfLaT2011 Breakout Session)
Location	Addis Ababa, Ethiopia
Abstract	An important task in high-level text-to-speech synthesis is text normalization, in which textual anomalies such as symbols, numerals and abbreviations are expanded into their textual forms. The expansion of numbers is a key task in such applications because numerals, mostly written as Arabic numerals, occur frequently in texts. In this talk we shall discuss our research, in which we examined the knowledge and computation underlying the expansion of cardinal numbers into their Standard Yoruba (SY) textual equivalents. Furthermore, we shall discuss the design and implementation of software for realizing the task. We shall show that generating the lexical equivalents for the SY numeral system, which is generally vigesimal (based on 20), requires a combination of mathematical skills and linguistic competence. The computational skills required for this task include the ability to order, add, subtract and multiply through a system of contraction, elision and euphonic assimilation. We shall also discuss how the computational model was formulated using formal methods and implemented using JFLAP (Java Formal Language and Automata Package) and the Python programming language. The software will be demonstrated, and attempts to extend it to mobile and web-based applications will be discussed.
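
To make the arithmetic in the abstract concrete, the following minimal Python sketch splits a cardinal number into a multiple of twenty plus small additive or subtractive adjustments. It is an illustration under stated assumptions and not the authors' model: the function names `decompose` and `describe` are hypothetical, the range is limited to 1..199, units 1-4 are assumed to be added to the tens below while units 5-9 are subtracted from the tens above, and odd multiples of ten are written as the next multiple of twenty minus ten. The sketch produces arithmetic only. It emits no SY words, ignores contraction, elision and assimilation, and does not account for numbers that have their own dedicated words.

```python
def decompose(n):
    """Split n (1..199) into (twenties, tens_adjust, unit_adjust).

    The result satisfies n == 20 * twenties + tens_adjust + unit_adjust,
    with tens_adjust in {0, -10} and unit_adjust in -5..4.
    Illustrative assumption only; not the model described in the paper.
    """
    if not 1 <= n <= 199:
        raise ValueError("this sketch only covers 1..199")
    r = n % 10
    # Assumed rule: units 1-4 are added, units 5-9 are subtracted
    # from the next multiple of ten.
    unit_adjust = r if r <= 4 else r - 10
    tens = n - unit_adjust
    # Assumed rule: an odd multiple of ten is the next multiple
    # of twenty minus ten.
    twenties = (tens + 10) // 20
    tens_adjust = tens - 20 * twenties
    return twenties, tens_adjust, unit_adjust


def describe(n):
    """Render the decomposition of n as an arithmetic expression string."""
    twenties, tens_adjust, unit_adjust = decompose(n)
    parts = []
    if twenties:
        parts.append(f"20*{twenties}")
    if tens_adjust:
        parts.append("- 10")
    if unit_adjust > 0:
        parts.append(f"+ {unit_adjust}")
    elif unit_adjust < 0:
        parts.append(f"- {-unit_adjust}")
    # Drop a leading "+ " when there is no multiple of twenty in front.
    return " ".join(parts).lstrip("+ ")


if __name__ == "__main__":
    for n in (3, 15, 21, 45):
        print(n, "=", describe(n))
```

Under these assumptions, `describe(45)` returns `"20*3 - 10 - 5"`, which shows multiplication and two subtractions combined in a single number. A full transcription system would still need a lexicon and the phonological rules that the abstract mentions.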
