Central Africa
Second Workshop on African Language Technology (AfLaT 2010) - Report
Submitted by Guy on Thu, 2010-08-05 08:33
Martin Benjamin and Kevin Scannell on localisation and Central African languages
Submitted by Guy on Fri, 2010-04-23 06:45
Podcast on localisation and Central African languages
Newsletter October 2009
Submitted by Guy on Mon, 2009-10-19 14:48
Dear AfLaT member,
we would like to draw your attention to some recent highlights in African Language Technology.
(1) Google announced the availability of their Swahili Machine Translation system
http://aflat.org/?q=node/346
(2) CALL FOR ABSTRACTS for the National Human Language Technology Network Day 2010
http://aflat.org/?q=node/345
(3) MURI Call for White Papers: "Structured Modeling for Low-Density Languages"
http://aflat.org/?q=node/347
Several new publications and links have also been added recently. We hope to see you soon at http://AfLaT.org!
with kind regards
the AfLaT team
Funding opportunity: MURI Call for White Papers
Submitted by Guy on Mon, 2009-10-19 14:14
Funding opportunity: MURI Call for White Papers, DUE Dec 11
BROAD AGENCY ANNOUNCEMENT (BAA)
Call for White Papers
DEADLINE: Dec. 11, 2009
Multidisciplinary University Research Initiative (MURI)
topic # 25: "Structured Modeling for Low-Density Languages"
Lingala Diacritic Correction
Submitted by Guy on Mon, 2009-04-27 12:45
The problem with using web texts for many languages is that many people do not have the ability to enter diacritics or characters from extended
character sets from their keyboard, and so they simply leave them off (and so Lingala "likɔngá" becomes "likonga"), or use incorrect diacritics
(e.g. using acute/grave in place of the macron in Hawaiian, Maori, etc.). Information is usually lost in this conversion, and indeed most languages
that use diacritics have many pairs of words that differ only in diacritical marks; these distinctions are lost when texts are written in plain ASCII.
The aim of charlifter is to undo this lossy conversion, and "lift" the text back to its proper form.
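The lossy conversion and its reversal can be illustrated with a short Python sketch. This is not charlifter's actual implementation: it assumes a simple word-level lookup that maps each plain-ASCII form to the most frequent accented form seen in a reference word list. The function names, the mapping table for ɔ and ɛ, and the one-word sample list are all hypothetical and for illustration only.

```python
import unicodedata
from collections import Counter, defaultdict

# Extended letters with no Unicode decomposition (illustrative, not exhaustive).
EXTENDED_TO_ASCII = str.maketrans({"ɔ": "o", "ɛ": "e", "Ɔ": "O", "Ɛ": "E"})


def strip_diacritics(text):
    """Simulate the lossy conversion: drop combining marks, map extended letters."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.translate(EXTENDED_TO_ASCII)


def build_lexicon(corpus_words):
    """Map each stripped form to its most frequent fully marked form."""
    counts = defaultdict(Counter)
    for word in corpus_words:
        word = unicodedata.normalize("NFC", word)
        counts[strip_diacritics(word)][word] += 1
    return {plain: forms.most_common(1)[0][0] for plain, forms in counts.items()}


def lift(text, lexicon):
    """Restore known words; leave unknown words unchanged."""
    return " ".join(lexicon.get(word, word) for word in text.split())


# Toy reference list containing only the example word from the text above.
lexicon = build_lexicon(["likɔngá"])
restored = lift("likonga", lexicon)
```

A lookup like this cannot choose between two accented words that share the same plain form; resolving those cases needs the surrounding context.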
Web Site Flore
Submitted by Enguehard on Mon, 2008-11-03 16:52
This web site presents the names of African trees in different languages.
Today it lists 375 names in 17 languages.
The language of communication is French, but browsing is very simple.
Unicode-Afrique
Submitted by donosborn on Sun, 2008-03-16 02:19
Unicode-Afrique is a forum on Yahoo Groups. It exists to: publicise projects in Africa that use Unicode; discuss practical questions and problems with Unicode and character sets for African languages; and share useful experience in developing and using Unicode fonts for African languages. This e-group is part of a "family" of discussion forums on the meeting of African languages and ICT (the other forums can be reached from the "A12n" portal page, which is linked at the bottom of this page).
Corpora for African languages - An Crúbadán
Submitted by scannell on Thu, 2008-02-07 04:42
The Crúbadán Project is devoted to creating basic language technology for minority languages and under-resourced languages using web-crawling and statistical techniques. As of early 2008 we have collected text corpora for 419 languages, including more than 125 African languages, and have used these to create open source spell checkers for more than 20 languages. Please contact Kevin Scannell (http://borel.slu.edu/) if you are interested in developing open source resources for other African languages using these data.
Automatic Diacritic Restoration for African Languages
Submitted by Guy on Tue, 2007-10-23 12:16
This is a demonstration system for a diacritic restoration method that is able to automatically restore diacritics on the basis of local graphemic context. It is based on the machine learning method of Memory-Based learning. We have applied the method to the African languages of Cilubà, Gĩkũyũ, Kĩkamba, Maa, Sesotho sa Leboa, Tshivenḓa and Yoruba.
You can find more information on this system in this paper
Authors:
Guy De Pauw: CNTS - Language Technology Group, University of Antwerp, Antwerp, Belgium, guy [dot] depauw [at] ua [dot] ac [dot] be
Gilles-Maurice de Schryver: African Languages and Cultures, Ghent University, Ghent, Belgium, gillesmaurice [dot] deschryver [at] ugent [dot] be
Peter Waiganjo Wagacha: School of Computing and Informatics, University of Nairobi, Nairobi, Kenya, waiganjo [at] uonbi [dot] ac [dot] ke
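The idea of restoring diacritics from local graphemic context can be sketched in Python as follows. This is not the authors' system: it is a plain nearest-neighbour lookup with an unweighted overlap metric, and the function names, the window width of two characters and the one-word training list are assumptions made for illustration. Every character of the training words is stored together with its stripped neighbours, and a new character receives the diacritic form of the stored instance whose context matches best.

```python
import unicodedata
from collections import Counter


def graphemes(word):
    """Split a word into base characters, each with its combining marks."""
    clusters = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def windows(plain, width=2):
    """Return, for each character, a tuple of it and `width` neighbours per side."""
    padded = "_" * width + plain + "_" * width
    return [tuple(padded[i:i + 2 * width + 1]) for i in range(len(plain))]


def train(words, width=2):
    """Store every (stripped context, marked character) pair in memory."""
    memory = []
    for word in words:
        clusters = graphemes(word)
        plain = "".join(cluster[0] for cluster in clusters)
        targets = [unicodedata.normalize("NFC", cluster) for cluster in clusters]
        memory.extend(zip(windows(plain, width), targets))
    return memory


def restore(plain_word, memory, width=2):
    """Give each character the form of its best-matching stored context."""
    restored = []
    for context in windows(plain_word, width):
        centre = context[width]
        best_score, votes = -1, Counter()
        for stored, target in memory:
            if stored[width] != centre:
                continue  # only compare instances of the same base character
            score = sum(a == b for a, b in zip(stored, context))
            if score > best_score:
                best_score, votes = score, Counter([target])
            elif score == best_score:
                votes[target] += 1
        restored.append(votes.most_common(1)[0][0] if votes else centre)
    return "".join(restored)


# Toy memory built from a single language name taken from the text above.
memory = train(["Gĩkũyũ"])
restored_word = restore("Gikuyu", memory)
```

Because the decision is made per character, this approach can also handle words that never occurred in the training data, which a word-level lookup cannot do.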
