Northern Sotho

Base Concepts in the African Languages Compared to Upper Ontologies and the WordNet Top Ontology

Anderson, Winston, Laurette Pretorius, and Albert E. Kotzé. "Base Concepts in the African Languages Compared to Upper Ontologies and the WordNet Top Ontology." Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10). Eds. Nicoletta Calzolari, et al. Valletta, Malta: European Language Resources Association (ELRA), 2010.

Automatic Diacritic Restoration for African Languages

The orthography of many African languages includes diacritically marked characters. Falling outside the scope of the standard Latin encoding, these characters are often represented in digital language resources as their unmarked equivalents. This renders corpus compilation more difficult, as these languages typically do not have the benefit of large electronic dictionaries to perform diacritic restoration.

This is a demonstration system for a diacritic restoration method that automatically restores diacritics on the basis of local graphemic context. It is based on memory-based learning, a machine learning method. We have applied the method to the African languages Cilubà, Gĩkũyũ, Kĩkamba, Maa, Sesotho sa Leboa, Tshivenḓa and Yoruba.
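To make the idea of restoring diacritics from local graphemic context concrete, here is a minimal sketch in Python. It is not the authors' implementation. The window width, the feature layout, the simple overlap-based nearest-neighbour lookup and all function names are assumptions chosen for illustration. The training words are taken from the Northern Sotho example sentence on this page. The sketch also assumes that every marked letter exists as a single precomposed Unicode character.

```python
import unicodedata
from collections import Counter


def strip_char(ch):
    """Return the unmarked base letter of one character (e.g. 'š' -> 's')."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def features(plain_word, i, width=2):
    """Window of graphemes around position i, padded with '_' at the edges."""
    padded = "_" * width + plain_word + "_" * width
    return tuple(padded[i : i + 2 * width + 1])


def train(words, width=2):
    """Store one (context window, marked character) instance per letter."""
    memory = []
    for word in words:
        word = unicodedata.normalize("NFC", word)
        plain = "".join(strip_char(ch) for ch in word)
        for i, marked in enumerate(word):
            memory.append((features(plain, i, width), marked))
    return memory


def restore(plain_word, memory, width=2):
    """Pick, for each letter, the label of the most similar stored contexts."""
    restored = []
    for i, ch in enumerate(plain_word):
        feats = features(plain_word, i, width)
        # Only instances with the same focus letter are candidates.
        scored = [
            (sum(a == b for a, b in zip(feats, stored)), label)
            for stored, label in memory
            if stored[width] == ch
        ]
        if not scored:
            restored.append(ch)  # unseen letter: leave it unchanged
            continue
        best = max(score for score, _ in scored)
        votes = Counter(label for score, label in scored if score == best)
        restored.append(votes.most_common(1)[0][0])
    return "".join(restored)


# Toy memory built from words in the example sentence on this page.
memory = train(["swanetše", "mmotše", "tseba", "tsebago", "dumela"])
print(restore("swanetse", memory))  # swanetše
print(restore("tseba", memory))     # tseba
```

The same letter sequence "tse" is restored differently in the two words because the surrounding graphemes differ. A realistic system would need far more training data and a tuned similarity metric.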

You can find more information on this system in this paper.

Select a language and enter the word or sentence you want to restore diacritics for.
Cilubà (e.g. mutekete)
Gĩkũyũ (e.g. nituronire)
Kĩkamba (e.g. ningulilikana)
Maasai (e.g. oltunani)
Sesotho sa Leboa (Northern Sotho) (e.g. swanetse)
Tshivenḓa (e.g. tshiswitulo)
Yoruba (e.g. isinku)
 


Authors:
Guy De Pauw: CNTS - Language Technology Group, University of Antwerp, Antwerp, Belgium, guy [dot] depauw [at] ua [dot] ac [dot] be
Gilles-Maurice de Schryver: African Languages and Cultures, Ghent University, Ghent, Belgium, gillesmaurice [dot] deschryver [at] ugent [dot] be
Peter Waiganjo Wagacha: School of Computing and Informatics, University of Nairobi, Nairobi, Kenya, waiganjo [at] uonbi [dot] ac [dot] ke

Northern Sotho Part-of-Speech Tagger (V2) - Demo

This demo showcases a part-of-speech tagger for Northern Sotho. It retrieves the morpho-syntactic categories of the words in a sentence. It uses MBT, a memory-based tagger, trained on a relatively small annotated corpus.

Version 1: October 10, 2007 (20k tokens training set)
Version 2: December 8, 2007 (35k tokens training set)


Type in the text you want to tag (2,500 character limit)
Example: Motho ge a sa tseba o swanetše go dumela seo gore bao ba tsebago ba mmotše.


Authors:

Guy De Pauw: CNTS - Language Technology Group, University of Antwerp, Antwerp, Belgium, guy [dot] depauw [at] ua [dot] ac [dot] be
Gilles-Maurice de Schryver: African Languages and Cultures, Ghent University, Ghent, Belgium, gillesmaurice [dot] deschryver [at] ugent [dot] be

Paper

Verbal extension sequencing: An examination from a computational perspective

Anderson, Winston, and Albert E. Kotzé. "Verbal extension sequencing: An examination from a computational perspective." 14th International Conference of the African Language Association of Southern Africa. African Language Studies: Towards Sustainable Development. Nelson Mandela Metropolitan University, Port Elizabeth, Eastern Cape Province, South Africa, 2007.

Morpheme sequencing of the verbal element in Northern Sotho with emphasis on non-concord morphemes: A computational perspective

Anderson, Winston, and Petronella M. Kotzé. "Morpheme sequencing of the verbal element in Northern Sotho with emphasis on non-concord morphemes: A computational perspective." 14th International Conference of the African Language Association of Southern Africa. African Language Studies: Towards Sustainable Development. Nelson Mandela Metropolitan University, Port Elizabeth, Eastern Cape Province, South Africa, 2007.

Sounds like ‘Sutu’

Anderson, Winston, and Albert E. Kotzé. "Sounds like ‘Sutu’." South African Journal of African Languages. 25.2 (2005): 111-123.

Online Explanatory Northern Sotho Dictionary

Description: An online explanatory dictionary for Northern Sotho.
