
Northern Sotho Part-of-Speech Tagger (V2) - Demo

This demo showcases a part-of-speech tagger for Northern Sotho. It assigns a morpho-syntactic category to each word in a sentence. It uses MBT, a memory-based tagger, trained on a relatively small annotated corpus.

Version 1: October 10, 2007 (20k-token training set)
Version 2: December 8, 2007 (35k-token training set)
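
To give a rough idea of what memory-based tagging means, here is a minimal sketch in Python. It is not MBT itself and does not use MBT's API or feature set: the function names (`features`, `train`, `tag`), the overlap similarity, the value of `k`, and the tiny English toy corpus with its DET/NOUN/VERB tags are all assumptions made for illustration. The idea shown is only the general one: store every annotated token with its context, then tag a new token by a majority vote among the most similar stored instances.

```python
from collections import Counter

# Hypothetical toy corpus (English, invented tags); not the Northern Sotho data.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def features(tokens, i):
    """Describe token i by itself, its two neighbours, and its last two letters."""
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i < len(tokens) - 1 else "</s>"
    return (tokens[i], prev_tok, next_tok, tokens[i][-2:])

def train(tagged_sentences):
    """Memory-based 'training' just stores every instance together with its tag."""
    memory = []
    for sentence in tagged_sentences:
        tokens = [word for word, _ in sentence]
        for i, (_, tag_label) in enumerate(sentence):
            memory.append((features(tokens, i), tag_label))
    return memory

def tag(memory, tokens, k=3):
    """Tag each token by majority vote among its k most similar stored instances."""
    result = []
    for i in range(len(tokens)):
        feats = features(tokens, i)
        # Similarity is the number of feature positions with identical values.
        ranked = sorted(
            memory,
            key=lambda item: -sum(a == b for a, b in zip(feats, item[0])),
        )
        votes = Counter(tag_label for _, tag_label in ranked[:k])
        result.append((tokens[i], votes.most_common(1)[0][0]))
    return result

memory = train(TOY_CORPUS)
print(tag(memory, ["the", "cat", "barks"]))
```

On this toy corpus the unseen sentence "the cat barks" is tagged DET, NOUN, VERB, because each token's nearest stored neighbours carry those tags. A real tagger would use richer context features and a weighted similarity metric.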

Type in the text you want to tag (2,500 character limit)
Example: Motho ge a sa tseba o swanetše go dumela seo gore bao ba tsebago ba mmotše.

[Tagging the text might take a while]


Guy De Pauw: CNTS - Language Technology Group, University of Antwerp, Antwerp, Belgium, guy [dot] depauw [at] ua [dot] ac [dot] be
Gilles-Maurice de Schryver: African Languages and Cultures, Ghent University, Ghent, Belgium, gillesmaurice [dot] deschryver [at] ugent [dot] be




Apertium is an open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English–Catalan). The platform provides (a) a language-independent machine translation engine, (b) tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and (c) linguistic data for a growing number of language pairs.

Current released language pairs include:

* Spanish–Catalan
* Spanish–Portuguese
* Spanish–Galician
* Occitan–Catalan
* French–Catalan
* English–Catalan
* Romanian–Spanish

With the following under development:

* English–Afrikaans
* French–Spanish

Try it out:

* http://xixona.dlsi.ua.es/apertium-unstable/
