corpus

Tagging and Verifying an Amharic News Corpus

Tagging and Verifying an Amharic News Corpus, Gambäck, Björn, Proceedings of the workshop on Language technology for normalisation of less-resourced languages (SALTMIL8/AfLaT2012), Istanbul, Turkey, p.79-84, (2012)

A Corpus of Santome

A Corpus of Santome, Hagemeijer, Tjerk, Hendrickx Iris, Amaro Haldane, and Tiny Abigail, Proceedings of the workshop on Language technology for normalisation of less-resourced languages (SALTMIL8/AfLaT2012), Istanbul, Turkey, p.61-66, (2012)

The Ukwabelana corpus - An annotated isiZulu corpus

Description: 


  • The annotated corpus contains 10,000 morphologically labeled words and 3,000 POS-tagged sentences.
  • The corpus comprises around 100,000 common Zulu word types and 30,000 Zulu sentences compiled from fictional works and the Zulu Bible, from which the labeled words and sentences have been sampled.
  • All software and additional data used during the annotation process are provided: the partial grammar in DCG format, the abductive algorithm for parsing with incomplete information, and a prototype for a POS tagger which assigns word categories to morphologically analyzed words.

Natural Language Processing

Description: 

Current research is focussed on resources for corpus development.