Web as Corpus 2007
3rd Web as Corpus Workshop (WAC3) - Incorporating Cleaneval, an ACL-SIGWAC event
Sept. 15-16, 2007
University of Louvain, Louvain-la-Neuve, Belgium
For those of us who use the Web to build African-language corpora, the WAC3 workshop may be a good opportunity to exchange results. From the announcement: "More and more people are using Web data for linguistic and NLP research. The workshop provides a venue for exploring how we can use it effectively and what we will find if we do. We invite submissions which:
* describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, language-id, tokenising, lemmatising, POS-tagging, indexing, ...)
* explore characteristics of Web data, from a linguistics/NLP perspective including registers, domains, frequency distributions
* use crawled Web data for NLP purposes (with emphasis on the data rather than the use)"
More info here.
Submissions Invited (Deadline: 1 May 2007)
From today until 1 May 2007, submissions to WAC3 are accepted in the following categories:
Papers: 6-10 pages
Demos: max. 2 pages
Posters: max. 2 pages
Submissions are to be written in English and may be uploaded to the WAC3 site: https://cental.fltr.ucl.ac.be/wac3/