Web as Corpus 2007

3rd Web as Corpus Workshop (WAC3) - Incorporating Cleaneval, an ACL-SIGWAC event
Sept. 15-16, 2007
University of Louvain, Louvain-la-Neuve, Belgium

For those of us who use the Web to build African-language corpora, the WAC3 workshop may be a good opportunity to exchange results. From the announcement: "More and more people are using Web data for linguistic and NLP research. The workshop provides a venue for exploring how we can use it effectively and what we will find if we do. We invite submissions which:
* describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, language-id, tokenising, lemmatising, POS-tagging, indexing, ...)
* explore characteristics of Web data, from a linguistics/NLP perspective including registers, domains, frequency distributions
* use crawled Web data for NLP purposes (with emphasis on the data rather than the use)"

More info here.

Submissions Invited (Deadline: 1 May 2007)

From today until 1 May 2007, submissions for WAC3 are accepted in one of the following categories:

Papers: 6-10 pages
Demos: max. 2 pages
Posters: max. 2 pages

Submissions must be written in English and may be uploaded to the WAC3 site: http://cental.fltr.ucl.ac.be/wac3/