Web as Corpus 2007
3rd Web as Corpus Workshop (WAC3) - Incorporating Cleaneval, an ACL-SIGWAC event 
Sept. 15-16, 2007 
University of Louvain, Louvain-la-Neuve, Belgium  
For those of us who use the Web to build African-language corpora, the WAC3 workshop may be a good opportunity to exchange results. From the announcement: "More and more people are using Web data for linguistic and NLP research. The workshop provides a venue for exploring how we can use it effectively and what we will find if we do. We invite submissions which:
* describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, language-id, tokenising, lemmatising, POS-tagging, indexing, ...)
* explore characteristics of Web data, from a linguistics/NLP perspective including registers, domains, frequency distributions
* use crawled Web data for NLP purposes (with emphasis on the data rather than the use)"
More info here.

Submissions Invited (Deadline: 1 May 2007)
From today until 1 May 2007, submissions for WAC3 are accepted in one of the following categories:
Papers: 6-10 pages
Demos: max. 2 pages
Posters: max. 2 pages
Submissions must be written in English and can be uploaded to the WAC3 site: https://cental.fltr.ucl.ac.be/wac3/