Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
In this paper we present a mapping that links WordNet domains and WordNet topics to the emergent Wikipedia categories. This mapping yields a coarse alignment between WordNet and Wikipedia that is useful for producing domain-specific, multilingual corpora. Multilinguality is achieved through the cross-language links between Wikipedia categories. Research in word-sense disambiguation has shown that, within a specific domain, relevant words have restricted senses. The multilingual, comparable, domain-specific corpora we produce have the potential to support research in word-sense disambiguation and terminology extraction in different languages, which could in turn improve the performance of various NLP tasks.
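As a rough illustration of the idea, the following minimal sketch shows how a domain label could be mapped to an English Wikipedia category and then carried into other languages through cross-language links. The dictionaries, their entries, and the function name `categories_for_domain` are hypothetical and chosen only for this example; they are not the mapping or the procedure used in the paper.

```python
# Toy data only: these label-to-category pairs and language links are
# illustrative assumptions, not entries from the paper's actual mapping.
DOMAIN_TO_CATEGORY = {
    "medicine": "Category:Medicine",
    "music": "Category:Music",
}

# Hypothetical cross-language links between Wikipedia categories.
LANGLINKS = {
    "Category:Medicine": {"es": "Categoría:Medicina", "it": "Categoria:Medicina"},
    "Category:Music": {"es": "Categoría:Música"},
}


def categories_for_domain(domain, languages):
    """Return {language code: category title} for a WordNet domain label.

    The English category comes from the domain mapping; other languages
    are added only when a cross-language link exists for that category.
    """
    english = DOMAIN_TO_CATEGORY.get(domain)
    if english is None:
        return {}  # the domain label has no mapped category
    result = {"en": english}
    links = LANGLINKS.get(english, {})
    for lang in languages:
        if lang in links:
            result[lang] = links[lang]
    return result
```

Under these assumptions, the articles filed under each returned category would form the comparable, domain-specific corpus for that language.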