Today our ESR Shaheen Syed presented his paper “Bootstrapping a Semantic Lexicon on Verb Similarities” at the 8th International Conference on Knowledge Discovery and Information Retrieval, held in Porto, Portugal.
His paper is co-authored by Melania Borit, the SAF21 project coordinator, and Marco Spruit from Utrecht University.
His research proposes a new algorithm to create a semantic lexicon for the fisheries domain. In this context, a semantic lexicon is a list of important words that describe the domain, here fisheries.
The paper is now published online, and the full text can be downloaded here:
Syed S., Spruit M. and Borit M. (2016). Bootstrapping a Semantic Lexicon on Verb Similarities. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management – Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 189-196. DOI: 10.5220/0006036901890196
We present a bootstrapping algorithm to create a semantic lexicon from a list of seed words and a corpus that was mined from the web. We exploit extraction patterns to bootstrap the lexicon and use collocation statistics to dynamically score new lexicon entries. Extraction patterns are subsequently scored by calculating the conditional probability in relation to a non-related text corpus. We find that verbs that are highly domain related achieved the highest accuracy and collocation statistics affect the accuracy positively and negatively during the bootstrapping runs.
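To give a feel for the bootstrapping idea described in the abstract, here is a minimal Python sketch. It is not the algorithm from the paper. It rests on several simplifying assumptions of our own: an "extraction pattern" is just the word directly before a lexicon entry, a raw co-occurrence count stands in for the collocation statistics, and the pattern score is a rough conditional probability computed from raw counts in a domain corpus and an unrelated corpus. The function names, thresholds and toy sentences are all hypothetical.

```python
from collections import Counter


def pattern_counts(sentences):
    """Count how often each word appears as the left neighbour of another word."""
    counts = Counter()
    for tokens in sentences:
        counts.update(tokens[:-1])
    return counts


def bootstrap_lexicon(domain, general, seeds, rounds=2, min_prob=0.6, top_k=2):
    """Grow a lexicon from seed words (illustrative sketch, not the paper's method)."""
    lexicon = set(seeds)
    dom = pattern_counts(domain)
    gen = pattern_counts(general)
    for _ in range(rounds):
        # 1. Patterns: words that directly precede a current lexicon entry.
        patterns = {s[i] for s in domain
                    for i in range(len(s) - 1) if s[i + 1] in lexicon}
        # 2. Keep patterns that are far more likely in the domain corpus than
        #    in the unrelated corpus (assumed stand-in for the pattern score).
        good = {p for p in patterns if dom[p] / (dom[p] + gen[p]) >= min_prob}
        # 3. Candidates: new words that follow a good pattern, ranked by a
        #    raw count (assumed stand-in for the collocation statistics).
        candidates = Counter(s[i + 1] for s in domain
                             for i in range(len(s) - 1)
                             if s[i] in good and s[i + 1] not in lexicon)
        if not candidates:
            break
        lexicon.update(word for word, _ in candidates.most_common(top_k))
    return lexicon


# Toy corpora, invented purely for illustration.
domain_corpus = [
    ["fishers", "catch", "cod"],
    ["vessels", "catch", "herring"],
    ["trawlers", "land", "herring"],
    ["trawlers", "land", "haddock"],
    ["people", "like", "cod"],
]
unrelated_corpus = [
    ["people", "like", "music"],
    ["people", "like", "films"],
    ["children", "like", "games"],
]

result = bootstrap_lexicon(domain_corpus, unrelated_corpus, {"cod"})
print(sorted(result))  # ['cod', 'haddock', 'herring']
```

In this toy run the domain-specific verb "catch" pulls in "herring", which in turn exposes the verb "land" and then "haddock", while the general verb "like" is filtered out because it is common in the unrelated corpus. This loosely mirrors the finding above that highly domain-related verbs work best.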