Ontology-Driven Construction of Domain Corpus with Frame Semantics Annotations
2012 (English). In: Computational linguistics and intelligent text processing, Springer Berlin/Heidelberg, 2012, 54-65 p. Conference paper (Refereed)
Semantic Role Labeling (SRL) plays a key role in many text mining applications. The development of SRL systems for the biomedical domain is hampered by the lack of large domain-specific corpora that are labeled with semantic roles. In this paper we propose a method for building corpora that are labeled with semantic roles for the domain of biomedicine. The method is based on the theory of frame semantics, and uses domain knowledge provided by ontologies. By using the method, we have built a corpus for transport events strictly following the domain knowledge provided by the GO biological process ontology. We compared one of our frames to a BioFrameNet frame. We also examined the gaps between the semantic classification of the target words in this domain-specific corpus and in FrameNet and PropBank/VerbNet data. The successful corpus construction demonstrates that ontologies, as a formal representation of domain knowledge, can guide and ease all the tasks in building this kind of corpus. Furthermore, ontological domain knowledge leads to well-defined semantics exposed in the corpus, which will be very valuable in text mining applications.
Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2012. 54-65 p.
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online) ; 7181
Identifiers: URN: urn:nbn:se:liu:diva-74807; DOI: 10.1007/978-3-642-28604-9_5; ISBN: 978-3-642-28603-2 (print); ISBN: 978-3-642-28604-9 (online); OAI: oai:DiVA.org:liu-74807; DiVA: diva2:495806
13th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2012; New Delhi; India