2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
The Semantic Web provides a framework for representing, sharing, and integrating data on the Web using a set of specifications promoted by the World Wide Web Consortium (W3C). These specifications include RDF as the model for data interchange on the Web and languages (e.g., RDFS and OWL) for defining schemas and ontologies. While the Semantic Web has traditionally focused on static or slowly changing data, information on the Web is becoming increasingly dynamic, with sources such as Internet-of-Things devices, sensor networks, smart cities, social media, and more. RDF Stream Processing (RSP) extends Semantic Web technologies to support streaming data and continuous queries and has been suggested as a candidate for bridging the gap between Complex Event Processing (CEP), which focuses on identifying meaningful events and event patterns from streaming data, and the Semantic Web standards. Systems that operate on real-world data must often deal with uncertainty, which can arise from, for example, missing information, incomplete domain knowledge, sensor noise, or linguistic vagueness. Uncertainty has received attention in both Semantic Web and CEP research, but little is known about how it can be managed in RSP and how it might impact performance. The contributions of this thesis are threefold. First, the issue of supporting a general model of CEP in RSP is addressed. A set of requirements for CEP is identified and used to define an event ontology for use in RSP. An approach is then proposed for creating a CEP framework that can scale processing beyond the limitations of a single RSP instance. Second, an extension of the RSP-QL data model is defined for representation of statement-level annotations. The data model is then used as a basis for capturing different types of uncertainty in a use case inspired by a research project in electronic healthcare.
Finally, the performance impact of explicitly managing different types of uncertainty is evaluated in a prototype implementation, and a set of optimization strategies is introduced with the goal of reducing the impact of uncertainty on query execution performance. The results show that the proposed approach to representing statement-level metadata reduces required data transfer bandwidth and that it can improve query execution performance compared with using RDF reification. The optimization strategies produce improved query execution performance overall, but the impact of the heuristic depends on multiple factors, including the selectivity of filters, join cardinalities, and the cost of evaluating uncertainty functions.
Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2021. p. 112
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2153
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-179481 (URN), 10.3384/diss.diva-179481 (DOI), 9789179296216 (ISBN)
Public defence
2021-11-11, Ada Lovelace, B-building, Campus Valla, Linköping, 10:15 (English)
Opponent
Supervisors
Note
Funding agencies: This work was partly funded by: (1) VALCRI, financed by the European Union Seventh Framework Programme (FP7/2007–2013) under the EC Grant Agreement No FP7-IP608142; (2) E-care@home, financed by the Swedish Knowledge Foundation; and (3) STeDS, partly financed by the research organization CENIIT (project id 12.10).
Revisions: 2021-10-06 The thesis was first published online. The online published version reflects the printed version.
2022-04-27 The thesis was updated with a downloadable errata list. Before this date the PDF was downloaded 207 times.
2021-10-06; 2021-09-21; 2022-04-27. Bibliographically approved