liu.se: Search publications in DiVA
1 - 9 of 9
  • 1.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Query Processing over Heterogeneous Federations of Graph Data. 2024. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Graph data offers a natural and intuitive way to represent complex relationships in various real-world phenomena, such as social networks, e-commerce platforms, and biological networks. The way we interact with information, technology and even society has been changed by graph data, especially after Google started in 2012 to develop the so-called Knowledge Graph. A federation of Knowledge Graphs allows users to perform queries that span across multiple Knowledge Graphs, enabling them to discover relationships and insights that would not be apparent within a single isolated graph and to understand complex knowledge by considering information from different domains or sources. However, retrieving information from such a federation also comes with challenges that must be addressed. Motivated by issues related to retrieving information from federations of Knowledge Graphs, in this thesis, we focus on Knowledge Graphs represented in the Resource Description Framework (RDF) and two forms of heterogeneity: the heterogeneity in terms of data access interfaces, and the heterogeneity of vocabulary used in the schema of RDF data sources. Our research deals with these complexities by designing query planning and optimization to bridge the gap between different graph data sources.

    In this thesis, we first focus on federations that are heterogeneous in terms of data access interfaces. In particular, we establish a formal framework for defining and representing query plans over heterogeneous federations of graph data. We introduce a data model that captures the notion of a heterogeneous federation of RDF data sources. Based on this model, we define a language, called FedQPL, that can be used to describe logical query plans formally. More precisely, this language can be applied both to define query planning and optimization approaches in a more precise manner and to represent the logical plans in a query engine. Thereafter, we provide an extensive set of rewriting rules together with a cost model for optimization. A comprehensive experimental evaluation shows that the query plan selected using our cost model requires less data to be transferred compared to the baseline approach.

    Then, this thesis addresses the heterogeneity of vocabularies used in the schema of RDF data sources by extending FedQPL with vocabulary awareness. To this end, we first define what the expected result of a query in a vocabulary-aware setting is; then, we introduce two new query plan operators to translate solutions from a local to the global vocabulary and vice versa; and finally, we introduce an algorithm that produces correct, vocabulary-aware query plans. To identify the overhead of considering vocabulary mappings during query processing, we evaluate our approach in federations with different vocabulary mapping scenarios. Our experiments show that there is no overhead in planning time when considering vocabulary mappings; however, it takes slightly longer to execute the queries than in a baseline scenario with materialized mapped data. In addition, we also provide a set of rewriting rules specific to vocabulary-aware FedQPL expressions, which can be used as query rewriting rules for query optimization under various conditions. Experimental evaluations support the hypothesis that rewriting rules can significantly improve query processing performance while decreasing the amount of extra work introduced by considering vocabulary mappings.

    Furthermore, we explore possibilities of integrating other types of graph data sources (specifically GraphQL) into the federation. To better understand the different implementation techniques of GraphQL, we design a GraphQL performance benchmark to thoroughly evaluate and compare the performance of approaches to creating GraphQL servers, as a preparation for future integration of Knowledge Graphs that can be accessed via GraphQL APIs into our federation.

  • 2.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    A Cost Model to Optimize Queries over Heterogeneous Federations of RDF Data Sources. 2023. In: Joint Proceedings of the ESWC 2023 Workshops and Tutorials co-located with 20th European Semantic Web Conference (ESWC 2023), 2023. Conference paper (Refereed)
    Abstract [en]

    Federated processing of queries over RDF data sources offers significant potential when a SPARQL query cannot be answered by a single data source alone. However, finding efficient plans to execute a query over a federation is challenging, especially if different federation members provide different types of data access interfaces. Different interfaces imply different request types, different forms of responses, and different physical algorithms that can be used, each of which consumes varying amounts of resources during query execution. This heterogeneity poses additional obstacles to the task of planning query executions, on top of the inherent complexity arising from numerous possible join orderings and various physical algorithms. As a first step to address these challenges, we propose a cost model that captures the resource requirements of different operators depending on the type of federation member, allowing us to estimate the cost of a given query execution plan without actually executing it. To evaluate our approach, we conduct experiments on FedBench with our cost model and compare it to the current state-of-the-art approach to query planning for heterogeneous federations of RDF data sources.

  • 3.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Ferrada, Sebastian
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Considering Vocabulary Mappings in Query Plans for Federations of RDF Data Sources. 2023. In: Proceedings of the 29th International Conference on Cooperative Information Systems (CoopIS), 2023. Conference paper (Refereed)
    Abstract [en]

    Federations of RDF data sources offer great potential for queries that cannot be answered by a single data source. However, querying such federations poses several challenges, one of which is that different but semantically overlapping vocabularies may be used for the respective RDF data. Since the federation members usually retain their autonomy, this heterogeneity cannot simply be homogenized by modifying the data in the data sources. Therefore, handling this heterogeneity becomes a critical aspect of query planning and execution. We introduce an approach to address this challenge by leveraging vocabulary mappings for the processing of queries over federations with heterogeneous vocabularies. This approach not only translates SPARQL queries but also preserves the correctness of results during query execution. We demonstrate the effectiveness of the approach and measure how the application of vocabulary mappings affects the performance of federated query processing.

  • 4.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    LinGBM: A Performance Benchmark for Approaches to Build GraphQL Servers. 2022. In: Web Information Systems Engineering – WISE 2022: 23rd International Conference, Biarritz, France, November 1–3, 2022, Proceedings / [ed] Richard Chbeir, Helen Huang, Fabrizio Silvestri, Yannis Manolopoulos, Yanchun Zhang, Springer, 2022, pp. 209-224. Conference paper (Refereed)
    Abstract [en]

    GraphQL is a popular new approach to build Web APIs that enable clients to retrieve exactly the data they need. Given the growing number of tools and techniques for building GraphQL servers, there is an increasing need for comparing how particular approaches or techniques affect the performance of a GraphQL server. To this end, we present LinGBM, a GraphQL performance benchmark to experimentally study the performance achieved by various approaches for creating a GraphQL server. In this paper, we discuss the design considerations of the benchmark and describe its main components (data schema; query templates; performance metrics). Thereafter, we present experimental results obtained by applying the benchmark in two different use cases, which demonstrate the broad applicability of LinGBM.

  • 5.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Source Selection for SPARQL Endpoints: Fit for Heterogeneous Federations of RDF Data Sources? 2022. In: Proceedings of the QuWeDa 2022: 6th Workshop on Storing, Querying and Benchmarking Knowledge Graphs co-located with 21st International Semantic Web Conference (ISWC 2022), CEUR-WS.org, 2022. Conference paper (Refereed)
    Abstract [en]

    To answer queries over a federation of multiple data sources, a preliminary task is source selection, i.e., identifying the federation members that can answer each part of the query and decomposing the query into subqueries assigned to each federation member. Existing source selection approaches for federations of RDF data sources have been developed based on the assumption that the federation members are SPARQL endpoints. This paper presents an analytical study that investigates whether these approaches are still effective in the context of federations that are heterogeneous in terms of the types of data access interface. In particular, we identify what information about the data of the federation members is required by the approaches and analyze the possibilities and the effort of obtaining this information via the different types of data access interfaces. We find that almost all existing source selection approaches can be adopted for heterogeneous federations, but obtaining the required information may not be practical.

  • 6.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Towards Query Processing over Heterogeneous Federations of RDF Data Sources. 2022. In: The Semantic Web: ESWC 2022 Satellite Events - Hersonissos, Crete, Greece, May 29 - June 2, 2022, Proceedings, Springer, 2022, pp. 57-62. Conference paper (Refereed)
    Abstract [en]

    A federation of RDF data sources offers enormous potential when answers or insights of queries are unavailable via a single data source. As various interfaces for accessing RDF data are proposed, one challenge for querying such a federation is that the federation members are heterogeneous in terms of the type of data access interfaces. No research exists on systematic approaches to tackle this challenge. To provide a formal foundation for future approaches that aim to address this challenge, we have introduced a language, called FedQPL, that can be used for representing query execution plans in this setting. With a poster at the conference, we want to outline the vision for the next generation of query engines for such federations and, in this context, raise awareness of our language in the Semantic Web community. In this extended abstract, we first discuss challenges in query processing over such heterogeneous federations; thereafter, we briefly introduce our proposed language, which we have extended with a few new features that were not part of the originally published version.

  • 7.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    FedQPL: A Language for Logical Query Plans over Heterogeneous Federations of RDF Data Sources. 2020. In: iiWAS '20: The 22nd International Conference on Information Integration and Web-based Applications & Services, Virtual Event / Chiang Mai, Thailand, November 30 - December 2, 2020 / [ed] Maria Indrawan-Santiago, Eric Pardede, Ivan Luiz Salvadori, Matthias Steinbauer, Ismail Khalil, Gabriele Kotsis, New York, NY, United States: Association for Computing Machinery (ACM), 2020. Conference paper (Refereed)
    Abstract [en]

    Federations of RDF data sources provide great potential when queried for answers and insights that cannot be obtained from one data source alone. A challenge for planning the execution of queries over such a federation is that the federation may be heterogeneous in terms of the types of data access interfaces provided by the federation members. This challenge has not received much attention in the literature. This paper provides a solid formal foundation for future approaches that aim to address this challenge. Our main conceptual contribution is a formal language for representing query execution plans; additionally, we identify a fragment of this language that can be used to capture the result of selecting relevant data sources for different parts of a given query. As technical contributions, we show that this fragment is more expressive than what is supported by existing source selection approaches, which effectively highlights an inherent limitation of these approaches. Moreover, we show that the source selection problem is NP-hard and in $\Sigma_2^P$, and we provide a comprehensive set of rewriting rules that can be used as a basis for query optimization.

  • 8.
    Keskisärkkä, Robin
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Li, Huanyu
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Cheng, Sijin
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik.
    Carlsson, Niklas
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Lambrix, Patrick
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska högskolan.
    An Ontology for Ice Hockey. 2019. In: ISWC 2019 Satellites: Proceedings of the ISWC 2019 Satellite Tracks (Posters & Demonstrations, Industry, and Outrageous Ideas) co-located with 18th International Semantic Web Conference (ISWC 2019), 2019, pp. 13-16. Conference paper (Refereed)
    Abstract [en]

    Ice hockey is a highly popular sport that has seen a significant increase in the use of sport analytics. To aid in such analytics, most major leagues collect and share increasing amounts of play-by-play data and other statistics. Additionally, some websites specialize in making such data available to the public in user-friendly forms. However, these sites fail to capture the semantic information of the data, and cannot be used to support more complex data requirements. In this paper, we present the design and development of an ice hockey ontology that provides improved knowledge representation, enables intelligent search and information acquisition, and helps when using information from multiple databases. Our ontology is substantially larger than previous ice hockey ontologies (which cover only a small part of the domain); it provides a formal and explicit representation of the ice hockey domain and supports information retrieval, data reuse, and complex performance metrics.

  • 9.
    Cheng, Sijin
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Hartig, Olaf
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    OPT plus: A Monotonic Alternative to OPTIONAL in SPARQL. 2019. In: Journal of Web Engineering, ISSN 1540-9589, E-ISSN 1544-5976, Vol. 18, no. 1-3, pp. 169-206. Journal article (Refereed)
    Abstract [en]

    Due to the OPTIONAL operator, the core fragment of the SPARQL query language is non-monotonic. That is, some solutions of a query result can be returned to the user only after having consulted all relevant parts of the queried dataset(s). This property presents an obstacle when developing query execution approaches that aim to reduce response times rather than the overall query execution times. Reducing the response times, i.e., returning as many solutions as early as possible, is important in particular in Web-based client-server query processing scenarios in which network latencies dominate query execution times. Such scenarios are typical in the context of integration of Web data sources where a data integration component executes queries over a decentralized federation of such data sources. In this paper we introduce an alternative operator that is similar in spirit to OPTIONAL but without causing non-monotonicity. We show fundamental properties of this operator and observe that the downside of achieving the desired monotonicity property is a potentially significant increase in query result sizes. We study the extent of this trade-off in practice.
