liu.se: Search for publications in DiVA
  • 1.
    Acosta, Maribel
    et al.
    Karlsruhe Institute of Technology.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Sequeda, Juan
    Capsenta.
    Federated RDF query processing. 2019. In: Encyclopedia of big data technologies / [ed] Sherif Sakr, Albert Zomaya, Cham: Springer, 2019. Chapter in book (Refereed)
    Abstract [en]

    Federated RDF query processing is concerned with querying a federation of RDF data sources where the queries are expressed using a declarative query language (typically, the RDF query language SPARQL), and the data sources are autonomous and heterogeneous. The current literature in this context assumes that the data and the data sources are semantically homogeneous, while heterogeneity occurs at the level of data formats and access protocols.

  • 2.
    Alam, Mehwish
    et al.
    Télécom Paris, Institut Polytechnique de Paris, France.
    Trojahn, Cássia
    IRIT, France.
    Hertling, Sven
    University of Mannheim, Germany.
    Pesquita, Catia
    LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal.
    Aebeloe, Christian
    Aalborg University, Denmark.
    Aras, Hidir
    FIZ-Karlsruhe, Germany.
    Azzam, Amr
    WU Vienna, Austria.
    Cano, Juan
    Universidad Politécnica de Madrid, Spain.
    Domingue, John
    The Open University, United Kingdom.
    Gottschalk, Simon
    L3S Research Center, Leibniz Universität Hannover, Germany.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hose, Katja
    Aalborg University, Denmark.
    Kirrane, Sabrina
    Vienna University of Economics and Business, Austria.
    Lisena, Pasquale
    EURECOM, France.
    Osborne, Francesco
    The Open University, Milton Keynes, United Kingdom.
    Rohde, Philipp D.
    TIB Leibniz Information Centre for Science and Technology and Leibniz University Hannover, Germany.
    Steels, Luc
    Barcelona Supercomputing Center, Spain.
    Taelman, Ruben
    Ghent University, Belgium.
    Third, Aisling
    The Open University, United Kingdom.
    Tiddi, Ilaria
    Vrije Universiteit Amsterdam, the Netherlands.
    Türker, Rima
    FIZ-Karlsruhe, Germany.
    Joint Proceedings of the ESWC 2023 Workshops and Tutorials co-located with 20th European Semantic Web Conference (ESWC). 2023. Conference proceedings (editor) (Other academic)
  • 3.
    Bellatreche, Ladjel
    et al.
    LIAS/ISAE-ENSMA, Chasseneuil-du-Poitou, France.
    Dumas, Marlon
    University of Tartu, Tartu, Estonia.
    Karras, Panagiotis
    Aarhus University, Aarhus, Denmark.
    Matulevicius, Raimundas
    University of Tartu, Tartu, Estonia.
    Awad, Ahmed
    University of Tartu, Tartu, Estonia.
    Weidlich, Matthias
    Humboldt-Universität zu Berlin, Berlin, Germany.
    Ivanovic, Mirjana
    University of Novi Sad, Novi Sad, Serbia.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    New Trends in Database and Information Systems - ADBIS 2021 Short Papers, Doctoral Consortium and Workshops: DOING, SIMPDA, MADEISD, MegaData, CAoNS, Tartu, Estonia, August 24-26, 2021, Proceedings. 2021. Conference proceedings (editor) (Refereed)
  • 4.
    Blomqvist, Eva
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Hose, Katja
    Aalborg University.
    Paulheim, Heiko
    University of Mannheim.
    Ławrynowicz, Agnieszka
    Poznan University of Technology.
    Ciravegna, Fabio
    University of Sheffield.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    The Semantic Web: ESWC 2017 Satellite Events - ESWC 2017 Satellite Events, Portorož, Slovenia, May 28 - June 1, 2017, Revised Selected Papers. 2017. Conference proceedings (editor) (Refereed)
  • 5.
    Blomqvist, Eva
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Maynard, Diana
    University of Sheffield.
    Gangemi, Aldo
    Paris Nord University.
    Hoekstra, Rinke
    Vrije Universiteit Amsterdam.
    Hitzler, Pascal
    Wright State University.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    The Semantic Web - 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 - June 1, 2017, Proceedings, Part I. 2017. Conference proceedings (editor) (Refereed)
  • 6.
    Blomqvist, Eva
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Maynard, Diana
    University of Sheffield.
    Gangemi, Aldo
    Paris Nord University.
    Hoekstra, Rinke
    Vrije Universiteit Amsterdam.
    Hitzler, Pascal
    Wright State University.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    The Semantic Web - 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 - June 1, 2017, Proceedings, Part II. 2017. Conference proceedings (editor) (Refereed)
  • 7.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Ferrada, Sebastian
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Considering Vocabulary Mappings in Query Plans for Federations of RDF Data Sources. 2023. In: Proceedings of the 29th International Conference on Cooperative Information Systems (CoopIS), 2023. Conference paper (Refereed)
    Abstract [en]

    Federations of RDF data sources offer great potential for queries that cannot be answered by a single data source. However, querying such federations poses several challenges, one of which is that different but semantically overlapping vocabularies may be used for the respective RDF data. Since the federation members usually retain their autonomy, this heterogeneity cannot simply be homogenized by modifying the data in the data sources. Therefore, handling this heterogeneity becomes a critical aspect of query planning and execution. We introduce an approach to address this challenge by leveraging vocabulary mappings for the processing of queries over federations with heterogeneous vocabularies. This approach not only translates SPARQL queries but also preserves the correctness of results during query execution. We demonstrate the effectiveness of the approach and measure how the application of vocabulary mappings affects the performance of federated query processing.

  • 8.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    A Cost Model to Optimize Queries over Heterogeneous Federations of RDF Data Sources. 2023. In: Joint Proceedings of the ESWC 2023 Workshops and Tutorials co-located with 20th European Semantic Web Conference (ESWC 2023), 2023. Conference paper (Refereed)
    Abstract [en]

    Federated processing of queries over RDF data sources offers significant potential when a SPARQL query cannot be answered by a single data source alone. However, finding efficient plans to execute a query over a federation is challenging, especially if different federation members provide different types of data access interfaces. Different interfaces imply different request types, different forms of responses, and different physical algorithms that can be used, each of which consumes varying amounts of resources during query execution. This heterogeneity poses additional obstacles to the task of planning query executions, in addition to the inherent complexity arising from numerous possible join orderings and various physical algorithms. As a first step to address these challenges, we propose a cost model that captures the resource requirements of different operators depending on the type of federation member, allowing us to estimate the cost of a given query execution plan without actually executing it. To evaluate our approach, we conduct experiments on FedBench with our cost model and compare it to the current state-of-the-art approach to query planning for heterogeneous federations of RDF data sources.

  • 9.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    FedQPL: A Language for Logical Query Plans over Heterogeneous Federations of RDF Data Sources. 2020. In: iiWAS '20: The 22nd International Conference on Information Integration and Web-based Applications & Services, Virtual Event / Chiang Mai, Thailand, November 30 - December 2, 2020 / [ed] Maria Indrawan-Santiago, Eric Pardede, Ivan Luiz Salvadori, Matthias Steinbauer, Ismail Khalil, Gabriele Kotsis, New York, NY, United States: Association for Computing Machinery (ACM), 2020. Conference paper (Refereed)
    Abstract [en]

    Federations of RDF data sources provide great potential when queried for answers and insights that cannot be obtained from one data source alone. A challenge for planning the execution of queries over such a federation is that the federation may be heterogeneous in terms of the types of data access interfaces provided by the federation members. This challenge has not received much attention in the literature. This paper provides a solid formal foundation for future approaches that aim to address this challenge. Our main conceptual contribution is a formal language for representing query execution plans; additionally, we identify a fragment of this language that can be used to capture the result of selecting relevant data sources for different parts of a given query. As technical contributions, we show that this fragment is more expressive than what is supported by existing source selection approaches, which effectively highlights an inherent limitation of these approaches. Moreover, we show that the source selection problem is NP-hard and in $\Sigma_2^P$, and we provide a comprehensive set of rewriting rules that can be used as a basis for query optimization.

  • 10.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    LinGBM: A Performance Benchmark for Approaches to Build GraphQL Servers. 2022. In: Web Information Systems Engineering – WISE 2022: 23rd International Conference, Biarritz, France, November 1–3, 2022, Proceedings / [ed] Richard Chbeir, Helen Huang, Fabrizio Silvestri, Yannis Manolopoulos, Yanchun Zhang, Springer, 2022, p. 209-224. Conference paper (Refereed)
    Abstract [en]

    GraphQL is a popular new approach to build Web APIs that enable clients to retrieve exactly the data they need. Given the growing number of tools and techniques for building GraphQL servers, there is an increasing need for comparing how particular approaches or techniques affect the performance of a GraphQL server. To this end, we present LinGBM, a GraphQL performance benchmark to experimentally study the performance achieved by various approaches for creating a GraphQL server. In this paper, we discuss the design considerations of the benchmark and describe its main components (data schema; query templates; performance metrics). Thereafter, we present experimental results obtained by applying the benchmark in two different use cases, which demonstrate the broad applicability of LinGBM.

  • 11.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    OPT+: A Monotonic Alternative to OPTIONAL in SPARQL. 2019. In: Journal of Web Engineering, ISSN 1540-9589, E-ISSN 1544-5976, Vol. 18, no 1-3, p. 169-206. Article in journal (Refereed)
    Abstract [en]

    Due to the OPTIONAL operator, the core fragment of the SPARQL query language is non-monotonic. That is, some solutions of a query result can be returned to the user only after having consulted all relevant parts of the queried dataset(s). This property presents an obstacle when developing query execution approaches that aim to reduce response times rather than the overall query execution times. Reducing the response times (i.e., returning as many solutions as early as possible) is important in particular in Web-based client-server query processing scenarios in which network latencies dominate query execution times. Such scenarios are typical in the context of integration of Web data sources where a data integration component executes queries over a decentralized federation of such data sources. In this paper we introduce an alternative operator that is similar in spirit to OPTIONAL but without causing non-monotonicity. We show fundamental properties of this operator and observe that the downside of achieving the desired monotonicity property is a potentially significant increase in query result sizes. We study the extent of this trade-off in practice.

  • 12.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Source Selection for SPARQL Endpoints: Fit for Heterogeneous Federations of RDF Data Sources? 2022. In: Proceedings of the QuWeDa 2022: 6th Workshop on Storing, Querying and Benchmarking Knowledge Graphs co-located with 21st International Semantic Web Conference (ISWC 2022), CEUR-WS, 2022, Vol. 3279, p. 5-16. Conference paper (Refereed)
    Abstract [en]

    To answer queries over a federation of multiple data sources, a preliminary task is source selection; i.e., to identify the federation members that can answer each part of the query and to decompose the query into subqueries assigned to each federation member. Existing source selection approaches for federations of RDF data sources have been developed based on the assumption that the federation members are SPARQL endpoints. This paper presents an analytical study that investigates whether these approaches are still effective in the context of federations that are heterogeneous in terms of the types of data access interfaces. In particular, we identify what information about the data of the federation members is required by the approaches and analyze the possibilities and the effort of obtaining this information via the different types of data access interfaces. We find that almost all existing source selection approaches can be adopted for heterogeneous federations but obtaining the required information may not be practical.

  • 13.
    Cheng, Sijin
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Towards Query Processing over Heterogeneous Federations of RDF Data Sources. 2022. In: The Semantic Web: ESWC 2022 Satellite Events - Hersonissos, Crete, Greece, May 29 - June 2, 2022, Proceedings, Springer, 2022, p. 57-62. Conference paper (Refereed)
    Abstract [en]

    A federation of RDF data sources offers enormous potential when answers or insights of queries are unavailable via a single data source. As various interfaces for accessing RDF data are proposed, one challenge for querying such a federation is that the federation members are heterogeneous in terms of the type of data access interfaces. There does not exist any research on systematic approaches to tackle this challenge. To provide a formal foundation for future approaches that aim to address this challenge, we have introduced a language, called FedQPL, that can be used for representing query execution plans in this setting. With a poster at the conference, we want to outline the vision for the next generation of query engines for such federations and, in this context, to raise awareness in the Semantic Web community of our language. In this extended abstract, we first discuss challenges in query processing over such heterogeneous federations; thereafter, we briefly introduce our proposed language, which we have extended with a few new features that were not in the originally published version.

  • 14.
    Dang, Minh-Hoang
    et al.
    Nantes Universite.
    Aimonier-Davat, Julien
    Nantes Universite.
    Molli, Pascal
    Nantes Universite.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Skaf-Molli, Hala
    Nantes Universite.
    Le Crom, Yotlan
    Nantes Universite.
    FedShop: A Benchmark for Testing the Scalability of SPARQL Federation Engines. 2023. In: Proceedings of the 22nd International Semantic Web Conference (ISWC), Springer, 2023. Conference paper (Refereed)
    Abstract [en]

    While several approaches to query a federation of SPARQL endpoints have been proposed in the literature, very little is known about the effectiveness of these approaches and the behavior of the resulting query engines for cases in which the number of federation members increases. The existing benchmarks that are typically used to evaluate SPARQL federation engines do not consider such a form of scalability. In this paper, we set out to close this knowledge gap by investigating the behavior of 4 state-of-the-art SPARQL federation engines using a novel benchmark designed for scalability experiments. Based on the benchmark, we show that scalability is a challenge for each of these engines, especially with respect to the effectiveness of their source selection & query decomposition approaches. FedShop is freely available online at: https://github.com/GDD-Nantes/FedShop.

  • 15.
    Dang, Minh-Hoang
    et al.
    Nantes Univ, France.
    Aimonier-Davat, Julien
    Nantes Univ, France.
    Molli, Pascal
    Nantes Univ, France.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Skaf-Molli, Hala
    Nantes Univ, France.
    Le Crom, Yotlan
    Nantes Univ, France.
    FedShop: A Benchmark for Testing the Scalability of SPARQL Federation Engines. 2023. In: The Semantic Web, ISWC 2023, Part II, Springer International Publishing AG, 2023, Vol. 14266, p. 285-301. Conference paper (Refereed)
    Abstract [en]

    While several approaches to query a federation of SPARQL endpoints have been proposed in the literature, very little is known about the effectiveness of these approaches and the behavior of the resulting query engines for cases in which the number of federation members increases. The existing benchmarks that are typically used to evaluate SPARQL federation engines do not consider such a form of scalability. In this paper, we set out to close this knowledge gap by investigating the behavior of 4 state-of-the-art SPARQL federation engines using a novel benchmark designed for scalability experiments. Based on the benchmark, we show that scalability is a challenge for each of these engines, especially with respect to the effectiveness of their source selection & query decomposition approaches. FedShop is freely available online at: https://github.com/GDD-Nantes/FedShop.

  • 16.
    Fionda, Valeria
    et al.
    University of Calabria, Italy.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Abdolazimi, Reyhaneh
    Syracuse University, USA.
    Tutorials at The Web Conference 2023. 2023. In: Companion Proceedings of the ACM Web Conference 2023, Association for Computing Machinery (ACM), 2023. Conference paper (Other academic)
    Abstract [en]

    This paper summarizes the content of the 28 tutorials that have been given at The Web Conference 2023.

  • 17.
    Ghidini, Chiara
    et al.
    Fondazione Bruno Kessler.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Maleshkova, Maria
    University of Bonn.
    Svátek, Vojtěch
    University of Economics Prague.
    Cruz, Isabel
    University of Illinois at Chicago.
    Hogan, Aidan
    University of Chile.
    Song, Jie
    Memect Technology.
    Lefrançois, Maxime
    Mines Saint-Etienne.
    Gandon, Fabien
    Inria Sophia Antipolis - Méditerranée.
    The Semantic Web - ISWC 2019 - 18th International Semantic Web Conference, Auckland, New Zealand, October 26-30, 2019, Proceedings, Part I. 2019. Conference proceedings (editor) (Refereed)
  • 18.
    Ghidini, Chiara
    et al.
    Fondazione Bruno Kessler.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Maleshkova, Maria
    University of Bonn.
    Svátek, Vojtěch
    University of Economics Prague.
    Cruz, Isabel
    University of Illinois at Chicago.
    Hogan, Aidan
    University of Chile.
    Song, Jie
    Memect Technology.
    Lefrançois, Maxime
    Mines Saint-Etienne.
    Gandon, Fabien
    Inria Sophia Antipolis - Méditerranée.
    The Semantic Web - ISWC 2019 - 18th International Semantic Web Conference, Auckland, New Zealand, October 26-30, 2019, Proceedings, Part II. 2019. Conference proceedings (editor) (Refereed)
  • 19.
    Harth, Andreas
    et al.
    University of Erlangen-Nuremberg, Nuremberg, Germany.
    Presutti, Valentina
    National Research Council, Rome, Italy.
    Troncy, Raphaël
    Eurecom, Sophia Antipolis, France.
    Acosta, Maribel
    Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Polleres, Axel
    Institute for Information Business at WU Wien, Vienna, Austria.
    Fernandez, Javier D.
    Institute for Information Business at WU Wien, Vienna, Austria.
    Xavier Parreira, Josiane
    Siemens AG Österreich, Vienna, Austria.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hose, Katja
    Aalborg University, Aalborg, Denmark.
    Cochez, Michael
    Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
    The Semantic Web: ESWC 2020 Satellite Events - ESWC 2020 Satellite Events, Heraklion, Crete, Greece, May 31 - June 4, 2020, Revised Selected Papers. 2020. Conference proceedings (editor) (Refereed)
  • 20.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Foundations of RDF* and SPARQL*: (An Alternative Approach to Statement-Level Metadata in RDF). 2017. In: Proceedings of the 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web 2017 / [ed] Juan Reutter, Divesh Srivastava, 2017, Vol. 1912, article id 12. Conference paper (Refereed)
    Abstract [en]

    The standard approach to annotate statements in RDF with metadata has a number of shortcomings including data size blow-up and unnecessarily complicated queries. We propose an alternative approach that is based on nesting of RDF triples and of query patterns. The approach allows for a more compact representation of data and queries, and it is backwards compatible with the standard. In this paper we present the formal foundations of our proposal and of different approaches to implement it. More specifically, we formally capture the necessary extensions of the RDF data model and its query language SPARQL, and we define mappings based on which our extended notions can be converted back to ordinary RDF and SPARQL. Additionally, for such type of mappings we define two desirable properties, information preservation and query result equivalence, and we show that the introduced mappings possess these properties.

  • 21.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Foundations to Query Labeled Property Graphs using SPARQL. 2019. In: Joint Proceedings of the 1st International Workshop On Semantics For Transport and the 1st International Workshop on Approaches for Making Data Interoperable co-located with 15th Semantics Conference (SEMANTiCS 2019), Karlsruhe, Germany, September 9, 2019, 2019. Conference paper (Refereed)
    Abstract [en]

    The RDF*/SPARQL* approach extends RDF and SPARQL with means to capture and to query annotations of RDF triples, which is a feature that is natively available in graph databases modeled as Labeled Property Graphs (LPGs). Hence, the approach presents a step towards making the different graph database models interoperable. This paper takes this step further by providing a solid theoretical foundation for converting LPGs into RDF* data and for querying LPGs using the query language SPARQL*. Regarding the latter, the contributions in this paper consider approaches that materialize the RDF* representation of the LPGs into an RDF*-enabled triplestore as well as approaches in which the queried LPGs may reside in an LPG-specific database system.

  • 22.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    RDF* and SPARQL*: An Alternative Approach to Annotate Statements in RDF. 2017. In: Proceedings of the ISWC 2017 Posters & Demonstrations and Industry Tracks co-located with 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 23rd to 25th, 2017, 2017. Conference paper (Refereed)
    Abstract [en]

    The standard approach to annotate statements in RDF with metadata has a number of shortcomings including data size blow-up and complicated queries. We propose an alternative approach that is based on nesting of RDF triples and of query patterns. The approach allows for a more compact representation of data and queries, and it is backwards compatible with the standard.

  • 23.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. Hasso Plattner Institute, University of Potsdam, Potsdam, Germany.
    Bull-Aranda, Carlos
    Informatics Department, Universidad Técnica Federico Santa María, Valparaíso, Chile.
    Bindings-restricted triple pattern fragments. 2016. In: On the Move to Meaningful Internet Systems: OTM 2016 Conferences, Springer Berlin/Heidelberg, 2016, Vol. 10033, p. 762-769. Conference paper (Refereed)
    Abstract [en]

    The Triple Pattern Fragment (TPF) interface is a recent proposal for reducing server load in Web-based approaches to execute SPARQL queries over public RDF datasets. The price for less overloaded servers is a higher client-side load and a substantial increase in network load (in terms of both the number of HTTP requests and data transfer). In this paper, we propose a slightly extended interface that allows clients to attach intermediate results to triple pattern requests. The response to such a request is expected to contain triples from the underlying dataset that do not only match the given triple pattern (as in the case of TPF), but that are guaranteed to contribute in a join with the given intermediate result. Our hypothesis is that a distributed query execution using this extended interface can reduce the network load (in comparison to a pure TPF-based query execution) without reducing the overall throughput of the client-server system significantly. Our main contribution in this paper is twofold: we empirically verify the hypothesis and provide an extensive experimental comparison of our proposal and TPF.

  • 24.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Curé, Olivier
    Université Paris-Est Marne la Vallée Paris, France.
    Semantic Data Management in Practice. 2017. In: WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion, 2017, Association for Computing Machinery (ACM), 2017, p. 901-904. Conference paper (Refereed)
    Abstract [en]

    After years of research and development, standards and technologies for semantic data are sufficiently mature to be used as the foundation of novel data science projects that employ semantic technologies in various application domains such as bio-informatics, materials science, criminal intelligence, and social science. Typically, such projects are carried out by domain experts who have a conceptual understanding of semantic technologies but lack the expertise to choose and to employ existing data management solutions for the semantic data in their project. For such experts, including domain-focused data scientists, project coordinators, and project engineers, our tutorial delivers a practitioner's guide to semantic data management. We discuss the following important aspects of semantic data management and demonstrate how to address these aspects in practice by using mature, production-ready tools: i) storing and querying semantic data; ii) understanding, iii) searching, and iv) visualizing the data; v) automated reasoning; vi) integrating external data and knowledge; and vii) cleaning the data.

  • 25.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hidders, Jan
    Delft University of Technology, Delft, Netherlands.
    Defining Schemas for Property Graphs by using the GraphQL Schema Definition Language2019In: Proceedings of the 2nd Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), Amsterdam, The Netherlands, 30 June 2019, Association for Computing Machinery (ACM), 2019, p. 1-11, article id 6Conference paper (Refereed)
    Abstract [en]

    GraphQL is a highly popular new approach to build Web APIs. An important component of this approach is the GraphQL schema definition language (SDL). The original purpose of this language is to define a so-called GraphQL schema that specifies the types of objects that can be queried when accessing a specific GraphQL Web API. This paper focuses on the question: Can we repurpose this language to define schemas for graph databases that are based on the Property Graph model? This question is relevant because there does not exist a commonly adopted approach to define schemas for Property Graphs, and because the form in which GraphQL APIs represent their underlying data sources is very similar to the Property Graph model. To answer the question we propose an approach to adopt the GraphQL SDL for Property Graph schemas. We define this approach formally and show its fundamental properties such as the complexity of checking the satisfiability of schemas and of validating data against a schema.

  • 26.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hose, Katja
    Aalborg University.
    Sequeda, Juan
    Capsenta.
    Linked data management2019In: Encyclopedia of big data technologies / [ed] Sherif Sakr, Albert Zomaya, Cham: Springer, 2019Chapter in book (Refereed)
  • 27.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. Amazon Web Services, Sweden.
    Kaoudi, Zoi
    IT University of Copenhagen, Denmark.
    GRADES-NDA'24: 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)2024In: Companion of the 2024 International Conference on Management of Data, SIGMOD-Companion 2024, Association for Computing Machinery (ACM), 2024, p. 653-654Conference paper (Refereed)
    Abstract [en]

    GRADES-NDA is the premier workshop series on graph data management and analytics that aims to bring together researchers from academia, industry, and governmental organizations. GRADES-NDA'24 is a forum for discussing recent advances in (large-scale) graph data management and analytics systems, as well as proposing and discussing novel methods and techniques for addressing domain-specific challenges. In 2024, GRADES-NDA is in its seventh edition.

  • 28.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Letter, Ian
    Universidad de Chile.
    Pérez, Jorge
    Universidad de Chile.
    A Formal Framework for Comparing Linked Data Fragments2017In: The Semantic Web - ISWC 2017 - 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I, Springer, 2017Conference paper (Refereed)
    Abstract [en]

    The Linked Data Fragment (LDF) framework has been proposed as a uniform view to explore the trade-offs of consuming Linked Data when servers provide (possibly many) different interfaces to access their data. Every such interface has its own particular properties regarding performance, bandwidth needs, caching, etc. Several practical challenges arise. For example, before exposing a new type of LDFs in some server, can we formally say something about how this new LDF interface compares to other interfaces previously implemented in the same server? From the client side, given a client with some restricted capabilities in terms of time constraints, network connection, or computational power, which is the best type of LDFs to complete a given task? Today there are only a few formal theoretical tools to help answer these and other practical questions, and researchers have embarked on solving them mainly by experimentation. In this paper we propose the Linked Data Fragment Machine (LDFM) which is the first formalization to model LDF scenarios. LDFMs work as classical Turing Machines with extra features that model the server and client capabilities. By proving formal results based on LDFMs, we draw a fairly complete expressiveness lattice that shows the interplay between several combinations of client and server capabilities. We also show the usefulness of our model to formally analyze the fine-grained interplay between several metrics such as the number of requests sent to the server, and the bandwidth of communication between client and server.

  • 29.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Letter, Ian
    Universidad de Chile.
    Pérez, Jorge
    Universidad de Chile.
    A Model of Distributed Query Computation in Client-Server Scenarios on the Semantic Web2018In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, AAAI Press, 2018, p. 5259-5263Conference paper (Refereed)
    Abstract [en]

    This paper provides an overview of a model for capturing properties of client-server-based query computation setups. This model can be used to formally analyze different combinations of client and server capabilities, and compare them in terms of various fine-grain complexity measures. While the motivations and the focus of the presented work are related to querying the Semantic Web, the main concepts of the model are general enough to be applied in other contexts as well.

  • 30.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Perez, Jorge
    University of Chile, Chile; Chilean Centre Semant Web Research, Chile.
    LDQL: A query language for the Web of Linked Data2016In: Journal of Web Semantics, ISSN 1570-8268, E-ISSN 1873-7749, Vol. 41Article in journal (Refereed)
    Abstract [en]

    The Web of Linked Data is composed of tons of RDF documents interlinked to each other forming a huge repository of distributed semantic data. Effectively querying this distributed data source is an important open problem in the Semantic Web area. In this paper, we propose LDQL, a declarative language to query Linked Data on the Web. One of the novelties of LDQL is that it expresses separately (i) patterns that describe the expected query result, and (ii) Web navigation paths that select the data sources to be used for computing the result. We present a formal syntax and semantics, prove equivalence rules, and study the expressiveness of the language. In particular, we show that LDQL is strictly more expressive than all the query formalisms that have been proposed previously for Linked Data on the Web. We also study some computability issues regarding LDQL. We first prove that when considering the Web of Linked Data as a fully accessible graph, the evaluation problem for LDQL can be solved in polynomial time. Nevertheless, when the limited data access capabilities of Web clients are considered, the scenario changes drastically; there are LDQL queries for which a complete execution is not possible in practice. We formally study this issue and provide a sufficient syntactic condition to avoid this problem; queries satisfying this condition are ensured to have a procedure to be effectively evaluated over the Web of Linked Data.

  • 31.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Perez, Jorge
    Univ Chile, Chile.
    Semantics and Complexity of GraphQL2018In: WWW '18: Proceedings of the 2018 World Wide Web Conference / [ed] Pierre-Antoine Champin, Fabien Gandon, Lionel Médini, Mounia Lalmas, Panagiotis G. Ipeirotis, Association for Computing Machinery (ACM), 2018, p. 1155-1164Conference paper (Refereed)
    Abstract [en]

    GraphQL is a recently proposed, and increasingly adopted, conceptual framework for providing a new type of data access interface on the Web. The framework includes a new graph query language whose semantics has been specified informally only. This has prevented the formal study of the main properties of the language. We embark on the formalization and study of GraphQL. To this end, we first formalize the semantics of GraphQL queries based on a labeled-graph data model. Thereafter, we analyze the language and show that it admits really efficient evaluation methods. In particular, we prove that the complexity of the GraphQL evaluation problem is NL-complete. Moreover, we show that the enumeration problem can be solved with constant delay. This implies that a server can answer a GraphQL query and send the response byte-by-byte while spending just a constant amount of time between every byte sent. Despite these positive results, we prove that the size of a GraphQL response might be prohibitively large for an internet scenario. We present experiments showing that current practical implementations suffer from this issue. We provide a solution to cope with this problem by showing that the total size of a GraphQL response can be computed in polynomial time. Our results on polynomial-time size computation plus the constant-delay enumeration can help developers to provide more robust GraphQL interfaces on the Web.
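    The contrast between an exponentially large response and a cheap size computation can be sketched as follows. This is a minimal illustration under strong simplifying assumptions, not the algorithm from the paper: the graph is a hypothetical in-memory dict, a query is a nested tuple of (field, subquery) pairs, and "size" simply counts the objects in the response.

```python
from functools import lru_cache

# Toy graph: node -> field -> tuple of neighbour nodes (all names hypothetical).
GRAPH = {
    "u": {"friend": ("u", "v")},
    "v": {"friend": ("u", "v")},
}


def nested(field, depth):
    """Build the query  field { field { ... } }  with the given nesting depth."""
    query = ()
    for _ in range(depth):
        query = ((field, query),)
    return query


@lru_cache(maxsize=None)
def response_size(node, query):
    """Count the objects in the response without constructing the response."""
    total = 1  # the object for `node` itself
    for field, sub in query:
        for neighbour in GRAPH[node].get(field, ()):
            total += response_size(neighbour, sub)
    return total


# Nesting depth 50 would yield a response with about 2**51 objects, yet the
# memoized count needs only one evaluation per (node, subquery) pair.
print(response_size("u", nested("friend", 50)))
```

    Because each (node, subquery) pair is evaluated once, the number of recursive evaluations in this sketch grows with the product of graph size and query size, even though the response itself doubles with every nesting level. A server could use such a count to reject a query before producing any output.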

  • 32.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. University of Potsdam, Germany.
    Pirro, Giuseppe
    Italian National Research Council ICAR CNR, Italy.
    SPARQL with property paths on the Web2017In: Semantic Web, ISSN 1570-0844, E-ISSN 2210-4968, Vol. 8, no 6, p. 773-795Article in journal (Refereed)
    Abstract [en]

    Linked Data on the Web represents an immense source of knowledge suitable to be automatically processed and queried. In this respect, there are different approaches for Linked Data querying that differ on the degree of centralization adopted. On one hand, the SPARQL query language, originally defined for querying single datasets, has been enhanced with features to query federations of datasets; however, this attempt is not sufficient to cope with the distributed nature of data sources available as Linked Data. On the other hand, extensions or variations of SPARQL aim to find trade-offs between centralized and fully distributed querying. The idea is to partially move the computational load from the servers to the clients. Despite the variety and the relative merits of these approaches, as of today, there is no standard language for querying Linked Data on the Web. A specific requirement for such a language to capture the distributed, graph-like nature of Linked Data sources on the Web is a support of graph navigation. Recently, SPARQL has been extended with a navigational feature called property paths (PPs). However, the semantics of SPARQL restricts the scope of navigation via PPs to single RDF graphs. This restriction limits the applicability of PPs for querying distributed Linked Data sources on the Web. To fill this gap, in this paper we provide formal foundations for evaluating PPs on the Web, thus contributing to the definition of a query language for Linked Data. We first introduce a family of reachability-based query semantics for PPs that distinguish between navigation on the Web and navigation at the data level. Thereafter, we consider another, alternative query semantics that couples Web graph navigation and data level navigation; we call it context-based semantics. Given these semantics, we find that for some PP-based SPARQL queries a complete evaluation on the Web is not possible. To study this phenomenon we introduce a notion of Web-safeness of queries, and prove a decidable syntactic property that enables systems to identify queries that are Web-safe. In addition to establishing these formal foundations, we conducted an experimental comparison of the context-based semantics and a reachability-based semantics. Our experiments show that when evaluating a PP-based query under the context-based semantics one experiences a significantly smaller number of dereferencing operations, but the computed query result may contain fewer solutions.

  • 33.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Pérez, Jorge
    Department of Computer Science, Universidad de Chile, Chile.
    An Initial Analysis of Facebook’s GraphQL Language2017In: Proceedings of the 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web. / [ed] Juan Reutter, Divesh Srivastava, 2017, Vol. 1912, article id 11Conference paper (Refereed)
    Abstract [en]

    Facebook’s GraphQL is a recently proposed, and increasingly adopted, conceptual framework for providing a new type of data access interface on the Web. The framework includes a new graph query language whose semantics has been specified informally only. The goal of this paper is to understand the properties of this language. To this end, we first provide a formal query semantics. Thereafter, we analyze the language and show that it has a very low complexity for evaluation. More specifically, we show that the combined complexity of the main decision problems is in NL (Nondeterministic Logarithmic Space) and, thus, they can be solved in polynomial time and are highly parallelizable.

  • 34.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Seneviratne, Oshani
    Rensselaer Polytechnic Institute (RPI), Institute for Data Exploration and Applications (IDEA), Troy NY, USA.
    Proceedings of the Doctoral Consortium at the 21st International Semantic Web Conference (ISWC 2022)2022Conference proceedings (editor) (Refereed)
  • 35.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Tamer Ozsu, M.
    University of Waterloo, Canada.
    Walking Without a Map: Ranking-Based Traversal for Querying Linked Data2016In: SEMANTIC WEB - ISWC 2016, PT I, Springer-Verlag New York, 2016, Vol. 9981, p. 305-324Conference paper (Refereed)
    Abstract [en]

    The traversal-based approach to execute queries over Linked Data on the WWW fetches data by traversing data links and, thus, is able to make use of up-to-date data from initially unknown data sources. While the downside of this approach is the delay before the query engine completes a query execution, user perceived response time may be improved significantly by returning as many elements of the result set as soon as possible. To this end, the query engine requires a traversal strategy that enables the engine to fetch result-relevant data as early as possible. The challenge for such a strategy is that the query engine does not know a priori which of the data sources discovered during the query execution will contain result-relevant data. In this paper, we investigate 14 different approaches to rank traversal steps and achieve a variety of traversal strategies. We experimentally study their impact on response times and compare them to a baseline that resembles a breadth-first traversal. While our experiments show that some of the approaches can achieve noteworthy improvements over the baseline in a significant number of cases, we also observe that for every approach, there is a non-negligible chance to achieve response times that are worse than the baseline.
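    The general idea of ranking traversal steps can be sketched with a priority queue. The sketch below is only an illustration under stated assumptions: the "Web" is a hypothetical dict from URI to outgoing links, and the scoring function is a stand-in for whichever of the ranking approaches is used. A constant score degenerates to discovery order, which resembles the breadth-first baseline.

```python
import heapq
from itertools import count


def traverse(seeds, fetch_links, score):
    """Visit documents in order of descending score; ties keep discovery order."""
    tick = count()  # tie-breaker so equally ranked URIs stay in FIFO order
    queue = [(-score(uri), next(tick), uri) for uri in seeds]
    heapq.heapify(queue)
    seen = set(seeds)
    order = []
    while queue:
        _, _, uri = heapq.heappop(queue)
        order.append(uri)
        for nxt in fetch_links(uri):  # stands in for dereferencing the URI
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(queue, (-score(nxt), next(tick), nxt))
    return order


# Hypothetical link structure and relevance scores, for illustration only.
web = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
relevance = {"a": 0, "b": 1, "c": 5, "d": 9, "e": 2}

baseline = traverse(["a"], web.__getitem__, lambda uri: 0)
ranked = traverse(["a"], web.__getitem__, relevance.__getitem__)
print(baseline)
print(ranked)
```

    Note that the highly scored document "d" is still reached last in the ranked run because it is only discovered via the poorly scored "b", which hints at why a ranking can also perform worse than the baseline.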

  • 36.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. Amazon Web Services.
    Yoshida, Yuichi
    National Institute of Informatics, Tokyo, Japan.
    GRADES-NDA'23: 6th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)2023In: Companion of the 2023 International Conference on Management of Data, SIGMOD/PODS 2023, Association for Computing Machinery (ACM), 2023, p. 307-308Conference paper (Other academic)
    Abstract [en]

    GRADES-NDA is the premier workshop series on graph data management and analytics that aims to bring together researchers from academia, industry, and government. GRADES-NDA'23 is a forum for discussing recent advances in (large-scale) graph data management and analytics systems, as well as proposing and discussing novel methods and techniques for addressing domain-specific challenges or handling noise in real-world graphs. In 2023, GRADES-NDA is in its sixth edition.

  • 37.
    Hartig, Olaf
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. Amazon Web Services.
    Yoshida, Yuichi
    National Institute of Informatics, Japan.
    Proceedings of the 6th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)2023Conference proceedings (editor) (Other academic)
  • 38.
    Hitzler, Pascal
    et al.
    Kansas State University.
    Kirrane, Sabrina
    Vienna University of Economics and Business.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    de Boer, Victor
    Vrije Universiteit Amsterdam.
    Vidal, Maria-Esther
    Leibniz Information Centre for Science and Technology University Library (TIB).
    Maleshkova, Maria
    University of Bonn.
    Schlobach, Stefan
    Vrije Universiteit Amsterdam.
    Hammar, Karl
    Jönköping University.
    Lasierra, Nelia
    F. Hoffmann-La Roche AG.
    Stadtmüller, Steffen
    Robert Bosch GmbH.
    Hose, Katja
    Aalborg University.
    Verborgh, Ruben
    Ghent University.
    The Semantic Web: ESWC 2019 Satellite Events - ESWC 2019 Satellite Events, Portorož, Slovenia, June 2-6, 2019, Revised Selected Papers2019Conference proceedings (editor) (Refereed)
  • 39.
    Keskisärkkä, Robin
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Blomqvist, Eva
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Capturing and Querying Uncertainty in RDF Stream Processing2020In: Knowledge Engineering and Knowledge Management - 22nd International Conference, EKAW 2020, Bolzano, Italy, September 16-20, 2020, Proceedings / [ed] C. Maria Keet and Michel Dumontier, 2020Conference paper (Refereed)
    Abstract [en]

    RDF Stream Processing (RSP) has been proposed as a candidate for bringing together the Complex Event Processing (CEP) paradigm and the Semantic Web standards. In this paper, we investigate the impact of explicitly representing and processing uncertainty in RSP for the use in CEP. Additionally, we provide a representation for capturing the relevant notions of uncertainty in the RSP-QL* data model and describe query functions that can operate on this representation. The impact evaluation is based on a use case within electronic healthcare, where we compare the query execution overhead of different uncertainty options in a prototype implementation. The experiments show that the influence on query execution performance varies greatly, but that uncertainty can have noticeable impact on query execution performance. On the other hand, the overhead grows linearly with respect to the stream rate for all uncertainty options in the evaluation, and the observed performance is sufficient for many use cases. Extending the representation and operations to support more uncertainty options and investigating different query optimization strategies to reduce the impact on execution performance remain important areas for future research.

  • 40.
    Keskisärkkä, Robin
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Blomqvist, Eva
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Optimizing RDF Stream Processing for Uncertainty Management2021In: Further with Knowledge Graph, IOS Press, 2021, Vol. 53, p. 118-132Conference paper (Refereed)
    Abstract [en]

    RDF Stream Processing (RSP) has been proposed as a way of bridging the gap between the Complex Event Processing (CEP) paradigm and the Semantic Web standards. Uncertainty has been recognized as a critical aspect in CEP, but it has received little attention within the context of RSP. In this paper, we investigate the impact of different RSP optimization strategies for uncertainty management. The paper describes (1) an extension of the RSP-QL* data model to capture bind expressions, filter expressions, and uncertainty functions; (2) optimization techniques related to lazy variables and caching of uncertainty functions, and a heuristic for reordering uncertainty filters in query plans; and (3) an evaluation of these strategies in a prototype implementation. The results show that using a lazy variable mechanism for uncertainty functions can improve query execution performance by orders of magnitude while introducing negligible overhead. The results also show that caching uncertainty function results can improve performance under most conditions, but that maintaining this cache can potentially add overhead to the overall query execution process. Finally, the effect of the proposed heuristic on query execution performance was shown to depend on multiple factors, including the selectivity of uncertainty filters, the size of intermediate results, and the cost associated with the evaluation of the uncertainty functions.

  • 41.
    Keskisärkkä, Robin
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Blomqvist, Eva
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Lind, Leili
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    RSP-QL*: Enabling Statement-Level Annotations in RDF Streams2019In: Semantic Systems. The Power of AI and Knowledge Graphs - 15th International Conference, SEMANTiCS 2019, Karlsruhe, Germany, September 9-12, 2019, Proceedings, Germany, 2019, p. -55Conference paper (Refereed)
    Abstract [en]

    RSP-QL was developed by the W3C RDF Stream Processing (RSP) community group as a common way to express and query RDF streams. However, RSP-QL does not provide any way of annotating data on the statement level, for example, to express the uncertainty that is often associated with streaming information. Instead, the only way to provide such information has been to use RDF reification, which adds additional complexity to query processing, and is syntactically verbose. In this paper, we define an extension of RSP-QL, called RSP-QL*, that provides an intuitive way for supporting statement-level annotations in RSP. The approach leverages the concepts previously described for RDF* and SPARQL*. We illustrate the proposed approach based on a scenario from a research project in e-health. An open-source implementation of the proposal is provided and compared to the baseline approach of using RDF reification. The results show that this way of dealing with statement-level annotations offers advantages with respect to both data transfer bandwidth and query execution performance.

  • 42.
    Khayatbashi, Shahrzad
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Jalali, Amin
    Stockholm University.
    Transforming Event Knowledge Graph to Object-Centric Event Logs: A Comparative Study for Multi-dimensional Process Analysis2023Conference paper (Refereed)
    Abstract [en]

    Process mining has significantly transformed business process management by introducing innovative data-based analysis techniques and empowering organizations to unveil hidden insights previously buried within their recorded data. The analysis is conducted on event logs structured by conceptual models. Traditional models were defined based on only a single case notion, e.g., order or item in the purchase process. This limitation hinders the application of process mining in practice, for which new data models have been developed, namely the Event Knowledge Graph (EKG) and the Object-Centric Event Log (OCEL). While several tools have been developed for OCEL, there is a lack of process mining tooling around the EKG. In addition, there is a lack of comparison of the practical implications of choosing one approach over another. To fill this gap, the contribution of this paper is threefold. First, it defines and implements an algorithm to transform event logs represented as EKG to OCEL. The implementation is used to transform 5 real event logs based on which the approach is evaluated. Second, it compares the performance of analyzing event logs represented in these two models. Third, it compares and reveals similarities and differences in analyzing processes based on event logs represented in these two models. The results highlight ten important findings, including different approaches in calculating directly-follows relations when analyzing filtered event logs in these models and the limitations of OCEL in supporting event lifecycle and inter-log relation analysis.

  • 43.
    Khayatbashi, Shahrzad
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Sebastian, Ferrada
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Converting Property Graphs to RDF: A Preliminary Study of the Practical Impact of Different Mappings2022In: GRADES-NDA '22: Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), Association for Computing Machinery (ACM), 2022Conference paper (Refereed)
    Abstract [en]

    Today's space of graph database solutions is characterized by two main technology stacks that have evolved separate from one another: on one hand, there are systems that focus on supporting the RDF family of standards; on the other hand, there is the Property Graph category of systems. As a basis for bringing these stacks together and, in particular, to facilitate data exchange between the different types of systems, different direct mappings between the underlying graph data models have been introduced in the literature. While fundamental properties are well-documented for most of these mappings, the same cannot be said about the practical implications of choosing one mapping over another. Our research aims to contribute towards closing this gap. In this paper we report on a preliminary study for which we have selected two direct mappings from (Labeled) Property Graphs to RDF, where one of them uses features of the RDF-star extension to RDF. We compare these mappings in terms of the query performance achieved by two popular commercial RDF stores, GraphDB and Stardog, in which the converted data is imported. While we find that, for both of these systems, none of the mappings is a clear winner in terms of guaranteeing better query performance, we also identify types of queries that are problematic for the systems when using one mapping but not the other.

  • 44.
    Kim, Yun Wan
    et al.
    University of Toronto.
    Consens, Mariano P.
    University of Toronto.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    An Empirical Analysis of GraphQL API Schemas in Open Code Repositories and Package Registries2019In: Proceedings of the 13th Alberto Mendelzon International Workshop on Foundations of Data Management, Asunción, Paraguay, June 3-7, 2019, 2019Conference paper (Refereed)
    Abstract [en]

    GraphQL is a query language for APIs that has been increasingly adopted by web developers since its specification was open sourced in 2015. The GraphQL framework lets API clients tailor data requests by using queries that return JSON objects described using GraphQL Schema. We present initial results of an exploratory empirical study with the goal of characterizing GraphQL Schemas in open code repositories and package registries. Our first approach identifies over 20 thousand GraphQL-related projects in publicly accessible repositories hosted by GitHub. Our second, and complementary, approach uses package registries to select 30 GraphQL “reference” packages (the ones with the highest dependency counts), and then finds their 90 thousand dependent packages (and the related repositories in GitHub, GitLab, and Bitbucket). In addition, over 2 thousand schema files were loaded into the GraphQL.js reference implementation to conduct a detailed analysis of the schema information. Our study provides insights into the usage of different schema constructs, the number of distinct types and the most popular types in schemas, as well as the presence of cycles in schemas.

  • 45.
    Knuth, Magnus
    et al.
    Hasso Plattner Institute, University of Potsdam, Potsdam, Germany.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Scheduling Refresh Queries for Keeping Results from a SPARQL Endpoint Up-to-Date. 2016. In: On the Move to Meaningful Internet Systems: OTM 2016 Conferences, 2016, Vol. 10033, p. 780-791. Conference paper (Refereed)
    Abstract [en]

    Many datasets change over time. As a consequence, long-running applications that cache and repeatedly use query results obtained from a SPARQL endpoint may resubmit the queries regularly to ensure up-to-dateness of the results. While this approach may be feasible if the number of such regular refresh queries is manageable, with an increasing number of applications adopting this approach, the SPARQL endpoint may become overloaded with such refresh queries. A more scalable approach would be to use a middle-ware component at which the applications register their queries and get notified with updated query results once the results have changed. Then, this middle-ware can schedule the repeated execution of the refresh queries without overloading the endpoint. In this paper, we study the problem of scheduling refresh queries for a large number of registered queries by assuming an overload-avoiding upper bound on the length of a regular time slot available for testing refresh queries. We investigate a variety of scheduling strategies and compare them experimentally in terms of time slots needed before they recognize changes and number of changes that they miss.

  • 46.
    Lambrix, Patrick
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University, Sweden; Department of Building Engineering, Energy Systems and Sustainability Science, University of Gävle, Sweden.
    Armiento, Rickard
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Physics. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    Li, Huanyu
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Abd Nikooie Pour, Mina
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    Li, Ying
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    The materials design ontology. 2024. In: Semantic Web, ISSN 1570-0844, E-ISSN 2210-4968, Vol. 15, no 2, p. 481-515. Article in journal (Refereed)
    Abstract [en]

    In the materials design domain, much of the data from materials calculations is stored in different heterogeneous databases with different data and access models. Therefore, accessing and integrating data from different sources is challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired by and guided by the OPTIMADE effort that aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, first, we describe the development and the content of the Materials Design Ontology. Then, we use a topic model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology by a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology.

  • 47.
    Lassila, Ora
    et al.
    Amazon Web Serv, WA 98109 USA.
    Schmidt, Michael
    Amazon Web Serv, WA 98109 USA.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. Amazon Web Serv, WA 98109 USA.
    Bebee, Brad
    Amazon Web Serv, WA 98109 USA.
    Bechberger, Dave
    Amazon Web Serv, WA 98109 USA.
    Broekema, Willem
    Amazon Web Serv, WA 98109 USA.
    Khandelwal, Ankesh
    Amazon Web Serv, WA 98109 USA.
    Lawrence, Kelvin
    Amazon Web Serv, WA 98109 USA.
    Enriquez, Carlos Manuel Lopez
    Amazon Web Serv, WA 98109 USA.
    Sharda, Ronak
    Amazon Web Serv, WA 98109 USA.
    Thompson, Bryan
    Amazon Web Serv, WA 98109 USA.
    The OneGraph vision: Challenges of breaking the graph model lock-in. 2023. In: Semantic Web, ISSN 1570-0844, E-ISSN 2210-4968, Vol. 14, no 1, p. 125-134. Article in journal (Refereed)
    Abstract [en]

    Amazon Neptune is a graph database service that supports two graph models: W3C's Resource Description Framework (RDF) and Labeled Property Graphs (LPG). Customers choose one or the other model. This choice determines which data modeling features can be used and - perhaps more importantly - which query languages are available. The choice between the two technology stacks is difficult and time-consuming. It requires consideration of data modeling aspects, query language features, their adequacy for current and future use cases, as well as developer knowledge. Even in cases where customers evaluate the pros and cons and make a conscious choice that fits their use case, over time we often see requirements from new use cases emerge that could be addressed more easily with a different data model or query language. It is therefore highly desirable that the choice of the query language can be made without consideration of what graph model is chosen and can be easily revised or complemented at a later point. To this end, we advocate and explore the idea of OneGraph ("1G" for short), a single, unified graph data model that embraces both RDF and LPGs. The goal of 1G is to achieve interoperability at both data level, by supporting the co-existence of RDF and LPG in the same database, as well as query level, by enabling queries and updates over the unified data model with a query language of choice. In this paper, we sketch our vision and investigate technical challenges towards a unification of the two graph data models.

  • 48.
    Li, Huanyu
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Armiento, Rickard
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Physics. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    Lambrix, Patrick
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University.
    OBG-gen: Ontology-Based GraphQL Server Generation for Data Integration. 2023. In: Proceedings of the ISWC 2023 Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice: co-located with 22nd International Semantic Web Conference (ISWC 2023) / [ed] Irini Fundulaki, Kouji Kozaki, Daniel Garijo, Jose Manuel Gomez-Perez, 2023. Conference paper (Refereed)
    Abstract [en]

    A GraphQL server contains two building blocks: (1) a GraphQL schema defining the types of data objects that can be requested; (2) resolver functions fetching the relevant data from underlying data sources. GraphQL can be used for data integration if the GraphQL schema provides an integrated view of data from multiple data sources, and the resolver functions are implemented accordingly. However, there does not exist a semantics-aware approach to use GraphQL for data integration. We proposed a framework using GraphQL for data integration in which a global domain ontology informs the generation of a GraphQL server. Furthermore, we implemented a prototype of this framework, OBG-gen. In this paper, we demonstrate OBG-gen in a real-world data integration scenario in the materials design domain and in a synthetic benchmark scenario.

  • 49.
    Li, Huanyu
    et al.
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University, Sweden.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Armiento, Rickard
    Linköping University, Department of Physics, Chemistry and Biology, Theoretical Physics. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University, Sweden.
    Lambrix, Patrick
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering. The Swedish e-Science Research Centre, Linköping University, Sweden; Department of Building Engineering, Energy Systems and Sustainability Science, University of Gävle, Sweden.
    Ontology-based GraphQL server generation for data access and data integration. 2024. In: Semantic Web, ISSN 1570-0844, E-ISSN 2210-4968, Vol. 15, no 5, p. 1639-1675. Article in journal (Refereed)
    Abstract [en]

    In a GraphQL Web API, a so-called GraphQL schema defines the types of data objects that can be queried, and so-called resolver functions are responsible for fetching the relevant data from underlying data sources. Thus, we can expect to use GraphQL not only for data access but also for data integration, if the GraphQL schema reflects the semantics of data from multiple data sources, and the resolver functions can obtain data from these data sources and structure the data according to the schema. However, there does not exist a semantics-aware approach to employ GraphQL for data integration. Furthermore, there are no formal methods for defining a GraphQL API based on an ontology. In this work, we introduce a framework for using GraphQL in which a global domain ontology informs the generation of a GraphQL server that answers requests by querying heterogeneous data sources. The core of this framework consists of an algorithm to generate a GraphQL schema based on an ontology and a generic resolver function based on semantic mappings. We provide a prototype, OBG-gen, of this framework, and we evaluate our approach over a real-world data integration scenario in the materials design domain and two synthetic benchmark scenarios (Linköping GraphQL Benchmark and GTFS-Madrid-Bench). The experimental results of our evaluation indicate that: (i) our approach is feasible to generate GraphQL servers for data access and integration over heterogeneous data sources, thus avoiding a manual construction of GraphQL servers, and (ii) our data access and integration approach is general and applicable to different domains where data is shared or queried in different ways.

  • 50.
    Potocki, Alexander
    et al.
    University of Leipzig.
    Saleem, Muhammad
    University of Leipzig.
    Soru, Tommaso
    University of Leipzig.
    Hartig, Olaf
    Linköping University, Department of Computer and Information Science, Database and information techniques. Linköping University, Faculty of Science & Engineering.
    Voigt, Martin
    Ontos GmbH.
    Ngomo, Axel-Cyrille Ngonga
    University of Paderborn.
    Federated SPARQL Query Processing Via CostFed. 2017. In: Proceedings of the ISWC 2017 Posters & Demonstrations and Industry Tracks co-located with 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 23rd to 25th, 2017, 2017. Conference paper (Refereed)
    Abstract [en]

    The runtime optimization of federated SPARQL query engines is of central importance to ensure the usability of the Web of Data in real-world applications. The efficient selection of sources (SPARQL endpoints in our case) as well as the generation of optimized query plans belong to the most important optimization steps in this respect. This paper presents CostFed, an index-assisted federation engine for federated SPARQL query processing over multiple SPARQL endpoints. CostFed makes use of statistical information collected from endpoints to perform efficient source selection and cost-based query planning. In contrast to the state of the art, it relies on a non-linear model for the estimation of the selectivity of joins. As a result, it is able to generate better plans than the state-of-the-art federation engines. In an experimental evaluation based on the FedBench benchmark, we show that CostFed is 3 to 121 times faster than the state-of-the-art SPARQL endpoint federation engines.
