liu.se: Search for publications in DiVA
1 - 12 of 12
  • 1.
    Birath, Bjorn
    et al.
    Linköping University, Department of Computer and Information Science. Linköping University, Faculty of Science & Engineering.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Tinnerholm, John
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    High-Level Programming of FPGA-Accelerated Systems with Parallel Patterns
    2024. In: International journal of parallel programming, ISSN 0885-7458, E-ISSN 1573-7640
    Article in journal (Refereed)
    Abstract [en]

    As a result of frequency and power limitations, multi-core processors and accelerators are becoming more and more prevalent in today's systems. To fully utilize such systems, heterogeneous parallel programming is needed, but this introduces new complexities to the development. High-level frameworks such as SkePU have been introduced to help alleviate these complexities. SkePU is a skeleton programming framework based on a set of programming constructs implementing computational parallel patterns, while presenting a sequential interface to the programmer. Using the various skeleton backends, SkePU programs can execute, without source code modification, on multiple types of hardware such as CPUs, GPUs, and clusters. This paper presents the design and implementation of a new backend for SkePU, adding support for FPGAs. We also evaluate the effect of FPGA-specific optimizations in the new backend and compare it with the existing GPU backend, where the actual devices used are of similar vintage and price point. For simple examples, we find that the FPGA-backend's performance is similar to that of the existing backend for GPUs, while it falls behind in more complex tasks. Finally, some shortcomings in the backend are highlighted and discussed, along with potential solutions.

  • 2.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Griebler, Dalvan
    Pontif Catholic Univ Rio Grande do Sul PUCRS, Brazil.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Assessing Application Efficiency and Performance Portability in Single-Source Programming for Heterogeneous Parallel Systems
    2023. In: International journal of parallel programming, ISSN 0885-7458, E-ISSN 1573-7640, Vol. 51, p. 61-82
    Article in journal (Refereed)
    Abstract [en]

    We analyze the performance portability of the skeleton-based, single-source multi-backend high-level programming framework SkePU across multiple different CPU-GPU heterogeneous systems. Thereby, we provide a systematic application efficiency characterization of SkePU-generated code in comparison to equivalent hand-written code in more low-level parallel programming models such as OpenMP and CUDA. For this purpose, we contribute ports of the STREAM benchmark suite and of a part of the NAS Parallel Benchmark suite to SkePU. We show that for STREAM and the EP benchmark, SkePU regularly scores efficiency values above 80% and in particular for CPU systems, SkePU can outperform hand-written code.
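The efficiency figures quoted in this abstract can be read as ratios of achieved performance to a reference. The following is a minimal sketch of that arithmetic, assuming application efficiency is the ratio of the high-level code's performance to that of hand-written code, and assuming the harmonic-mean aggregation that is commonly used as a performance portability metric. The abstract does not state which formula the paper uses, and the function names are hypothetical.

```python
def application_efficiency(achieved, best_known):
    """Ratio of achieved performance to reference (hand-written) performance.

    Values above 1.0 mean the high-level code outperformed the reference.
    """
    return achieved / best_known


def performance_portability(efficiencies):
    """Harmonic mean of per-platform efficiencies (assumed metric).

    Returns 0.0 if the list is empty or any platform has zero efficiency,
    i.e. the application does not run there.
    """
    if not efficiencies or any(e <= 0 for e in efficiencies):
        return 0.0
    return len(efficiencies) / sum(1.0 / e for e in efficiencies)


# Hypothetical measurements, e.g. bandwidth in GB/s on one platform.
eff = application_efficiency(achieved=80.0, best_known=100.0)
```

The harmonic mean penalizes a single poorly performing platform more strongly than an arithmetic mean would, which is why it is a common choice for this kind of aggregate.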

  • 3.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Vandenbergen, Nicolas
    Julich Supercomp Ctr, Germany.
    Keller, Jörg
    Fernuniv, Germany.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    A Deterministic Portable Parallel Pseudo-Random Number Generator for Pattern-Based Programming of Heterogeneous Parallel Systems
    2022. In: International journal of parallel programming, ISSN 0885-7458, E-ISSN 1573-7640, Vol. 50, p. 319-340
    Article in journal (Refereed)
    Abstract [en]

    SkePU is a pattern-based high-level programming model for transparent program execution on heterogeneous parallel computing systems. A key feature of SkePU is that, in general, the selection of the execution platform for a skeleton-based function call need not be determined statically. On single-node systems, SkePU can select among CPU, multithreaded CPU, single or multi-GPU execution. Many scientific applications use pseudo-random number generators (PRNGs) as part of the computation. In the interest of correctness and debugging, deterministic parallel execution is a desirable property, which however requires a deterministically parallelized pseudo-random number generator. We present the API and implementation of a deterministic, portable parallel PRNG extension to SkePU that is scalable by design and exhibits the same behavior regardless of where and with how many resources it is executed. We evaluate it with four probabilistic applications and show that the PRNG enables scalability on both multi-core CPU and GPU resources, and hence supports the universal portability of SkePU code even in the presence of PRNG calls, while source code complexity is reduced.
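The property described in this abstract, identical random output regardless of how many workers execute the computation, can be illustrated with a counter-based generator in which each value depends only on the seed and the element index. This is a minimal sketch under that assumption; it is not the SkePU API or the paper's actual generator, and the names `random_at` and `generate` are hypothetical. The mixing function is the well-known SplitMix64 finalizer.

```python
MASK64 = (1 << 64) - 1


def splitmix64(x):
    """SplitMix64 mixing step on a 64-bit integer."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    z = x
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def random_at(seed, index):
    """Uniform float in [0, 1) that depends only on (seed, index)."""
    bits = splitmix64((seed ^ splitmix64(index)) & MASK64)
    return (bits >> 11) / float(1 << 53)


def generate(seed, n, workers):
    """Fill n values, partitioned into contiguous chunks across workers.

    The outer loop stands in for parallel workers; because each value is a
    pure function of (seed, index), the partitioning cannot change the result.
    """
    out = [0.0] * n
    chunk = (n + workers - 1) // workers
    for w in range(workers):
        for i in range(w * chunk, min(n, (w + 1) * chunk)):
            out[i] = random_at(seed, i)
    return out
```

Running `generate` with one worker or with seven yields the same list, which is the kind of determinism that simplifies debugging of probabilistic applications.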

  • 4.
    Papadopoulos, Lazaros
    et al.
    Natl Tech Univ Athens, Greece.
    Soudris, Dimitrios
    Natl Tech Univ Athens, Greece.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Ahlqvist, Johan
    Linköping University, Department of Computer and Information Science. Linköping University, Faculty of Science & Engineering.
    Vasilas, Nikos
    Ctr Res & Technol Hellas, Greece.
    Papadopoulos, Athanasios I
    Ctr Res & Technol Hellas, Greece.
    Seferlis, Panos
    Ctr Res & Technol Hellas, Greece.
    Prouveur, Charles
    CEA, France.
    Haefele, Matthieu
    Univ Pau & Pays Adour, France.
    Thibault, Samuel
    Bordeaux Univ, France.
    Salamanis, Athanasios
    Ctr Res & Technol Hellas, Greece.
    Ioakimidis, Theodoros
    Ctr Res & Technol Hellas, Greece.
    Kehagias, Dionysios
    Ctr Res & Technol Hellas, Greece.
    EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems
    2022. In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 33, no 4, p. 792-804
    Article in journal (Refereed)
    Abstract [en]

    Programming upcoming exascale computing systems is expected to be a major challenge. New programming models are required to improve programmability, by hiding the complexity of these systems from application developers. The EXA2PRO programming framework aims at improving developers' productivity for applications that target heterogeneous computing systems. It is based on advanced programming models and abstractions that encapsulate low-level platform-specific optimizations and it is supported by a runtime that handles application deployment on heterogeneous nodes. It supports a wide variety of platforms and accelerators (CPU, GPU, FPGA-based Data-Flow Engines), allowing developers to efficiently exploit heterogeneous computing systems, thus enabling more HPC applications to reach exascale computing. The EXA2PRO framework was evaluated using four HPC applications from different domains. By applying the EXA2PRO framework, the applications were automatically deployed and evaluated on a variety of computing architectures, enabling developers to obtain performance results on accelerators, test scalability on MPI clusters and productively investigate the degree to which each application can efficiently use different types of hardware resources.

  • 5.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Pattern-based Programming Abstractions for Heterogeneous Parallel Computing
    2022. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Contemporary computer architectures utilize wide multi-core processors, accelerators such as GPUs, and clustering of individual computers into complex large-scale systems. These hardware trends are prevalent across computers of all sizes, from the largest supercomputers down to the smallest mobile phones. While these innovations provide high peak computing performance, software developers find it increasingly difficult to effectively target all the processing resources without expert knowledge in parallelization, heterogeneous computing, communication, synchronization, and so on. To ensure that software can keep up with the development of hardware architectures, advanced high-level programming environments and frameworks are needed to bridge the programmability gap. In addition, as the industry is trending towards increased vertical integration of software development stacks, vendor lock-in presents a risk of coupling software projects to proprietary technologies. Combined with problems of technical debt in large-scale software systems, it is clear that portability and open source are desirable properties of high-level parallel programming environments. One example of a programming framework fulfilling the above criteria is SkePU, a framework for high-level data-parallel pattern programming consisting of a compiler toolchain, programming interface, and run-time system.

    The work presented in this thesis proposes a design of the pattern-centric skeleton programming model of the SkePU framework based on modern C++ with variadic template metaprogramming and state-of-the-art compiler technology. The design enables further flexibility, expressivity, and portability and gives rise to several new performance optimization techniques. The focus lies on a strong set of programming abstractions: providing new and extended patterns, improving the data access locality of existing ones, and using both static and dynamic knowledge about program flow. The work combines novel programming interfaces and implementations with practical evaluation on synthetic and real-world applications. Several contributions are results from international collaborations in application-framework co-design: a single-source parallelization approach of skeleton programs on heterogeneous clusters, an extension mechanism for inserting platform-optimized code variants in high-level skeleton programs, and an integrated abstraction for portable parallel deterministic random number generation. The work places a strong emphasis on programmability aspects to make heterogeneous parallel computing accessible to non-experts, while also providing sufficient performance and interface familiarity for the high-performance computing community.

  • 6.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Ahlqvist, Johan
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Zouzoula, Stavroula
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    SkePU 3: Portable High-Level Programming of Heterogeneous Systems and HPC Clusters
    2021. In: International journal of parallel programming, ISSN 0885-7458, E-ISSN 1573-7640, Vol. 49, no 6, p. 846-866
    Article in journal (Refereed)
    Abstract [en]

    We present the third generation of the C++-based open-source skeleton programming framework SkePU. Its main new features include new skeletons, new data container types, support for returning multiple objects from skeleton instances and user functions, support for specifying alternative platform-specific user functions to exploit e.g. custom SIMD instructions, generalized scheduling variants for the multicore CPU backends, and a new cluster backend targeting the custom MPI interface provided by the StarPU task-based runtime system. We have also revised the smart data containers' memory consistency model for automatic data sharing between main and device memory. The new features are the result of a two-year co-design effort collecting feedback from HPC application partners in the EU H2020 project EXA2PRO, and target especially the HPC application domain and HPC platforms. We evaluate the performance effects of the new features on high-end multicore CPU and GPU systems and on HPC clusters.

  • 7.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Designing a Modern Skeleton Programming Framework for Parallel and Heterogeneous Systems
    2020. Licentiate thesis, monograph (Other academic)
    Abstract [en]

    Today's society is increasingly software-driven and dependent on powerful computer technology. Therefore it is important that advancements in the low-level processor hardware are made available for exploitation by a growing number of programmers of differing skill level. However, as we are approaching the end of Moore's law, hardware designers are finding new and increasingly complex ways to increase the accessible processor performance. It is getting more and more difficult to effectively target these processing resources without expert knowledge in parallelization, heterogeneous computation, communication, synchronization, and so on. To ensure that the software side can keep up, advanced programming environments and frameworks are needed to bridge the widening gap between hardware and software. One such example is the pattern-centric skeleton programming model and in particular the SkePU project. The work presented in this thesis first redesigns the SkePU framework based on modern C++ variadic template metaprogramming and state-of-the-art compiler technology. It then explores new ways to improve performance: by providing new patterns, improving the data access locality of existing ones, and using both static and dynamic knowledge about program flow. The work combines novel ideas with practical evaluation of the approach on several applications. The advancements also include the first skeleton API that allows variadic skeletons, new data containers, and finally an approach to make skeleton programming more customizable without compromising universal portability.

  • 8.
    Öhberg, Tomas
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Hybrid CPU-GPU execution support in the skeleton programming framework SkePU
    2020. In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 76, no 7, p. 5038-5056
    Article in journal (Refereed)
    Abstract [en]

    In this paper, we present a hybrid execution backend for the skeleton programming framework SkePU. The backend is capable of automatically dividing the workload and simultaneously executing the computation on a multi-core CPU and any number of accelerators, such as GPUs. We show how to efficiently partition the workload of skeletons such as Map, MapReduce, and Scan to allow hybrid execution on heterogeneous computer systems. We also show a unified way of predicting how the workload should be partitioned based on performance modeling. With experiments on typical skeleton instances, we show the speedup for all skeletons when using the new hybrid backend. We also evaluate the performance on some real-world applications. Finally, we show that the new implementation gives higher and more reliable performance compared to an old hybrid execution implementation based on dynamic scheduling.
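Partitioning a workload from a performance model, as this abstract describes, can be sketched with a simple balance calculation. The example below assumes a linear cost model per device, time = startup + per_element * n, and picks the split at which CPU and GPU finish at the same time. The linear model, the function name `split_workload`, and the parameter values are assumptions for illustration; the abstract does not specify the model the paper uses.

```python
def split_workload(n, cpu_model, gpu_model):
    """Split n elements between CPU and GPU so both finish together.

    Each model is a (startup_cost, cost_per_element) pair for an assumed
    linear cost function. Solving
        a_c + b_c * x = a_g + b_g * (n - x)
    for the CPU share x gives the balanced split, clamped to [0, n].
    """
    a_c, b_c = cpu_model
    a_g, b_g = gpu_model
    x = (a_g - a_c + b_g * n) / (b_c + b_g)
    cpu_share = int(round(min(max(x, 0.0), float(n))))
    return cpu_share, n - cpu_share


# Hypothetical costs: the GPU is 4x faster per element, no startup cost.
cpu_elems, gpu_elems = split_workload(100, (0.0, 4.0), (0.0, 1.0))
```

With a large GPU startup cost and a small input, the balanced point falls outside the valid range and the clamp assigns all work to the CPU, which matches the intuition that offloading tiny workloads does not pay off.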

  • 9.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Multi-Variant User Functions for Platform-Aware Skeleton Programming
    2020. In: Parallel Computing: Technology Trends, IOS Press, 2020, Vol. 36, p. 475-484
    Conference paper (Refereed)
    Abstract [en]

    Today's computer architectures are increasingly specialized and heterogeneous configurations of computational units are common. To provide efficient programming of these systems while still achieving good performance, including performance portability across platforms, high-level parallel programming libraries and tool-chains are used, such as the skeleton programming framework SkePU. SkePU works on heterogeneous systems by automatically generating program components, "user functions", for multiple different execution units in the system, such as CPU and GPU, from a high-level C++ program. This work extends this multi-backend approach by allowing the programmer to provide additional variants of these user functions tailored for different scenarios, such as platform constraints. This paper introduces the overall approach of multi-variant user functions, provides several use cases including explicit SIMD vectorization for supported hardware, and evaluates the result of the optimizations that can be achieved using this extension.

  • 10.
    Panagiotou, Sotirios
    et al.
    Natl Tech Univ Athens, Greece.
    Ernstsson, August
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Ahlqvist, Johan
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Papadopoulos, Lazaros
    Natl Tech Univ Athens, Greece.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Soudris, Dimitrios
    Natl Tech Univ Athens, Greece.
    Portable exploitation of parallel and heterogeneous HPC architectures in neural simulation using SkePU
    2020. In: Proceedings of the 23rd International Workshop on Software and Compilers for Embedded Systems (SCOPES 2020), Assoc Computing Machinery, 2020, p. 74-77
    Conference paper (Refereed)
    Abstract [en]

    The complexity of modern HPC systems requires the use of new tools that support advanced programming models and offer portability and programmability of parallel and heterogeneous architectures. In this work we evaluate the use of the SkePU framework in an HPC application from the neural computing domain. We demonstrate the successful deployment of the application based on SkePU using multiple back-ends (OpenMP, OpenCL and MPI) and present lessons learned towards future extensions of the SkePU framework.

  • 11.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Extending smart containers for data locality-aware skeleton programming
    2019. In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 31, no 5, article id e5003
    Article in journal (Refereed)
    Abstract [en]

    We present an extension for the SkePU skeleton programming framework to improve the performance of sequences of transformations on smart containers. By using lazy evaluation, SkePU records skeleton invocations and dependencies as directed by smart container operands. When a partial result is required by a different part of the program, the run-time system will process the entire lineage of skeleton invocations; tiling is applied to keep chunks of container data in the working set for the whole sequence of transformations. The approach is inspired by big data frameworks operating on large clusters where good data locality is crucial. We also consider benefits other than data locality with the increased run-time information given by the lineage structures, such as backend selection for heterogeneous systems. Experimental evaluation of example applications shows potential for performance improvements due to better cache utilization, as long as the overhead of lineage construction and management is kept low.
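The combination of lazy evaluation, lineage recording, and tiling described in this abstract can be sketched in a few lines. The example below is an illustration only, not SkePU's smart container implementation: the class `LazyVector` and its methods are hypothetical, and it handles only element-wise map operations. Each `map` call records a function instead of running it; `evaluate` then applies the whole recorded chain to one tile at a time, so a tile stays in the working set across all transformations.

```python
class LazyVector:
    """Container that records element-wise transformations lazily."""

    def __init__(self, data, lineage=()):
        self._data = data                # underlying sequence, never modified
        self._lineage = tuple(lineage)   # recorded functions, in call order

    def map(self, fn):
        """Record fn without executing it; returns a new lazy container."""
        return LazyVector(self._data, self._lineage + (fn,))

    def evaluate(self, tile_size=4):
        """Run the full lineage tile by tile and return the result list."""
        out = []
        for start in range(0, len(self._data), tile_size):
            tile = list(self._data[start:start + tile_size])
            # Apply every recorded transformation before moving to the next
            # tile, rather than sweeping the whole container once per step.
            for fn in self._lineage:
                tile = [fn(x) for x in tile]
            out.extend(tile)
        return out


result = (LazyVector(range(10))
          .map(lambda x: x + 1)
          .map(lambda x: x * 2)
          .evaluate(tile_size=3))
```

The result is the same as applying each transformation to the full container in turn; the difference is the traversal order, which is where the cache benefit comes from when tiles fit in cache.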

  • 12.
    Ernstsson, August
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Li, Lu
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    Kessler, Christoph
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
    SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems
    2018. In: International journal of parallel programming, ISSN 0885-7458, E-ISSN 1573-7640, Vol. 46, no 1, p. 62-80
    Article in journal (Refereed)
    Abstract [en]

    In this article we present SkePU 2, the next generation of the SkePU C++ skeleton programming framework for heterogeneous parallel systems. We critically examine the design and limitations of the SkePU 1 programming interface. We present a new, flexible and type-safe, interface for skeleton programming in SkePU 2, and a source-to-source transformation tool which knows about SkePU 2 constructs such as skeletons and user functions. We demonstrate how the source-to-source compiler transforms programs to enable efficient execution on parallel heterogeneous systems. We show how SkePU 2 enables new use-cases and applications by increasing the flexibility from SkePU 1, and how programming errors can be caught earlier and easier thanks to improved type safety. We propose a new skeleton, Call, unique in the sense that it does not impose any predefined skeleton structure and can encapsulate arbitrary user-defined multi-backend computations. We also discuss how the source-to-source compiler can enable a new optimization opportunity by selecting among multiple user function specializations when building a parallel program. Finally, we show that the performance of our prototype SkePU 2 implementation closely matches that of SkePU 1.
