VectorPU: A Generic and Efficient Data-container and Component Model for Transparent Data Transfer on GPU-based Heterogeneous Systems
2017 (English)
In: Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM'17), Association for Computing Machinery (ACM), 2017, p. 7-12
Conference paper, Published paper (Refereed)
Abstract [en]
We present VectorPU, a C++-based programming framework providing high-level and efficient unified memory access on heterogeneous systems, in particular GPU-based systems. VectorPU consists of a light-weight runtime library providing a generic, "smart" data-container abstraction for transparent software caching of array operands with programmable memory coherence, and a light-weight component model realized by macro-based data access annotations. VectorPU thereby enables a flexible unified memory view with data transfer and device memory management abstracted away from programmers, while keeping the efficiency of expert-written code with manual data movement and memory management. We provide a prototype of VectorPU for (CUDA) GPU-based systems and show, through experiments on several machines ranging from laptops to supercomputer nodes with Kepler and Maxwell GPUs, that it can achieve 1.40x to 13.29x speedup over good-quality code using Nvidia's Unified Memory. We also show the expressiveness and wide applicability of VectorPU, and its low overhead and equal efficiency compared to expert-written code.
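The software-caching idea in the abstract can be illustrated with a minimal C++ sketch. This is not VectorPU's actual API: the class `CachedVector`, its access methods, and the transfer counter are hypothetical names chosen for illustration, and the "device" copy is simulated by a second host-side `std::vector` so that the example compiles without CUDA. The sketch assumes a simple two-copy coherence scheme in which each copy carries a validity flag and data is copied only when a stale copy is about to be accessed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; not VectorPU's real interface.
// The "device" copy is an ordinary std::vector standing in for GPU memory.
template <typename T>
class CachedVector {
public:
    explicit CachedVector(std::size_t n)
        : host_(n), device_(n), host_valid_(true), device_valid_(false) {}

    // Host read access: copy back from the device only if the host copy is stale.
    const std::vector<T>& host_read() {
        sync_to_host();
        return host_;
    }

    // Host read-write access: afterwards the device copy is considered stale.
    std::vector<T>& host_write() {
        sync_to_host();
        device_valid_ = false;
        return host_;
    }

    // Device read access: upload only if the device copy is stale.
    const std::vector<T>& device_read() {
        sync_to_device();
        return device_;
    }

    // Device read-write access: afterwards the host copy is considered stale.
    std::vector<T>& device_write() {
        sync_to_device();
        host_valid_ = false;
        return device_;
    }

    // Number of simulated host<->device copies performed so far.
    std::size_t transfers() const { return transfers_; }

private:
    void sync_to_host() {
        if (!host_valid_) { host_ = device_; host_valid_ = true; ++transfers_; }
    }
    void sync_to_device() {
        if (!device_valid_) { device_ = host_; device_valid_ = true; ++transfers_; }
    }

    std::vector<T> host_, device_;
    bool host_valid_, device_valid_;
    std::size_t transfers_ = 0;
};

// Stand-in for a GPU kernel: scales every element of the simulated device copy.
inline void scale_on_device(CachedVector<float>& v, float factor) {
    for (float& x : v.device_write()) x *= factor;
}

// Two consecutive "kernels" followed by a host read need only two copies.
inline std::size_t demo_transfer_count() {
    CachedVector<float> v(4);
    for (float& x : v.host_write()) x = 1.0f;  // host copy already valid: no copy
    scale_on_device(v, 2.0f);                  // copy 1: host -> device
    scale_on_device(v, 3.0f);                  // device copy still valid: no copy
    float sum = 0.0f;
    for (float x : v.host_read()) sum += x;    // copy 2: device -> host
    assert(sum == 24.0f);                      // 4 elements, each 1 * 2 * 3
    return v.transfers();
}
```

In this sketch the access mode chosen by the caller drives coherence, which is the role the abstract attributes to access annotations. A write-only mode that skips the initial copy would be a natural extension, but it is left out here to keep the example short.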
Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2017. p. 7-12
Keywords [en]
heterogeneous computing, programming model, flow signature, programming framework, run-time system, memory coherence management, software caching, VectorPU, GPGPU, CUDA, unified memory
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-168605
DOI: 10.1145/3029580.3029582
ISBN: 9781450348775 (print)
OAI: oai:DiVA.org:liu-168605
DiVA, id: diva2:1461472
Conference
8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM'17), Stockholm, Sweden, Jan. 2017
Funder
Swedish e‐Science Research Center, PSDE
2020-08-26 2020-08-26 2020-08-27