Auto-tuning SkePU: A multi-backend skeleton programming framework for multi-GPU systems
Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.
Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, The Institute of Technology. ORCID iD: 0000-0001-5241-0026
2011 (English). In: IWMSE '11 Proceedings of the 4th International Workshop on Multicore Software Engineering, New York, NY, USA: Association for Computing Machinery (ACM), 2011, 25-32 p. Conference paper, Published paper (Other academic)
Abstract [en]

SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. Currently available skeletons in SkePU include map, reduce, mapreduce, map-with-overlap, maparray, and scan. The performance of SkePU-generated code is comparable to that of hand-written code, even for more complex applications such as ODE solving.
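To illustrate the skeleton idea named in the abstract, the following is a minimal sequential sketch of map, reduce, and mapreduce in plain C++. It is not SkePU's interface: the names `map_skeleton`, `reduce_skeleton`, `mapreduce_skeleton`, and `dot_product` are hypothetical, and only the kind of sequential CPU fallback the abstract mentions is modelled here. No GPU or OpenMP backend is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the skeleton concept; these are NOT SkePU's API.

// Map: apply a binary user function element-wise to two equally sized vectors.
template <typename T, typename F>
std::vector<T> map_skeleton(F user_fn, const std::vector<T>& a,
                            const std::vector<T>& b) {
    assert(a.size() == b.size());
    std::vector<T> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = user_fn(a[i], b[i]);
    }
    return out;
}

// Reduce: fold all elements with a binary user function, starting from init.
template <typename T, typename F>
T reduce_skeleton(F user_fn, const std::vector<T>& v, T init) {
    T acc = init;
    for (const T& x : v) {
        acc = user_fn(acc, x);
    }
    return acc;
}

// MapReduce: map and reduce fused, so no intermediate vector is materialised.
template <typename T, typename M, typename R>
T mapreduce_skeleton(M map_fn, R reduce_fn, const std::vector<T>& a,
                     const std::vector<T>& b, T init) {
    assert(a.size() == b.size());
    T acc = init;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc = reduce_fn(acc, map_fn(a[i], b[i]));
    }
    return acc;
}

// Example user code: a dot product expressed as multiply (map) and add (reduce).
inline float dot_product(const std::vector<float>& a,
                         const std::vector<float>& b) {
    return mapreduce_skeleton([](float x, float y) { return x * y; },
                              [](float x, float y) { return x + y; },
                              a, b, 0.0f);
}
```

The point of the sketch is that user code supplies only the small element-wise functions, while the loop structure lives in the skeleton. A framework could therefore swap that loop for a parallel backend without changing the user functions.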

In this paper, we discuss initial results from auto-tuning SkePU using an off-line, machine learning approach where we adapt skeletons to a given platform using training data. The prediction mechanism at execution time uses off-line pre-calculated estimates to construct an execution plan for any desired configuration with minimal overhead. The prediction mechanism accurately predicts execution time for repetitive executions and includes a mechanism to predict execution time for user functions of different complexity. The tuning framework covers selection between different backends as well as choosing optimal parameter values for the selected backend. We will discuss our approach and initial results obtained for different skeletons (map, mapreduce, reduce).
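The idea of an execution plan built from off-line estimates can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the linear cost model (`fixed_overhead + per_element * n`), the `Estimate` struct, and the functions `predict`, `best_backend`, `build_plan`, and `lookup` are all hypothetical, and the sample numbers in the usage note are invented for demonstration rather than measured.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

enum class Backend { CPU, OpenMP, CUDA, OpenCL };

// ASSUMPTION: a simple linear cost model stands in for whatever the
// off-line training actually produces for each backend.
struct Estimate {
    Backend backend;
    double fixed_overhead;  // e.g. launch or transfer cost, arbitrary units
    double per_element;     // cost per processed element, same units
};

// Predicted execution time of one backend for a problem of n elements.
double predict(const Estimate& e, std::size_t n) {
    return e.fixed_overhead + e.per_element * static_cast<double>(n);
}

// Backend with the lowest predicted time for problem size n.
Backend best_backend(const std::vector<Estimate>& estimates, std::size_t n) {
    assert(!estimates.empty());
    const Estimate* best = &estimates.front();
    for (const Estimate& e : estimates) {
        if (predict(e, n) < predict(*best, n)) {
            best = &e;
        }
    }
    return best->backend;
}

// Execution plan: sampled problem size -> backend chosen for sizes up to it.
using ExecutionPlan = std::map<std::size_t, Backend>;

// Built once, off-line, from the estimates and a set of sampled sizes.
ExecutionPlan build_plan(const std::vector<Estimate>& estimates,
                         const std::vector<std::size_t>& sample_sizes) {
    ExecutionPlan plan;
    for (std::size_t n : sample_sizes) {
        plan[n] = best_backend(estimates, n);
    }
    return plan;
}

// Run-time lookup: the entry for the smallest sampled size >= n, or the
// largest entry if n exceeds every sampled size. Costs O(log #entries).
Backend lookup(const ExecutionPlan& plan, std::size_t n) {
    assert(!plan.empty());
    auto it = plan.lower_bound(n);
    if (it == plan.end()) {
        --it;
    }
    return it->second;
}
```

With invented estimates such as `{CPU, 0.0, 1.0}`, `{OpenMP, 50.0, 0.25}`, and `{CUDA, 500.0, 0.01}` and sampled sizes 10, 1000, and 100000, the plan selects the CPU for small inputs, OpenMP for medium ones, and CUDA for large ones. Because the plan is pre-calculated, the run-time cost is a single ordered-map lookup, which matches the low-overhead goal the abstract describes.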

Place, publisher, year, edition, pages
New York, NY, USA: Association for Computing Machinery (ACM), 2011. 25-32 p.
Keyword [en]
auto-tuning, cuda, data parallelism, gpu, opencl, skeleton programming
National Category
Computer Science
Identifiers
URN: urn:nbn:se:liu:diva-91514
DOI: 10.1145/1984693.1984697
ISBN: 978-1-4503-0577-8 (print)
OAI: oai:DiVA.org:liu-91514
DiVA: diva2:618210
Conference
Fourth International Workshop on Multicore Software Engineering (IWMSE 2011), May 21, 2011, Waikiki, Honolulu, HI, USA
Projects
PEPPHER EU FP7 project
Available from: 2013-04-26 Created: 2013-04-26 Last updated: 2014-10-08

Authority records

Dastgeer, Usman; Kessler, Christoph
