Adaptive Off-Line Tuning for Optimized Composition of Components for Heterogeneous Many-Core Systems
2013 (English). In: High Performance Computing for Computational Science - VECPAR 2012, Springer, 2013, p. 329-345. Conference paper (Refereed)
In recent years, heterogeneous multi-core systems have received much attention. However, performance optimization on these platforms remains a major challenge. Compiler optimizations are often limited by a lack of dynamic information and of knowledge about the run-time environment, so applications are often not performance portable. One current approach is to provide multiple implementations of the same interface that can be used interchangeably depending on the call context, and to expose the composition choices to a compiler, a deployment-time composition tool and/or a run-time system. Off-line machine-learning techniques make it possible to improve the precision and reduce the run-time overhead of run-time composition, which improves performance portability. In this work we extend the run-time composition mechanism in the PEPPHER composition tool with off-line composition and present an adaptive machine-learning algorithm for generating compact and efficient dispatch data structures with low training time. As the dispatch data structure we propose an adaptive decision tree, together with an adaptive training algorithm that makes it possible to control the trade-off between training time, dispatch precision and run-time dispatch overhead.
We have evaluated our optimization strategy with simple kernels (matrix multiplication and sorting) as well as applications from the RODINIA benchmark suite on two GPU-based heterogeneous systems. On average, the precision of the composition choices reaches 83.6 percent with approximately 34 minutes of off-line training time.
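The idea of an off-line trained dispatch tree can be illustrated with a minimal Python sketch. It is not the PEPPHER implementation and does not reproduce the paper's algorithm. It assumes a one-dimensional call context (a problem size `n`), two hypothetical variants named `cpu` and `gpu`, and analytic cost functions standing in for real timing measurements. The names `train`, `dispatch`, `precision` and `max_depth` are illustrative only. The sketch samples the endpoints of a range, stops refining when both endpoints agree on the fastest variant (a heuristic, not a guarantee), and otherwise splits the range until a depth budget is used up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

# Hypothetical cost model: maps a problem size to an estimated run time.
CostFn = Callable[[int], float]


@dataclass
class Leaf:
    variant: str  # name of the implementation variant to call


@dataclass
class Node:
    split: int  # sizes <= split go left, larger sizes go right
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def best_variant(variants: Dict[str, CostFn], n: int) -> str:
    """Return the variant with the lowest (here: modelled) cost at size n."""
    return min(variants, key=lambda name: variants[name](n))


def train(variants: Dict[str, CostFn], lo: int, hi: int, max_depth: int) -> Tree:
    """Build a dispatch tree for the inclusive size range [lo, hi]."""
    win_lo = best_variant(variants, lo)
    win_hi = best_variant(variants, hi)
    if win_lo == win_hi:
        # Heuristic: if both endpoints agree, assume the whole range agrees.
        return Leaf(win_lo)
    mid = (lo + hi) // 2
    if max_depth == 0:
        # Depth budget exhausted: fall back to the winner at the midpoint.
        return Leaf(best_variant(variants, mid))
    return Node(
        split=mid,
        left=train(variants, lo, mid, max_depth - 1),
        right=train(variants, mid + 1, hi, max_depth - 1),
    )


def dispatch(tree: Tree, n: int) -> str:
    """Walk the tree at call time; the cost is bounded by the tree depth."""
    while isinstance(tree, Node):
        tree = tree.left if n <= tree.split else tree.right
    return tree.variant


def precision(tree: Tree, variants: Dict[str, CostFn], lo: int, hi: int) -> float:
    """Fraction of sizes in [lo, hi] for which the tree picks the best variant."""
    hits = sum(dispatch(tree, n) == best_variant(variants, n)
               for n in range(lo, hi + 1))
    return hits / (hi - lo + 1)


# Assumed cost models: the GPU variant pays a fixed launch/transfer overhead.
variants: Dict[str, CostFn] = {
    "cpu": lambda n: 1.0 * n,
    "gpu": lambda n: 500.0 + 0.1 * n,
}

tree = train(variants, 1, 1024, max_depth=10)
print(dispatch(tree, 100), dispatch(tree, 1000))
print(precision(tree, variants, 1, 1024))
```

In this sketch `max_depth` is the single knob behind the trade-off named in the abstract: a larger value means more sampled points during training and a deeper tree to walk at dispatch time, while a smaller value gives a cheaper tree that may pick the wrong variant near the crossover point.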
Place, publisher, year, edition, pages
Springer, 2013. 329-345 p.
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online) ; 7851
Keywords: parallel programming, parallel computing, automated performance tuning, machine learning, adaptive sampling, GPU, multicore processor, software composition, program optimization, autotuning
Identifiers: URN: urn:nbn:se:liu:diva-93471; DOI: 10.1007/978-3-642-38718-0_32; ISI: 000342997100032; ISBN: 978-3-642-38717-3 (print); ISBN: 978-3-642-38718-0 (online); OAI: oai:DiVA.org:liu-93471; DiVA: diva2:625081
10th International Conference on High Performance Computing for Computational Science, VECPAR 2012; Kobe; Japan
Projects: EU FP7 PEPPHER (2010-2012), #248481, www.peppher.eu; SeRC - OpCoReS
Funder: EU, FP7, Seventh Framework Programme, 248481; Swedish e‐Science Research Center, OpCoReS