Optimized selection of runtime mode for the reconfigurable PRAM-NUMA architecture REPLICA using machine-learning
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, The Institute of Technology. (PELAB)
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, The Institute of Technology. (PELAB) ORCID iD: 0000-0001-5241-0026
2014 (English). In: Euro-Par 2014: Parallel Processing Workshops: Euro-Par 2014 International Workshops, Porto, Portugal, August 25-26, 2014, Revised Selected Papers, Part II / [ed] Luis Lopes et al., Springer-Verlag New York, 2014, 133-145 p. Conference paper (Refereed)
Abstract [en]

The massively hardware-multithreaded VLIW emulated shared memory (ESM) architecture REPLICA has a dynamically reconfigurable on-chip network that offers two execution modes: PRAM and NUMA. PRAM mode is mainly suitable for applications with a high amount of thread-level parallelism (TLP), while NUMA mode is mainly for accelerating the execution of sequential programs or programs with low TLP. Some types of regular data-parallel algorithms also execute faster in NUMA mode. It is not obvious in which mode a given program region shows the best performance. In this study we focus on generic stencil-like computations exhibiting regular control flow and memory access patterns. We use two state-of-the-art machine-learning methods, C5.0 (decision trees) and Eureqa Pro (symbolic regression), to select which mode to use. We use these methods to derive different predictors based on the same training data and compare their results. The best derived predictors reach an accuracy of 95% and are generated by both C5.0 and Eureqa Pro, although the latter can in some cases be more sensitive to the training data. The average speedup gained due to mode switching ranges from 1.92 to 2.23 for all generated predictors on the evaluation test cases, and by using a majority voting algorithm based on the three best predictors, we can eliminate all misclassifications.
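
The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the feature names (`threads`, `problem_size`), the thresholds, and the three stand-in predictors are hypothetical and are not the predictors derived with C5.0 or Eureqa Pro in the paper. The sketch only shows how three binary mode predictors could be combined so that a single dissenting predictor is outvoted.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
Predictor = Callable[[Features], str]


def majority_vote(predictors: Sequence[Predictor], features: Features) -> str:
    """Return the mode ("PRAM" or "NUMA") chosen by most predictors.

    With an odd number of predictors and two possible modes, no tie can occur.
    """
    votes = Counter(p(features) for p in predictors)
    return votes.most_common(1)[0][0]


# Three hypothetical stand-in predictors. The features and thresholds are
# invented for this sketch and do not come from the paper.
def by_threads(f: Features) -> str:
    return "PRAM" if f["threads"] >= 64 else "NUMA"


def by_size(f: Features) -> str:
    return "PRAM" if f["problem_size"] >= 4096 else "NUMA"


def by_ratio(f: Features) -> str:
    return "PRAM" if f["problem_size"] / f["threads"] >= 32 else "NUMA"


PREDICTORS = [by_threads, by_size, by_ratio]


def select_mode(features: Features) -> str:
    """Pick the execution mode for a program region by majority vote."""
    return majority_vote(PREDICTORS, features)
```

For example, `select_mode({"threads": 16, "problem_size": 8192})` collects one vote for NUMA and two for PRAM, so the single outlier does not decide the mode.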

Place, publisher, year, edition, pages
Springer-Verlag New York, 2014. 133-145 p.
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online) ; 8806
Keyword [en]
parallel computing, reconfigurable architecture, chip multiprocessor, machine learning, program optimization, performance analysis
National Category
Computer Science
URN: urn:nbn:se:liu:diva-114342
DOI: 10.1007/978-3-319-14313-2_12
ISI: 000354785000012
ISBN: 978-3-319-14312-5
ISBN: 978-3-319-14313-2
OAI: diva2:789423
Euro-Par 2014 Conference
Swedish e‐Science Research Center, OpCoReS
Available from: 2015-02-18 Created: 2015-02-18 Last updated: 2015-06-12

By author/editor
Hansson, Erik; Kessler, Christoph