liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Optimized pipelined parallel merge sort on the cell BE
Fern Universität in Hagen, Germany.
Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory. Linköping University, The Institute of Technology.ORCID iD: 0000-0001-5241-0026
2009 (English)In: Euro-Par 2008 Workshops - Parallel Processing: VHPC 2008, UNICORE 2008, HPPC 2008, SGS 2008, PROPER 2008, ROIA 2008, and DPA 2008, Las Palmas de Gran Canaria, Spain, August 25-26, 2008, Revised Selected Papers / [ed] Eduardo César, Michael Alexander, Achim Streit, Jesper Larsson Träff, Christophe Cérin, Andreas Knüpfer, Dieter Kranzlmüller and Shantenu Jha, Springer Berlin/Heidelberg, 2009, Vol. 5415 LNCS, 131-140 p.Chapter in book (Refereed)
Abstract [en]

Chip multiprocessors designed for streaming applications such as Cell BE offer impressive peak performance but suffer from limited bandwidth to off-chip main memory. As the number of cores is expected to rise further, this bottleneck will become more critical in the coming years. Hence, memory-efficient algorithms are required. As a case study, we investigate parallel sorting on Cell BE as a problem of great importance and as a challenge where the ratio between computation and memory transfer is very low. Our previous work led to a parallel mergesort that reduces memory bandwidth requirements by pipelining between SPEs, but the allocation of SPEs was rather ad-hoc. In our present work, we investigate mappings of merger nodes to SPEs. The mappings are designed to provide optimal trade-offs between load balancing, buffer memory consumption, and communication load on the on-chip bus. We solve this multi-objective optimization problem by deriving an integer linear programming formulation and compute Pareto-optimal solutions for the mapping of merge trees with up to 127 merger nodes. For mapping larger trees, we give a fast divide-and-conquer based approximation algorithm. We evaluate the sorting algorithm resulting from our mappings by a discrete event simulation.

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2009. Vol. 5415 LNCS, 131-140 p.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online) ; 5415
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-21257DOI: 10.1007/978-3-642-00955-6_18ISBN: 3-642-00954-9 (print)ISBN: 978-3-642-00954-9 (print)ISBN: e-978-3-642-00955-6 OAI: oai:DiVA.org:liu-21257DiVA: diva2:241036
Note

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),

Available from: 2009-09-30 Created: 2009-09-30 Last updated: 2014-10-08Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full textfind book at a swedish library/hitta boken i ett svenskt bibliotek

Authority records BETA

Kessler, Christoph

Search in DiVA

By author/editor
Kessler, Christoph
By organisation
PELAB - Programming Environment LaboratoryThe Institute of Technology
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 97 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf