liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Optimized Pipelined Parallel Merge Sort on the Cell BE
Dept. of Mathematics and Computer Science FernUniversität in Hagen, Germany.
Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, PELAB - Programming Environment Laboratory.ORCID iD: 0000-0001-5241-0026
2008 (English)In: 2nd Int. Workshop on Highly Parallel Processing on a Chip HPPC-2008,2008, Berlin: Springer , 2008Conference paper, Published paper (Refereed)
Abstract [en]

Chip multiprocessors designed for streaming applications such as Cell BE offer impressive peak performance but suffer from limited bandwidth to off-chip main memory. As the number of cores is expected to rise further, this bottleneck will become more critical in the coming years. Hence, memory-efficient algorithms are required. As a case study, we investigate parallel sorting on Cell BE as a problem of great importance and as a challenge where the ratio between computation and memory transfer is very low. Our previous work led to a parallel mergesort that reduces memory bandwidth requirements by pipelining between SPEs, but the allocation of SPEs was rather ad-hoc. In our present work, we investigate mappings of merger nodes to SPEs. The mappings are designed to provide optimal trade-offs between load balancing, buffer memory consumption, and communication load on the on-chip bus. We solve this multi-objective optimization problem by deriving an integer linear programming formulation and compute Pareto-optimal solutions for the mapping of merge trees with up to 127 merger nodes. For mapping larger trees, we give a fast divide-and-conquer based approximation algorithm. We evaluate the sorting algorithm resulting from our mappings by a discrete event simulation.

Place, publisher, year, edition, pages
Berlin: Springer , 2008.
Keywords [en]
Parallel computing, multicore processor, computer architecture, algorithm engineering, parallel sorting, integer linear programming
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-43703Local ID: 74568OAI: oai:DiVA.org:liu-43703DiVA, id: diva2:264563
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-01-12

Open Access in DiVA

No full text in DiVA

Other links

http://www.ida.liu.se/~chrke/publ.html

Authority records

Kessler, Christoph

Search in DiVA

By author/editor
Kessler, Christoph
By organisation
The Institute of TechnologyPELAB - Programming Environment Laboratory
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 125 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf