Algorithm Adaptation and Optimization of a Novel DSP Vector Co-processor
Linköping University, Department of Electrical Engineering, Computer Engineering.
2010 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

The Division of Computer Engineering at Linköping University is currently researching the possibility of creating a highly parallel DSP platform that can keep up with the computational needs of upcoming standards for various applications, at low cost and low power consumption. The architecture is called ePUMA, and it combines a general RISC DSP master processor with eight SIMD co-processors on a single chip. The master processor will act as the main processor for general tasks and execution control, while the co-processors will accelerate computing-intensive and parallel DSP kernels.

This thesis investigates the performance potential of the co-processors by implementing matrix algebra kernels for QR decomposition, LU decomposition, matrix determinant and matrix inverse that run on a single co-processor. The kernels are then evaluated to find possible problems with the co-processors' microarchitecture and to suggest solutions to any problems found. The evaluation shows that the performance potential is very good, but a few problems have been identified that cause significant overhead in the kernels. Pipeline mismatches, which occur due to different pipeline lengths for different instructions, cause pipeline hazards, and the current solution to this does not allow effective use of the pipeline. In some cases, the single-port memories will cause bottlenecks, but the thesis suggests that the situation could be greatly improved by using buffered memory write-back. Also, the lack of register forwarding makes kernels with many data dependencies run unnecessarily slowly.

Place, publisher, year, edition, pages
2010. , 78 p.
Keyword [en]
DSP, SIMD, ePUMA, real-time, embedded, matrix, QR, LU, inverse, determinant, master-multi-SIMD, parallel algorithms, matrix algebra
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:liu:diva-57427
ISRN: LiTH-ISY-EX--10/4372--SE
OAI: oai:DiVA.org:liu-57427
DiVA: diva2:325426
Presentation
2010-06-15, Systemet, Linköpings universitet, Linköping, 09:00 (English)
Uppsok
Technology
Available from: 2010-06-18 Created: 2010-06-18 Last updated: 2010-06-18 Bibliographically approved

Open Access in DiVA

fulltext (3480 kB), 1366 downloads
File information
File name: FULLTEXT01.pdf
File size: 3480 kB
Checksum (SHA-512): e4c597b1647c8dca127f7e7f3507c7aebe06e536e78cf08052c3afebfb30c3c0e9171830e7cf73acc3113ae12d919cf7a2c41c481343667ab03f8a4033f8f6ef
Type: fulltext
Mimetype: application/pdf

By author/editor
Karlsson, Andréas
By organisation
Computer Engineering
Total: 1366 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 643 hits