Systematic detection of memory related performance bottlenecks in GPGPU programs
Horga, Adrian: Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
Chattopadhyay, Sudipta: Centre for IT-Security, Privacy and Accountability, Saarland University, Germany.
Eles, Petru: Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
Peng, Zebo: Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
2016 (English). In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 71, 73-87 p.
Article in journal (Refereed). Published.
Abstract [en]

Graphics processing units (GPUs) are an attractive choice for designing high-performance and energy-efficient software systems because they can execute massively parallel applications. However, GPU performance is limited by contention in the memory subsystem, which often causes substantial delays and effectively reduces parallelism. In this paper, we propose GRAB, an automated debugger to aid the development of efficient GPU kernels. GRAB systematically detects, classifies and discovers the root causes of memory-performance bottlenecks in GPUs. We have implemented GRAB and evaluated it with several open-source GPU kernels, including two real-life case studies. We show the usage of GRAB through improvement of GPU kernels on real NVIDIA Tegra K1 hardware, a widely used GPU for mobile and handheld devices. The guidance obtained from GRAB leads to an overall improvement of up to 64%.

Place, publisher, year, edition, pages
Elsevier, 2016. Vol. 71, 73-87 p.
Keyword [en]
Performance debugging, GPGPU, Caches
National Category
Computer Science
Identifiers
URN: urn:nbn:se:liu:diva-131079
DOI: 10.1016/j.sysarc.2016.08.002
ISI: 000390503600008
OAI: oai:DiVA.org:liu-131079
DiVA: diva2:959330
Available from: 2016-09-07. Created: 2016-09-07. Last updated: 2017-10-02. Bibliographically approved.

Open Access in DiVA

No full text

Other links

Publisher's full text: http://www.sciencedirect.com/science/article/pii/S1383762116300935

By author/editor

Horga, Adrian; Chattopadhyay, Sudipta; Eles, Petru; Peng, Zebo
