Accelerating System-Level Design Tasks using Graphics Processors
2011 (English). Other (Other academic)
Recent years have seen increasing use of graphics processing units (GPUs) for non-graphics applications. Applications that have harnessed the computational power of GPUs span numerical algorithms, computational geometry, database processing, image processing, astrophysics and bioinformatics. There are compelling reasons for exploiting GPUs for such general-purpose computing tasks. First, modern GPUs are extremely powerful. For example, high-end GPUs such as the Nvidia GeForce GTX 480 and ATI Radeon 5870 have 1.35 TFlops and 2.72 TFlops of peak single-precision performance, respectively, whereas a high-end general-purpose processor such as the Intel Core i7-960 has a peak performance of 102 GFlops. Additionally, the memory bandwidth of these GPUs is more than 5x greater than what is available to a CPU, which allows them to excel even in scenarios with low compute intensity but high bandwidth usage. Second, GPUs are now commodity items, as their costs have fallen dramatically over the last few years.
Although a wide variety of computationally expensive system-level design tasks (in the context of embedded systems design) are regularly solved by software tools running on desktops and laptops equipped with high-end GPUs, the use of GPUs for accelerating such problems is still not conventional practice within the design automation community. Consequently, there has recently been considerable research interest in demonstrating the applicability of GPUs to design automation tasks. Tasks that have been accelerated using modern GPUs include schedulability/timing analysis, hardware/software partitioning, fault simulation, and verification of digital designs. In this tutorial we will describe techniques for programming GPUs for general-purpose computing (i.e., non-graphics applications) and cover a number of case studies from the electronic design automation area. We will demonstrate how GPUs can significantly improve running times, and hence the usability, of the design tools that exploit them. In particular, we will start by introducing the graphics processor architecture and programming models for GPUs (OpenCL and CUDA). OpenCL is an open standard for programming GPUs (and also other modern processors) and is a cross-platform alternative to CUDA. It was created by a consortium that includes AMD, Apple, IBM, Intel, and Nvidia. We will then discuss various examples of system-level design tasks and identify their computational kernels. Finally, we will present case studies illustrating how system-level design algorithms have to be modified in order to map them onto GPUs.
Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-63301
OAI: oai:DiVA.org:liu-63301
DiVA: diva2:377974
The tutorial proposal has been accepted for presentation at the International Conference on VLSI Design, Chennai, India, January 2-7, 2011.
2010-12-15 2010-12-15 2010-12-21 Bibliographically approved