Latency-Aware Packet Processing on CPU-GPU Heterogeneous Systems
Linköpings universitet, Institutionen för datavetenskap, Programvara och system. Linköpings universitet, Tekniska fakulteten.
Linköpings universitet, Institutionen för datavetenskap, Programvara och system. Linköpings universitet, Tekniska fakulteten.
Linköpings universitet, Institutionen för datavetenskap, Programvara och system. Linköpings universitet, Tekniska fakulteten. Ericsson Sweden.
Ericsson Sweden.
2017 (English) In: DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017, New York, NY, USA: Association for Computing Machinery (ACM), 2017. Conference paper, Published paper (Refereed)
Abstract [en]

In response to the tremendous growth of the Internet, towards what we call the Internet of Things (IoT), there is a need to move from costly, high-time-to-market special-purpose hardware to flexible, low-time-to-market general-purpose devices for packet processing. Among several such devices, GPUs have attracted attention in the past, mainly because the high computing demand of packet processing applications can, potentially, be satisfied by these throughput-oriented machines. However, another important aspect of such applications is packet latency, which, if not handled carefully, will overshadow the throughput benefits. Unfortunately, until now, this aspect has been mostly ignored. To address this issue, we propose a method that considers the variable bit rate of the traffic and, depending on the current rate, minimizes the latency while meeting the rate demand. We propose a persistent-kernel-based software architecture to overcome the challenges inherent in a GPU implementation, such as kernel invocation overhead, CPU-GPU communication, and memory access overhead. We have chosen packet classification as the packet processing application to demonstrate our technique. Using the proposed approach, we are able to reduce the packet latency on average by a factor of 3.5, compared to state-of-the-art solutions, without any packet drop.
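To make the persistent-kernel idea in the abstract concrete, here is a minimal, hypothetical CUDA sketch (not the authors' implementation): the classification kernel is launched once and then spins on a host-mapped sequence counter, so each new packet batch is handed over through zero-copy mapped memory instead of paying a kernel-launch and copy cost per batch. The batch size, the toy classification rule, and all identifiers (persistent_classifier, BATCH, seq, done) are assumptions made for this example; it also assumes a gcc/clang host compiler for the __atomic_* builtins.

// Hypothetical sketch of a persistent-kernel design (illustrative only): the
// kernel is launched once and spins on a host-mapped sequence counter, so a
// packet batch avoids the per-launch and per-copy overhead named in the abstract.
#include <cstdio>
#include <cuda_runtime.h>

#define BATCH 256   // assumed batch size, chosen only for the example

__global__ void persistent_classifier(volatile int *seq,     // host-written: batch number, or -1 to quit
                                       volatile int *done,    // GPU-written: last completed batch number
                                       const int *pkts, int *verdicts)
{
    int last = 0;
    while (true) {
        // One thread polls the mapped counter; the rest wait at the barrier.
        if (threadIdx.x == 0)
            while (*seq == last) { /* spin until the host publishes work */ }
        __syncthreads();

        int s = *seq;
        if (s < 0) return;                              // host requested shutdown

        // Toy "classification": one verdict per packet header word.
        for (int i = threadIdx.x; i < BATCH; i += blockDim.x)
            verdicts[i] = (pkts[i] & 0xFF) < 128 ? 1 : 0;

        __syncthreads();                                // all verdicts written
        if (threadIdx.x == 0) {
            __threadfence_system();                     // make them visible to the host...
            *done = s;                                  // ...before acknowledging the batch
        }
        last = s;
    }
}

int main()
{
    cudaSetDeviceFlags(cudaDeviceMapHost);              // allow zero-copy mapped host memory

    int *seq, *done, *pkts, *verdicts;                  // pinned host buffers, visible to the GPU
    cudaHostAlloc((void **)&seq,      sizeof(int),         cudaHostAllocMapped);
    cudaHostAlloc((void **)&done,     sizeof(int),         cudaHostAllocMapped);
    cudaHostAlloc((void **)&pkts,     BATCH * sizeof(int), cudaHostAllocMapped);
    cudaHostAlloc((void **)&verdicts, BATCH * sizeof(int), cudaHostAllocMapped);
    *seq = 0; *done = 0;

    int *d_seq, *d_done, *d_pkts, *d_verdicts;          // device views of the same buffers
    cudaHostGetDevicePointer((void **)&d_seq,      seq,      0);
    cudaHostGetDevicePointer((void **)&d_done,     done,     0);
    cudaHostGetDevicePointer((void **)&d_pkts,     pkts,     0);
    cudaHostGetDevicePointer((void **)&d_verdicts, verdicts, 0);

    persistent_classifier<<<1, 128>>>(d_seq, d_done, d_pkts, d_verdicts);   // launched once

    for (int i = 0; i < BATCH; ++i) pkts[i] = i;                  // stage one batch of "packets"
    __atomic_store_n(seq, 1, __ATOMIC_RELEASE);                   // publish batch #1 to the GPU
    while (__atomic_load_n(done, __ATOMIC_ACQUIRE) != 1) { }      // wait for the GPU to finish it
    printf("verdict[0]=%d verdict[%d]=%d\n", verdicts[0], BATCH - 1, verdicts[BATCH - 1]);

    __atomic_store_n(seq, -1, __ATOMIC_RELEASE);                  // ask the kernel to exit
    cudaDeviceSynchronize();
    return 0;
}

In a real deployment the host would also size and pace the batches according to the current arrival rate, which is the latency-aware aspect the paper targets.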

Place, publisher, year, edition, pages
New York, NY, USA: Association for Computing Machinery (ACM), 2017.
Series
Design Automation Conference DAC, ISSN 0738-100X
HSV category
Identifiers
URN: urn:nbn:se:liu:diva-141212
DOI: 10.1145/3061639.3062269
ISI: 000424895400129
Scopus ID: 2-s2.0-85023612665
ISBN: 978-1-4503-4927-7 (print)
OAI: oai:DiVA.org:liu-141212
DiVA, id: diva2:1144800
Conference
54th ACM/EDAC/IEEE Design Automation Conference (DAC), Austin, TX, USA, June 18-22, 2017
Available from: 2017-09-27 Created: 2017-09-27 Last updated: 2018-12-07 Bibliographically approved
Included in thesis
1. System-Level Design of GPU-Based Embedded Systems
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Modern embedded systems deploy several hardware accelerators, in a heterogeneous manner, to deliver high-performance computing. Among such devices, graphics processing units (GPUs) have earned a prominent position by virtue of their immense computing power. However, a system design that relies on sheer throughput of GPUs is often incapable of satisfying the strict power- and time-related constraints faced by the embedded systems.

This thesis presents several system-level software techniques to optimize the design of GPU-based embedded systems under various graphics and non-graphics applications. Compared to conventional application-level optimizations, the system-wide view of our proposed techniques brings several advantages: First, it allows the limitations and requirements of the various system parts to be fully incorporated in the design process. Second, it can unveil optimization opportunities by exposing the information flow between the processing components. Third, the techniques are generally applicable to a wide range of applications with similar characteristics. In addition, multiple system-level techniques can be combined with one another, or with application-level techniques, to further improve performance.

We begin by studying some of the unique attributes of GPU-based embedded systems and discussing several factors that distinguish the design of these systems from that of the conventional high-end GPU-based systems. We then proceed to develop two techniques that address an important challenge in the design of GPU-based embedded systems from different perspectives. The challenge arises from the fact that GPUs require a large amount of workload to be present at runtime in order to deliver a high throughput. However, for some embedded applications, collecting large batches of input data requires an unacceptable waiting time, prompting a trade-off between throughput and latency. We also develop an optimization technique for GPU-based applications to address the memory bottleneck issue by utilizing the GPU L2 cache to shorten data access time. Moreover, in the area of graphics applications, and in particular with a focus on mobile games, we propose a power management scheme to reduce the GPU power consumption by dynamically adjusting the display resolution, while considering the user's visual perception at various resolutions. We also discuss the collective impact of the proposed techniques in tackling the design challenges of emerging complex systems.
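
As a toy, back-of-the-envelope illustration of this throughput-latency trade-off (not taken from the thesis), the host-side sketch below assumes a simple cost model in which a batch of n packets takes a fixed launch/transfer overhead plus n divided by the GPU's peak rate, and then picks the smallest batch whose sustained throughput still covers the current arrival rate; all constants and function names are invented for the example.

// Hypothetical cost model, for illustration only: a batch of n packets costs a
// fixed overhead plus n / peak_rate seconds, so small batches waste throughput
// while large batches add queuing delay.
#include <cstdio>

static double batch_time(int n) { return 50e-6 + n / 40.0e6; }   // assumed 50 us overhead, 40 Mpps peak
static double throughput(int n) { return n / batch_time(n); }     // sustained packets per second
static double latency(int n, double arrival)
{
    // Worst case for the first packet of a batch: wait for the batch to fill,
    // then wait for the whole batch to be processed.
    return (n - 1) / arrival + batch_time(n);
}

int main()
{
    double arrival = 5.0e6;                        // assumed current traffic rate: 5 Mpps
    for (int n = 64; n <= 65536; n *= 2) {
        if (throughput(n) >= arrival) {            // smallest batch that keeps up with the traffic
            printf("rate %.1f Mpps -> batch %d, throughput %.1f Mpps, latency %.0f us\n",
                   arrival / 1e6, n, throughput(n) / 1e6, latency(n, arrival) * 1e6);
            break;
        }
    }
    return 0;
}

With the assumed numbers, a 5 Mpps arrival rate is already met by a batch of 512 packets at roughly 165 microseconds of worst-case latency, while a lower rate would permit a smaller batch and hence lower latency.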

The proposed techniques are assessed through real-life experiments on GPU-based hardware platforms, which demonstrate the superior performance of our approaches compared to state-of-the-art techniques.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2018. p. 62
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1964
Keywords
GPU, GPGPU, embedded system, heterogeneous computing, system-level design
HSV category
Identifiers
URN: urn:nbn:se:liu:diva-152469
DOI: 10.3384/diss.diva-152469
ISBN: 9789176851753
Public defence
2018-12-19, Nobel BL32, B-Huset, Campus Valla, Linköping, 13:15 (English)
Opponent
Supervisors
Research funder
CUGS (National Graduate School in Computer Science), 995523
Available from: 2018-12-07 Created: 2018-12-07 Last updated: 2019-09-30 Bibliographically approved

Open Access in DiVA

Full text is not available in DiVA

Other links

Publisher's full text | Scopus

Authority records

Maghazeh, Arian; Bordoloi, Unmesh D.; Dastgeer, Usman

Search in DiVA

By author/editor
Maghazeh, Arian; Bordoloi, Unmesh D.; Dastgeer, Usman; Eles, Petru; Peng, Zebo
By organisation
