Hadoop Read Performance During Datanode Crashes
Linköping University, Department of Computer and Information Science.
2016 (Swedish)
Independent thesis Basic level (degree of Bachelor), 10,5 credits / 16 HE credits
Student thesis
Alternative title
Hadoops läsprestanda vid datanodkrascher (English)
Abstract [en]

This bachelor thesis evaluates the impact of datanode crashes on the read performance of the Hadoop Distributed File System (HDFS). The goal is to better understand how datanode crashes, together with certain parameters, affect read performance, measured as the execution time of the get command. The parameters studied are the number of crashed nodes, the block size and the file size. Data was collected by running tests in a Linux test environment consisting of ten virtual machines with Hadoop installed. From this data, the average execution time and standard deviation of the get command were calculated. The network activity during the tests was also measured. The results showed that neither the number of crashed nodes nor the block size had any significant effect on the execution time. They also showed that the execution time of the get command was not directly proportional to the size of the fetched file: a file four times as large sometimes took more than four times as long to fetch, with the execution time up to 4.5 times as long. However, the consequences of a datanode crash appear to be much greater when fetching a small file than when fetching a large one. The average execution time increased by up to 36% when a large file was fetched, but by as much as 85% when a small file was fetched.
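The statistics named in the abstract (average execution time, standard deviation, and the percentage increase after a crash) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the function names `time_command`, `summarize` and `relative_increase` are hypothetical, the timings are made up for illustration, and the use of the sample standard deviation is an assumption, since the record does not say which variant was used.

```python
import statistics
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def summarize(times):
    """Return (mean, sample standard deviation) of a list of durations."""
    return statistics.mean(times), statistics.stdev(times)


def relative_increase(baseline, observed):
    """Return the percentage by which observed exceeds baseline."""
    return (observed - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only; not data from the thesis.
    no_crash = [10.0, 11.0, 12.0]
    with_crash = [12.0, 13.0, 14.0]

    base_mean, base_sd = summarize(no_crash)
    crash_mean, crash_sd = summarize(with_crash)

    print(f"no crash:   mean={base_mean:.2f} s, sd={base_sd:.2f} s")
    print(f"with crash: mean={crash_mean:.2f} s, sd={crash_sd:.2f} s")
    print(f"increase:   {relative_increase(base_mean, crash_mean):.1f} %")
```

To time a real fetch, `time_command` could be given an HDFS shell invocation such as `["hdfs", "dfs", "-get", "/hypothetical/remote/file", "/tmp/local-file"]`, where both paths are placeholders. Whether the thesis invoked the get command in exactly this form is not stated in the record.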

Place, publisher, year, edition, pages
2016. 21 p.
Keyword [en]
hadoop, read performance, datanode, crashes, impact, file size, block size, number, crashed, nodes, distributed systems
National Category
Computer Science; Computer Engineering
URN: urn:nbn:se:liu:diva-130466
ISRN: LIU-IDA/LITH-EX-G--16/056--SE
OAI: diva2:951409
Subject / course
Information Technology
2016-05-31, Visionen, Linköpings universitet 58131, Linköping, 08:00 (Swedish)
Available from: 2016-08-16. Created: 2016-08-08. Last updated: 2016-08-16. Bibliographically approved.
Authors: Johannsen, Fabian; Hellsing, Mattias