What makes an (audio)book popular?
Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
2018 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Vad gör en (ljud)bok populär? (Swedish)
Abstract [en]

Audiobooks have traditionally been used for educational purposes but have recently grown into a popular alternative to more traditional means of consuming literature. To differentiate themselves from other players in the market, and to provide their users with enjoyable literature, several audiobook companies have lately directed their efforts toward producing their own content. Creating highly rated content is, however, no easy task, and one recurring challenge is how to make a bestselling story. In an attempt to identify latent features shared by successful audiobooks and to evaluate proposed methods for literary quantification, this thesis employs an array of frameworks from the fields of Statistics, Machine Learning and Natural Language Processing on data and literature provided by Storytel, Sweden’s largest audiobook company.

We analyze a collection of 3077 Swedish books and identify features that are important for their promotional and literary success. By considering features from the aspects Metadata, Theme, Plot, Style and Readability, we found that popular books are typically published as part of a book series, cover 1-3 central topics, and write about, e.g., daughter-mother relationships and human closeness, and that they also hold, on average, a higher proportion of verbs and a lower proportion of short words. Despite successfully identifying these and other factors, we recognized that none of our models predicted “bestseller” adequately and that future work may need to study additional factors, employ other models or even use different metrics to define and measure popularity.
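As a rough illustration of one readability feature mentioned above, the proportion of short words in a text could be computed as in the sketch below. The function name `short_word_proportion` and the three-character threshold are assumptions made for this example; the abstract does not state how the thesis defines a "short word".

```python
import re


def short_word_proportion(text: str, max_len: int = 3) -> float:
    """Return the share of word tokens with at most `max_len` characters.

    The threshold is an illustrative assumption, not the thesis's definition.
    """
    # Match runs of Unicode letters so that Swedish å, ä and ö count as letters.
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    short = sum(1 for word in words if len(word) <= max_len)
    return short / len(words)


if __name__ == "__main__":
    print(short_word_proportion("Hon gick hem till sin mamma"))
```

A feature like this would be computed once per book and then used alongside the other metadata, theme, plot and style features.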

From our evaluation of the literary quantification methods, namely topic modeling and narrative approximation, we found that these methods are, in general, suitable for Swedish texts but that they require further improvement and experimentation to be successfully deployed for Swedish literature. For topic modeling, we recognized that using only nouns provided more interpretable topics and that the inclusion of character names tended to pollute the topics. We also identified and discussed the possible problem of word inflections when modeling topics for more morphologically complex languages, and noted that additional preprocessing treatments such as word lemmatization or post-training text normalization may improve the quality and interpretability of topics. For the narrative approximation, we discovered that the method currently suffers from three shortcomings: (1) unreliable sentence segmentation, (2) unsatisfactory dictionary-based sentiment analysis and (3) the possible loss of sentiment information induced by translations. Despite only examining a handful of literary works, we further found that books written initially in Swedish had narratives that were more consistent across languages than those of books written in English and then translated into Swedish.
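To make the idea of narrative approximation more concrete, the sketch below scores each sentence with a sentiment dictionary and smooths the scores with a trailing moving average. It is a minimal illustration under stated assumptions, not the method used in the thesis: the lexicon `TOY_LEXICON`, the regex-based sentence splitter and the window size are all hypothetical. The naive splitter and the tiny word list also hint at why shortcomings (1) and (2) arise in practice.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicon; a real dictionary-based analysis would use a
# curated Swedish sentiment lexicon with far broader coverage.
TOY_LEXICON: Dict[str, float] = {
    "glad": 1.0,
    "kärlek": 1.0,
    "ledsen": -1.0,
    "död": -1.0,
}


def split_sentences(text: str) -> List[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def sentence_score(sentence: str, lexicon: Dict[str, float]) -> float:
    """Sum the lexicon values of all word tokens; unknown words count as 0."""
    tokens = re.findall(r"[^\W\d_]+", sentence.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def narrative_arc(text: str, lexicon: Dict[str, float], window: int = 3) -> List[float]:
    """Trailing moving average of per-sentence sentiment scores."""
    scores = [sentence_score(s, lexicon) for s in split_sentences(text)]
    arc = []
    for i in range(len(scores)):
        start = max(0, i - window + 1)
        chunk = scores[start:i + 1]
        arc.append(sum(chunk) / len(chunk))
    return arc


if __name__ == "__main__":
    sample = "Hon var glad. Sedan kom död. Hon blev ledsen."
    print(narrative_arc(sample, TOY_LEXICON, window=2))
```

Inflected forms such as "glada" or "döden" are not matched by this lookup, which illustrates how word inflection can weaken a purely dictionary-based approach.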

Place, publisher, year, edition, pages
2018. p. 76
Keywords [en]
Audiobooks, Bestsellers, Algorithmic Criticism, Large-scale Literary Analysis, Natural Language Processing, Gaussian Processes, Topic Modeling
National Category
Other Computer and Information Science; Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-152871
ISRN: LIU-IDA/STAT-A--18/007--SE
OAI: oai:DiVA.org:liu-152871
DiVA, id: diva2:1265673
External cooperation
STORYTEL SWEDEN AB
Subject / course
Statistics
Available from: 2019-02-15
Created: 2018-11-26
Last updated: 2019-02-15
Bibliographically approved

Open Access in DiVA

what_makes_an_audiobook_popular (6631 kB), 127 downloads
File information
File name: FULLTEXT01.pdf
File size: 6631 kB
Checksum SHA-512: 2f4ae420a1a92b46fc3b80c9b650535de3142116645c1197aaaf035d9d9c199e0dd84ab008f7eff21980ddb236e28e03150426ba36274229b42cf04f1e79ce43
Type: fulltext
Mimetype: application/pdf

By author/editor
Barakat, Arian