Probability as readability: A new machine learning approach to readability assessment for written Swedish
Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
2012 (English). Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis.
Alternative title: Sannolikhet som läsbarhet : En ny maskininlärningsansats till läsbarhetsmätning för skriven svenska (Swedish)
Abstract [en]

This thesis explores the possibility of assessing the degree of readability of written Swedish using machine learning. An application using four levels of linguistic analysis has been implemented and tested with four different established algorithms for machine learning. The new approach has then been compared to established readability metrics for Swedish. The results indicate that the new method works significantly better for readability classification of both sentences and documents. The system has also been tested with so-called soft classification, which returns a probability for the degree of readability of a given text. This probability can then be used to rank texts according to probable degree of readability.
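
The idea of soft classification followed by ranking can be illustrated with a minimal Python sketch. It is not the thesis's system: the two surface features, the `WEIGHTS` values, and the function names `features`, `prob_easy`, and `rank_by_readability` are all hypothetical, chosen only to show how a probability, rather than a hard label, can order texts. A real system would learn its parameters from labelled data.

```python
import math
import re

# Hypothetical weights for illustration only; they are NOT taken from the
# thesis. A real system would learn them from labelled training texts.
WEIGHTS = {"bias": 4.0, "avg_sentence_len": -0.15, "long_word_ratio": -6.0}


def features(text):
    """Two simple surface features: mean sentence length in words and
    the share of words longer than six characters."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "long_word_ratio": 0.0}
    long_words = [w for w in words if len(w) > 6]
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "long_word_ratio": len(long_words) / len(words),
    }


def prob_easy(text):
    """Soft classification: return P(easy-to-read) via a logistic
    function instead of returning a hard class label."""
    f = features(text)
    z = WEIGHTS["bias"] + sum(WEIGHTS[name] * value for name, value in f.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_readability(texts):
    """Order texts from most to least probably easy-to-read."""
    return sorted(texts, key=prob_easy, reverse=True)


texts = [
    "Myndigheten ansvarar för handläggningen av ansökningar om ekonomiskt "
    "bistånd enligt gällande lagstiftning och föreskrifter.",
    "Katten sover. Solen skiner. Vi går ut.",
]
ranked = rank_by_readability(texts)
for t in ranked:
    print(f"{prob_easy(t):.2f}  {t}")
```

With these made-up weights, the text with short sentences and short words receives the higher probability and is ranked first. The same ranking step would work with any classifier that outputs class probabilities.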

Abstract [sv]

Detta examensarbete utforskar möjligheterna att bedöma svenska texters läsbarhet med hjälp av maskininlärning. Ett system som använder fyra nivåer av lingvistisk analys har implementerats och testats med fyra olika etablerade algoritmer för maskininlärning. Det nya angreppssättet har sedan jämförts med etablerade läsbarhetsmått för svenska. Resultaten visar att den nya metoden fungerar markant bättre för läsbarhetsklassning av både meningar och hela dokument. Systemet har också testats med så kallad mjuk klassificering som ger ett sannolikhetsvärde för en given texts läsbarhetsgrad. Detta sannolikhetsvärde kan användas för rangordna texter baserad på sannolik läsbarhetsgrad.

Place, publisher, year, edition, pages
2012. 76 p.
Keyword [en]
Readability, Natural Language Processing, Computational Linguistics, Machine Learning, Swedish
National Category
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-78107
ISRN: LIU-IDA/LITH-EX-A--12/023--SE
OAI: oai:DiVA.org:liu-78107
DiVA: diva2:531389
Subject / course
Computer and information science at the Institute of Technology
Uppsok
Technology
Available from: 2012-06-08. Created: 2012-06-07. Last updated: 2012-06-08. Bibliographically approved.

Open Access in DiVA

fulltext (10025 kB), 308 downloads
File information
File name: FULLTEXT01.pdf
File size: 10025 kB
Checksum (SHA-512): 3d6027a9294d4091dc18579e32d7984f25168bd8e56c0f8e6e7b3f8b0cf819850057894162bd2c71c941101c0f57a52f9213796fd045e67dad9d564be1ca2cbf
Type: fulltext
Mimetype: application/pdf

Total: 308 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 677 hits