Classifying easy-to-read texts without parsing
Linköpings universitet, Institutionen för datavetenskap. Linköpings universitet, Tekniska högskolan. ORCID iD: 0000-0002-6357-4461
Linköpings universitet, Institutionen för datavetenskap. Linköpings universitet, Tekniska högskolan. ORCID iD: 0000-0003-4899-588X
2014 (English). In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), Association for Computational Linguistics, 2014, pp. 114-122. Conference paper, published paper (peer-reviewed)
Abstract [en]

Document classification using automated linguistic analysis and machine learning (ML) has been shown to be a viable road forward for readability assessment. The best models can be trained to decide whether a text is easy to read with very high accuracy; for example, a model using 117 parameters from shallow, lexical, morphological and syntactic analyses achieves 98.9% accuracy. In this paper we compare models created by parameter optimization over subsets of that total model, to find out to what extent different high-performing models tend to consist of the same parameters, and whether it is possible to find models that use only features not requiring parsing. We used a genetic algorithm to systematically optimize parameter sets of fixed sizes, using the accuracy of a Support Vector Machine classifier as the fitness function. Our results show that it is possible to find models almost as good as the currently best models while omitting parsing-based features.
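
The search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over fixed-size feature subsets. This is not the authors' implementation: the selection scheme, crossover, mutation, population size and all function names below are assumptions chosen for brevity. The fitness function is pluggable; the paper uses Support Vector Machine accuracy, whereas the `toy_fitness` here is a hypothetical stand-in so the example runs with the standard library only.

```python
import random
from typing import Callable, FrozenSet, List, Sequence

Subset = FrozenSet[int]


def random_subset(candidates: Sequence[int], k: int, rng: random.Random) -> Subset:
    """Draw k distinct feature indices from the allowed candidates."""
    return frozenset(rng.sample(list(candidates), k))


def crossover(a: Subset, b: Subset, k: int, rng: random.Random) -> Subset:
    """Build a child of exactly k features drawn from the union of both parents."""
    return frozenset(rng.sample(sorted(a | b), k))


def mutate(s: Subset, candidates: Sequence[int], rate: float, rng: random.Random) -> Subset:
    """With probability `rate`, swap one selected feature for an unselected one."""
    unused = sorted(set(candidates) - s)
    if unused and rng.random() < rate:
        drop = rng.choice(sorted(s))
        return frozenset((s - {drop}) | {rng.choice(unused)})
    return s


def select_features(
    fitness: Callable[[Subset], float],
    candidates: Sequence[int],
    k: int,
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.3,
    seed: int = 0,
) -> Subset:
    """Search for a size-k subset of `candidates` with high fitness."""
    rng = random.Random(seed)
    population: List[Subset] = [random_subset(candidates, k, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection with elitism: the better half survives unchanged.
        elite = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children: List[Subset] = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, k, rng), candidates, mutation_rate, rng))
        population = elite + children
    return max(population, key=fitness)


# Hypothetical toy fitness, for demonstration only: pretend features 1, 4, 7
# and 10 are the informative ones. A real run would return classifier accuracy.
INFORMATIVE = frozenset({1, 4, 7, 10})


def toy_fitness(subset: Subset) -> float:
    return len(subset & INFORMATIVE) / len(subset)


# Assumption: indices 15..19 stand for parsing-based features, so they are
# simply left out of the search space.
non_parsing = list(range(15))
best = select_features(toy_fitness, non_parsing, k=4)
```

Restricting `candidates` to features that do not require parsing is one simple way to express the paper's second question in this sketch. To approximate the setup in the abstract, `toy_fitness` could be replaced by a function that trains a Support Vector Machine on the selected feature columns and returns its cross-validated accuracy; since each evaluation is then expensive, caching fitness values per subset would be advisable.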

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2014. pp. 114-122
Keywords [en]
Readability, Readability Assessment, Genetic optimization, Machine Learning, Support Vector Machine
HSV category
Identifiers
URN: urn:nbn:se:liu:diva-117547
ISBN: 978-1-937284-91-6 (print)
OAI: oai:DiVA.org:liu-117547
DiVA, id: diva2:809460
Conference
14th Conference of the European Chapter of the Association for Computational Linguistics
Available from: 2015-05-04 Created: 2015-05-04 Last updated: 2018-01-11 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authors

Falkenjack, Johan; Jönsson, Arne
