Applying and Optimising a Multi-Scale Probit Model for Cross-Source Text Complexity Classification and Ranking in Swedish
Linköping University.
Linköping University, Faculty of Science & Engineering. Linköping University, Department of Computer and Information Science, Human-Centered Systems. ORCID iD: 0000-0002-6357-4461
Linköping University, Department of Computer and Information Science, Human-Centered Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-4899-588X
2025 (English). In: Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), 2025. Conference paper, Published paper (Other academic)
Abstract [en]

We present results from using Probit models to classify and rank texts of varying complexity from multiple sources. We use multiple linguistic sources, including Swedish easy-to-read books, and investigate data augmentation and feature regularisation as optimisation methods for text complexity assessment. Multi-Scale and Single-Scale Probit models are implemented using different ratios of training data and then compared. Overall, the findings suggest that the Multi-Scale Probit model is an effective method for classifying text complexity and ranking new texts, and that it could be used to improve performance on small datasets as well as to normalise datasets labelled using different scales.

Place, publisher, year, edition, pages
2025.
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-212606
OAI: oai:DiVA.org:liu-212606
DiVA, id: diva2:1947123
Conference
Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Available from: 2025-03-25 Created: 2025-03-25 Last updated: 2025-04-02 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Falkenjack, Johan; Jönsson, Arne

Search in DiVA

By author/editor
Andersson, Elsa; Falkenjack, Johan; Jönsson, Arne