Controllable Sentence Simplification in Swedish using Control Prefixes and Mined Paraphrases
Monsen, Julius. Linköping University, Department of Computer and Information Science. Linköping University, Faculty of Science & Engineering.
Jönsson, Arne. Linköping University, Department of Computer and Information Science, Human-Centered Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-4899-588X
2024 (English). In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) / [ed] Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti and Nianwen Xue, ELRA and ICCL, 2024, p. 3943-3954. Conference paper, Published paper (Refereed)
Abstract [en]

Making information accessible to diverse target audiences, including individuals with dyslexia and cognitive disabilities, is crucial. Automatic Text Simplification (ATS) systems aim to facilitate readability and comprehension by reducing linguistic complexity. However, they often lack customizability to specific user needs, and training data for smaller languages can be scarce. This paper addresses ATS in a Swedish context, using methods that provide more control over the simplification. A dataset of Swedish paraphrases is mined from large amounts of text and used to train ATS models utilizing prefix-tuning with control prefixes. We also introduce a novel data-driven method for selecting complexity attributes for controlling the simplification and compare it with previous approaches. Evaluation of the trained models using SARI and BLEU demonstrates significant improvements over the baseline -- a fine-tuned Swedish BART model -- and compared to previous Swedish ATS results. These findings highlight the effectiveness of employing paraphrase data in conjunction with controllable generation mechanisms for simplification. Additionally, the set of explored attributes yields similar results compared to previously used attributes, indicating their ability to capture important simplification aspects.

Place, publisher, year, edition, pages
ELRA and ICCL, 2024. p. 3943-3954
Series
COLING, ISSN 2951-2093
Keywords [en]
natural language generation, simplification, text mining
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-204450
ISBN: 9782493814104 (electronic)
OAI: oai:DiVA.org:liu-204450
DiVA, id: diva2:1868015
Conference
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 20-25 May 2024, Torino, Italia
Available from: 2024-06-11. Created: 2024-06-11. Last updated: 2025-08-20. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Jönsson, Arne

Search in DiVA

By author/editor
Monsen, Julius; Jönsson, Arne
By organisation
Department of Computer and Information Science; Faculty of Science & Engineering; Human-Centered Systems
Natural Language Processing
