Publications (10 of 110)
Andersson, E., Falkenjack, J. & Jönsson, A. (2025). Applying and Optimising a Multi-Scale Probit Model for Cross-Source Text Complexity Classification and Ranking in Swedish. In: Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025). Paper presented at Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025).
2025 (English). In: Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), 2025. Conference paper, Published paper (Other academic)
Abstract [en]

We present results from using Probit models to classify and rank texts of varying complexity from multiple sources. We use multiple linguistic sources, including Swedish easy-to-read books, and investigate data augmentation and feature regularisation as optimisation methods for text complexity assessment. Multi-Scale and Single-Scale Probit models are implemented using different ratios of training data, and then compared. Overall, the findings suggest that the Multi-Scale Probit model is an effective method for classifying text complexity and ranking new texts, and that it could be used to improve performance on small datasets as well as to normalise datasets labelled using different scales.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-212606 (URN)
Conference
Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Available from: 2025-03-25 Created: 2025-03-25 Last updated: 2025-04-02. Bibliographically approved
Jönsson, A., Bandyopadhyay, S., Pantic-Dragisic, S. & Fried, A. (2024). Analyses of information security standards on data crawled from company web sites using SweClarin resources. In: Selected papers from the CLARIN Annual Conference 2023. Paper presented at CLARIN Annual Conference 2023, Leuven, Belgium, 16-18 October 2023.
2024 (English). In: Selected papers from the CLARIN Annual Conference 2023, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

With the purpose of analysing Swedish companies' adherence to and adoption of the information security standard ISO 27001, and of examining the communicative constitution of preventive innovation in organisations, we have created a corpus of corporate texts from Swedish company websites. The corpus was analysed from multiple interdisciplinary perspectives in close cooperation between management researchers and SweClarin researchers, using SweClarin tools and resources as well as standard language technology tools. Some analyses require deep reading, which was performed by management researchers, often guided by results from language analyses. Initial results have been presented at a management studies conference. In this paper, we focus on presenting the research issues, the methods used in the project, the results, and the experience of SweClarin researchers supporting researchers in social sciences. Our contribution is to show how it is possible, through the integration of human insights and digital methods, to increase the credibility and validity of a digitally acquired data set and subsequent research findings. In our view, a combination of human deep reading (management researchers), contextual lexical verification (management studies) and language technology (content and sentiment analysis) can help to sensitise computational text analysis for medium-sized data sets.

Series
Linköping Electronic Conference Proceedings, ISSN 1650-3686, E-ISSN 1650-3740 ; 210
National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-204447 (URN); 10.3384/ecp210012 (DOI); 978-91-8075-740-9 (ISBN)
Conference
CLARIN Annual Conference 2023, Leuven, Belgium, 16-18 October 2023
Available from: 2024-06-11 Created: 2024-06-11 Last updated: 2025-04-11. Bibliographically approved
Holmer, D. & Jönsson, A. (2024). Auxiliary Techniques to Help Readers Understand Texts. In: Papers from The Tenth Swedish Language Technology Conference (SLTC). Paper presented at The Tenth Swedish Language Technology Conference (SLTC), November 27-29, 2024.
2024 (English). In: Papers from The Tenth Swedish Language Technology Conference (SLTC), 2024. Conference paper, Published paper (Other academic)
Abstract [en]

We explore three auxiliary techniques for automatic text adaptation (ATA), namely epithets for nouns, explanations for keywords, and syllabification, to aid reading for individuals with reading difficulties. In an initial evaluation, we conducted a study with individuals possessing average reading skills. Results indicate that while all three techniques demonstrate high accuracy, their usefulness varies. Epithets were found to be less beneficial, possibly due to the introduction of excessive information, although they may assist certain populations, such as individuals with intellectual disabilities. Keyword explanations were generally helpful and accurate, though occasional inaccuracies arose with rare or domain-specific terms. The effectiveness of syllabification was found to be contingent on the specific words being processed. These findings suggest that while ATA techniques can improve reading accessibility, their varying impacts highlight the need for tailored approaches based on the reader's needs.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-212604 (URN)
Conference
The Tenth Swedish Language Technology Conference (SLTC), November 27-29, 2024
Available from: 2025-03-25 Created: 2025-03-25 Last updated: 2025-04-02. Bibliographically approved
Monsen, J. & Jönsson, A. (2024). Controllable Sentence Simplification in Swedish using Control Prefixes and Mined Paraphrases. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Paper presented at Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
2024 (English). In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Making information accessible to diverse target audiences, including individuals with dyslexia and cognitive disabilities, is crucial. Automatic Text Simplification (ATS) systems aim to facilitate readability and comprehension by reducing linguistic complexity. However, they often lack customizability to specific user needs, and training data for smaller languages can be scarce. This paper addresses ATS in a Swedish context, using methods that provide more control over the simplification. A dataset of Swedish paraphrases is mined from large amounts of text and used to train ATS models utilizing prefix-tuning with control prefixes. We also introduce a novel data-driven method for selecting complexity attributes for controlling the simplification and compare it with previous approaches. Evaluation of the trained models using SARI and BLEU demonstrates significant improvements over the baseline -- a fine-tuned Swedish BART model -- and compared to previous Swedish ATS results. These findings highlight the effectiveness of employing paraphrase data in conjunction with controllable generation mechanisms for simplification. Additionally, the set of explored attributes yields similar results compared to previously used attributes, indicating their ability to capture important simplification aspects.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-204450 (URN)
Conference
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Available from: 2024-06-11 Created: 2024-06-11 Last updated: 2025-02-07. Bibliographically approved
Brissman, W. & Jönsson, A. (2024). Exploring and Analyzing Differences Across Levels of Readability in Easy-to-Read Text. In: Papers from The Tenth Swedish Language Technology Conference (SLTC). Paper presented at The Tenth Swedish Language Technology Conference (SLTC), November 27-29, 2024.
2024 (English). In: Papers from The Tenth Swedish Language Technology Conference (SLTC), 2024. Conference paper, Published paper (Other academic)
Abstract [en]

In this paper, we present results from investigations of text complexity using cohesion measures and their importance related to other text complexity measures. To provide additional nuance, we introduce the interrelated concepts of epistemic stance and narrativity, deepening the analysis of the statistical findings. These concepts also facilitate further discussion on complexity and cohesion as they relate to reading skills and knowledge asymmetries. We employ principal component analysis (PCA) to uncover these statistical relationships on a broader scale, while conducting more specific in-depth analyses of certain metrics. Our findings, which mostly align with existing literature, reaffirm the significance of narrativity in contextualizing cohesion. However, we unexpectedly found a clear link between higher complexity and less narrative text. Additionally, the PCA reveals a more nuanced picture of referential cohesion and the use of its constituent metrics, which varies depending on both narrativity and complexity.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-212605 (URN)
Conference
The Tenth Swedish Language Technology Conference (SLTC), November 27-29, 2024
Available from: 2025-03-25 Created: 2025-03-25 Last updated: 2025-04-02. Bibliographically approved
Jönsson, A., Bandyopadhyay, S., Pantic-Dragisic, S. & Fried, A. (2023). Analyses of information security standards on data crawled from company web sites using SweClarin resources. In: Krister Lindén, Jyrki Niemi, and Thalassia Kontino (Ed.), CLARIN Annual Conference Proceedings: 2023. Paper presented at Proceedings of the 2023 CLARIN Annual Conference, Leuven, Belgium, 16 – 18 October, 2023.
2023 (English). In: CLARIN Annual Conference Proceedings: 2023 / [ed] Krister Lindén, Jyrki Niemi, and Thalassia Kontino, 2023. Conference paper, Poster (with or without abstract) (Other academic)
Series
CLARIN Annual Conference Proceedings, E-ISSN 2773-2177
National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-198914 (URN)
Conference
Proceedings of the 2023 CLARIN Annual Conference, Leuven, Belgium, 16 – 18 October, 2023.
Available from: 2023-11-01 Created: 2023-11-01 Last updated: 2025-03-03. Bibliographically approved
Ahrenberg, L., Holmer, D., Holmlid, S. & Jönsson, A. (2023). Analysing changes in official use of the design concept using SweCLARIN resources. In: Tomaž Erjavec and Maria Eskevich (Ed.), Selected papers from the CLARIN Annual Conference 2022. Paper presented at CLARIN Annual Conference, 10-12 October 2022, Prague, Czechia. Linköping: Linköping University Electronic Press
2023 (English). In: Selected papers from the CLARIN Annual Conference 2022 / [ed] Tomaž Erjavec and Maria Eskevich, Linköping: Linköping University Electronic Press, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate changes in the use of four Swedish words from the fields of design and architecture. It has been suggested that their meanings have been blurred, especially in governmental reports and policy documents, so that distinctions between them that are important to stakeholders in the respective fields are lost. Specifically, we compare usage in two governmental public reports on design, one from 1999 and the other from 2015, and additionally in opinion responses to the 2015 report. Our approach is to contextualise occurrences of the words in different representations of the texts using word embeddings, topic modelling and sentiment analysis. Tools and language resources developed within the SweClarin infrastructure have been crucial for the implementation of the study.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2023
Series
Linköping Electronic Conference Proceedings, ISSN 1650-3686, E-ISSN 1650-3740
National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-194933 (URN); 10.3384/ecp198001 (DOI); 9789180752541 (ISBN)
Conference
CLARIN Annual Conference, 10-12 October 2022, Prague, Czechia
Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2025-02-07. Bibliographically approved
Graichen, E. & Jönsson, A. (2023). Context-aware Swedish Lexical Simplification. In: Sanja Štajner, Horacio Saggio, Matthew Shardlow, Fernando Alva-Manchego (Ed.), Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability. Paper presented at Second Workshop on Text Simplification, Accessibility and Readability, TSAR 2023 (pp. 11-20). Shoumen
2023 (English). In: Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability / [ed] Sanja Štajner, Horacio Saggio, Matthew Shardlow, Fernando Alva-Manchego, Shoumen, 2023, p. 11-20. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Shoumen, 2023
National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-198913 (URN)978-954-452-086-1 (ISBN)
Conference
Second Workshop on Text Simplification, Accessibility and Readability, TSAR 2023
Note

The article is licensed under a Creative Commons Attribution 4.0 International License.

Available from: 2023-11-01 Created: 2023-11-01 Last updated: 2025-02-07. Bibliographically approved
Ahrenberg, L., Holmer, D., Holmlid, S. & Jönsson, A. (2022). Analysing Changes in Official Use of the Design Concept Using SweCLARIN Resources. In: Proceedings of the CLARIN Annual meeting. Paper presented at CLARIN Annual Conference 2022, Prague, Czechia, 10 - 12 October, 2022.
2022 (English). In: Proceedings of the CLARIN Annual meeting, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

We show how the tools and language resources developed within the SweClarin infrastructure can be used to investigate changes in the use and understanding of the related Swedish words arkitektur, design, form, and formgivning. Specifically, we compare their use in two governmental public reports on design, one from 1999 and the other from 2015. We test the hypothesis that their meaning has developed in a way that blurs distinctions that may be important to stakeholders in the respective fields.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:liu:diva-190562 (URN)
Conference
CLARIN Annual Conference 2022, Prague, Czechia, 10 - 12 October, 2022
Available from: 2022-12-13 Created: 2022-12-13 Last updated: 2025-02-07. Bibliographically approved
Fried, A., Pantic-Dragisic, S., Jönsson, A. & Mirtsch, M. (2022). Communicating preventive innovation - the case of the information security standard ISO/IEC 27001. Paper presented at European Group of Organization Studies Colloquium 2022, Subtheme 6 on Performing creativity, innovation, and change: communicating to reconfigure the organization, Wirtschaftsuniversität Wien, Austria.
2022 (English). Conference paper, Oral presentation only (Other academic)
Abstract [en]

Preventive innovation differs from ordinary innovation. The innovation diffusion literature claims that the economic benefits of preventive innovation to adopters, such as ensuring information security, are mainly intangible and often time-delayed and sometimes only adopted for incidents that may never occur. Adopter communication about preventive innovation therefore seems to be crucial.

Using the example of the information security standard ISO/IEC 27001, we examine how communication of preventive innovations is shaped by its adopters. By analyzing texts about the information security standard ISO/IEC 27001 on Swedish corporate websites using computational linguistics tools and classical content analysis, we could identify, first, different adoption approaches of preventive innovation driven, second, by three modes of data governance: agency, stewardship and brokerage. Third, we provide evidence that the communication of preventive innovation depends on its data governance mode, but, fourth, also on the economic benefits of preventive innovation for adopters.

Our contribution to the innovation literature is twofold. First, the concept of preventive innovation originally presented by Rogers (1995) is revived and further developed. Comparing it to its original scope, we show that preventive innovation can be meaningful for adopting organizations not only when they go through all possible adoption phases identified by Rogers (1995). Also an economic benefit from preventive innovation is possible. Both aspects, adoption approach as well as economic opportunity strongly shape the production of meaning in communication about preventive innovation. Second, we show that computational linguistics can support qualitative research in the study of meaning production in communication, especially when dealing with large amounts of data, for instance, gained from corporate websites.

National Category
Business Administration; Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-187636 (URN)
Conference
European Group of Organization Studies Colloquium 2022, Subtheme 6 on Performing creativity, innovation, and change: communicating to reconfigure the organization, Wirtschaftsuniversität Wien, Austria
Available from: 2022-08-17 Created: 2022-08-17 Last updated: 2023-11-09
Identifiers
ORCID iD: orcid.org/0000-0003-4899-588X
