Visually Guided Extraction of Prevalent Topics
Linnaeus University, Sweden. ORCID iD: 0000-0001-6150-0787
Blekinge Institute of Technology, Sweden. ORCID iD: 0000-0001-6745-4398
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. (iVis, INV) ORCID iD: 0000-0002-1907-7820
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. Linnaeus University, Department of Computer Science and Media Technology (DM). (iVis, INV) ORCID iD: 0000-0002-0519-2537
2025 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 24, no 2, p. 179-198. Article in journal (Refereed). Published
Abstract [en]

Making sense of large sets of text documents is highly challenging for tasks such as obtaining a comprehensive overview or keeping up with the most important trends and topics. Even though several established methods for condensation and summarization of large text corpora exist, many of them lack the ability to account for differences in prevalence between identified topics, which in turn impedes quantitative analysis. In this paper, we therefore propose a novel prevalence-aware method for topic extraction, and show how it can be used to obtain important insights from two text corpora with very different content. We also implemented a prototype visual analytics tool which guides the user in the search for relevant insights and promotes trust in the yielded results. We have verified our application by a user study, as well as by a validation run on a data set with previously known topic structure. The results clearly show that our approach is suitable for text mining, that it can be used by non-experts, and that it offers features which make it an interesting candidate for use in several different analysis scenarios.

Place, publisher, year, edition, pages
Sage Publications, 2025. Vol. 24, no 2, p. 179-198
Keywords [en]
Visual Analytics, Text Mining, Text Embedding, Topic Modelling, Similarity Calculations
National Category
Computer Sciences; Human Computer Interaction
Identifiers
URN: urn:nbn:se:liu:diva-210850
DOI: 10.1177/14738716241312400
ISI: 001408697200001
Scopus ID: 2-s2.0-85216198128
OAI: oai:DiVA.org:liu-210850
DiVA, id: diva2:1925676
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Note

This work was partially supported through the ELLIIT environment for strategic research in Sweden. The work of Ilir Jusufi was supported in part by the Knowledge Foundation, Sweden, through the project ”Rekryteringar 21, Universitetslektor i spelteknik” under Contract 20210077.

Available from: 2025-01-09 Created: 2025-01-09 Last updated: 2025-03-19

Authors
Witschard, Daniel; Jusufi, Ilir; Kucher, Kostiantyn; Kerren, Andreas