From Word Clouds to Word Rain: Revisiting the Classic Word Cloud to Visualize Climate Change Texts
2024 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 23, no 3, p. 217-238. Article in journal (Refereed). Published.
Abstract [en]
Word Rain is a development of the classic word cloud. It addresses some of the limitations of word clouds, in particular the lack of a semantically motivated positioning of the words, and the use of font size as a sole indicator of word prominence. Word Rain uses the semantic information encoded in a distributional semantics-based language model – reduced into one dimension – to position the words along the x-axis. Thereby, the horizontal positioning of the words reflects semantic similarity. Font size is still used to signal word prominence, but this signal is supplemented with a bar chart, as well as with the position of the words on the y-axis. We exemplify the use of Word Rain by three concrete visualization tasks, applied on different real-world texts and document collections on climate change. In these case studies, word2vec models, reduced to one dimension with t-SNE, are used to encode semantic similarity, and TF-IDF is used for measuring word prominence. We evaluate the technique further by carrying out domain expert reviews.
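As a rough illustration of the pipeline the abstract describes, the sketch below scores words with TF-IDF and combines each score with a one-dimensional semantic coordinate to produce plot parameters. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `tf_idf` and `layout`, the smoothed IDF formula, the font-size range, and the mapping from prominence to vertical position are all hypothetical. The x-coordinates are assumed to come from word vectors already reduced to one dimension (the paper uses word2vec with t-SNE); that step is not shown, and the values used here are made up.

```python
import math
from collections import Counter


def tf_idf(target_tokens, background_docs):
    """Score each word in the target text by TF-IDF against background documents."""
    # The target text counts as one document alongside the background ones.
    doc_sets = [set(doc) for doc in background_docs] + [set(target_tokens)]
    n_docs = len(doc_sets)
    counts = Counter(target_tokens)
    scores = {}
    for word, count in counts.items():
        tf = count / len(target_tokens)
        df = sum(1 for doc in doc_sets if word in doc)
        # Assumption: smoothed IDF, so words shared by all documents keep a small score.
        idf = math.log(n_docs / df) + 1.0
        scores[word] = tf * idf
    return scores


def layout(scores, x_coords, min_font=8.0, max_font=40.0):
    """Combine prominence scores with a 1-D semantic coordinate per word."""
    top = max(scores.values())
    placed = []
    for word, score in scores.items():
        if word not in x_coords:
            continue  # no semantic coordinate available for this word
        rel = score / top  # prominence relative to the top word, in (0, 1]
        placed.append({
            "word": word,
            "x": x_coords[word],  # horizontal position reflects semantic similarity
            "font_size": min_font + rel * (max_font - min_font),
            "bar_height": rel,    # bar chart supplements the font-size signal
            "y": rel,             # assumption: vertical position also tracks prominence
        })
    return sorted(placed, key=lambda item: item["x"])


# Toy data: tokens and x-coordinates are invented for illustration only.
target = "sea level rise threatens coastal cities as sea ice melts".split()
background = ["the city council met today".split(), "sea trade routes expanded".split()]
x_coords = {"sea": 0.10, "ice": 0.15, "coastal": 0.30, "cities": 0.80}

for item in layout(tf_idf(target, background), x_coords):
    print(item["word"], round(item["x"], 2), round(item["font_size"], 1))
```

Words with nearby x-coordinates end up next to each other in the output, while font size, bar height, and vertical position all carry the same prominence signal.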
Place, publisher, year, edition, pages
Sage Publications, 2024. Vol. 23, no 3, p. 217-238
Keywords [en]
word cloud, tag cloud, text visualization, digital humanities, climate change data, text and document data
National Category
Computer Sciences; Human Computer Interaction; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-201979; DOI: 10.1177/14738716241236188; ISI: 001193261300001; OAI: oai:DiVA.org:liu-201979; DiVA, id: diva2:1847837
Funder
Swedish Research Council, 2021-00176; Swedish Research Council, 2021-00181; Swedish Research Council, 2017-00626
Note
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. Word Rain has been developed with funding from three research infrastructures:
• Huminfra: National infrastructure for Research in the Humanities and Social Sciences (Swedish Research Council, 2021-00176)
• InfraVis: the Swedish National Research Infrastructure for Data Visualization (Swedish Research Council, 2021-00181)
• Nationella Språkbanken: The National Language Bank of Sweden (Swedish Research Council, 2017-00626)
Funding: Huminfra: National infrastructure for Research in the Humanities and Social Sciences (Swedish Research Council) [2021-00176]; InfraVis: Swedish National Research Infrastructure for Data Visualization (Swedish Research Council) [2021-00181]; Nationella Språkbanken: National Language Bank of Sweden (Swedish Research Council) [2017-00626]
2024-03-29; 2024-03-29; 2024-06-17