Algorithmically Guided Information Visualization: Explorative Approaches for High Dimensional, Mixed and Categorical Data
Linköpings universitet, Institutionen för teknik och naturvetenskap, Medie- och Informationsteknik. Linköpings universitet, Tekniska högskolan.
2011 (English). Doctoral thesis with papers (Other academic). Alternative title:
Algoritmiskt vägledd informationsvisualisering för högdimensionell och kategorisk data (Swedish)
Abstract [en]

Facilitated by the technological advances of the last decades, increasing amounts of complex data are being collected within fields such as biology, chemistry and social sciences. The major challenge today is not to gather data, but to extract useful information and gain insights from it. Information visualization provides methods for visual analysis of complex data but, as the amounts of gathered data increase, the challenges of visual analysis become more complex.

This thesis presents work utilizing algorithmically extracted patterns as guidance during interactive data exploration processes, employing information visualization techniques. It provides efficient analysis by taking advantage of fast pattern identification techniques as well as making use of the domain expertise of the analyst. In particular, the presented research is concerned with the issues of analysing categorical data, where the values are names without any inherent order or distance; mixed data, including a combination of categorical and numerical data; and high dimensional data, including hundreds or even thousands of variables.

The contributions of the thesis include a quantification method, assigning numerical values to categorical data, which utilizes an automated method to define category similarities based on underlying data structures, and integrates relationships within numerical variables into the quantification when dealing with mixed data sets. The quantification is incorporated in an interactive analysis pipeline where it provides suggestions for numerical representations, which may interactively be adjusted by the analyst. The interactive quantification enables exploration using commonly available visualization methods for numerical data. Within the context of categorical data analysis, this thesis also contributes the first user study evaluating the performance of what are currently the two main visualization approaches for categorical data analysis.

Furthermore, this thesis contributes two dimensionality reduction approaches, which aim at preserving structure while reducing dimensionality, and provide flexible and user-controlled dimensionality reduction. Through algorithmic quality metric analysis, where each metric represents a structure of interest, potentially interesting variables are extracted from the high dimensional data. The automatically identified structures are visually displayed, using various visualization methods, and act as guidance in the selection of interesting variable subsets for further analysis. The visual representations furthermore provide overview of structures within the high dimensional data set and may, through this, aid in focusing subsequent analysis, as well as enabling interactive exploration of the full high dimensional data set and selected variable subsets. The thesis also contributes the application of algorithmically guided approaches for high dimensional data exploration in the rapidly growing field of microbiology, through the design and development of a quality-guided interactive system in collaboration with microbiologists.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2011. p. 72
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1400
Keywords [en]
Information visualization, data mining, high dimensional data, categorical data, mixed data
Identifiers
URN: urn:nbn:se:liu:diva-70860; ISBN: 978-91-7393-056-7 (print); OAI: oai:DiVA.org:liu-70860; DiVA, id: diva2:445884
Public defence
2011-11-11, Domen, Norrköpings Visualiseringscenter, Kungsgatan 54, 602 33 Norrköping, 09:15 (English)
Available from: 2011-10-06 Created: 2011-09-20 Last updated: 2019-12-19. Bibliographically approved
List of papers
1. Interactive Quantification of Categorical Variables in Mixed Data Sets
2008 (English). In: Information Visualisation, 2008. IV '08. 12th International Conference / [ed] Ebad Banissi, Liz Stuart, Mikael Jern, Gennady Andrienko, Francis T. Marchese, Nasrullah Memon, Reda Alhajj, Theodor G Wyeld, Remo Aslak Burkhard, Georges Grinstein, Dennis Groth, Anna Ursyn, Carsten Maple, Anthony Faiola and Brock Craft, Los Alamitos, California: IEEE Computer Society, 2008, p. 3-10. Conference paper, Published paper (Refereed)
Abstract [en]

Data sets containing a combination of categorical and continuous variables (mixed data sets) are difficult to analyse since no generalized similarity measure exists for categorical variables. Quantification of categorical variables makes it possible to represent this type of data using techniques designed for numerical data. This paper presents a quantification process of categorical variables in mixed data sets that incorporates information on relationships among the continuous variables into the process, as well as utilizing the domain knowledge of a user. An interactive visualization environment using parallel coordinates as a visual interface is provided, where the user is able to control the quantification process and analyse the result. The efficiency of the approach is demonstrated using two mixed data sets.
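As a rough illustration of what quantification means here, the sketch below assigns each category the mean of one continuous variable over the rows in which that category occurs. This is a deliberately simplified stand-in and not the process described in the paper, which draws on relationships among several continuous variables and on user input. The function name `quantify_by_mean` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def quantify_by_mean(categories, values):
    """Map each category to the mean of a numerical variable over its rows.

    A simplified stand-in for quantification, not the paper's method.
    """
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)
    return {cat: mean(vals) for cat, vals in groups.items()}


# Hypothetical toy data: one categorical and one continuous variable.
colour = ["red", "blue", "red", "green", "blue"]
weight = [1.0, 4.0, 3.0, 10.0, 6.0]

scores = quantify_by_mean(colour, weight)

# The categorical column can now be drawn on a numerical axis,
# for example as one axis in a parallel coordinates plot.
numeric_colour = [scores[c] for c in colour]
```

Categories that behave alike with respect to the continuous variable end up close together on the resulting axis, which is the property a quantification is meant to provide.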

Place, publisher, year, edition, pages
Los Alamitos, California: IEEE Computer Society, 2008
Series
IEEE International Conference on Information Visualisation, ISSN 1550-6037
Keywords
Categorical data, mixed data, parallel coordinates, quantification, correspondence analysis, clustering
Identifiers
urn:nbn:se:liu:diva-43480 (URN); 10.1109/IV.2008.33 (DOI); 000259178400001 (); 73940 (Local ID); 978-0-7695-3268-4 (ISBN); 73940 (Archive number); 73940 (OAI)
Conference
12th International Conference Information Visualisation, IV '08, London, UK, 9-11 July 2008
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2025-02-18. Bibliographically approved
2. Visual Exploration of Categorical and Mixed Data Sets
2009 (English). In: Proceeding VAKD '09 Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration: Workshop on Visual Analytics and Knowledge Discovery, New York, USA: ACM Press, 2009, p. 21-29. Conference paper, Published paper (Refereed)
Abstract [en]

For categorical data there does not exist any similarity measure which is as straightforward and general as the numerical distance between numerical items. Due to this it is often difficult to analyse data sets including categorical variables or a combination of categorical and numerical variables (mixed data sets). Quantification of categorical variables enables analysis using commonly used visual representations and analysis techniques for numerical data. This paper presents a tool for exploratory analysis of categorical and mixed data, which uses a quantification process introduced in [Johansson2008]. The application enables analysis of mixed data sets by providing an environment for exploratory analysis using common visual representations in multiple coordinated views and algorithmic analysis that facilitates detection of potentially interesting patterns within combinations of categorical and numerical variables. The effectiveness of the quantification process and of the features of the application is demonstrated through a case scenario.

Place, publisher, year, edition, pages
New York, USA: ACM Press, 2009
Keywords
Information visualization, visual exploration, quantification, categorical data, mixed data, data mining
Identifiers
urn:nbn:se:liu:diva-25572 (URN); 10.1145/1562849.1562852 (DOI); 978-1-60558-670-0 (ISBN)
Conference
ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery, 28 June - 1 July, Paris, France
Available from: 2009-10-08 Created: 2009-10-08 Last updated: 2025-02-19. Bibliographically approved
3. Visual Analysis of Mixed Data Sets Using Interactive Quantification
2009 (English). In: ACM SIGKDD Explorations Newsletter, ISSN 1931-0145, Vol. 11, no 2, p. 29-38. Article in journal (Refereed) Published
Abstract [en]

It is often difficult to analyse data sets including a combination of categorical and numerical variables (mixed data sets) since there does not exist any similarity measure which is as straightforward and general as the numerical distance between numerical items. Quantification of categorical variables enables analysis using commonly used visual representations and analysis techniques for numerical data. This paper presents a tool for exploratory analysis of categorical and mixed data which uses a quantification process introduced in [Johansson2008]. The application enables analysis of mixed data sets by providing an environment for exploratory analysis using common visual representations in multiple coordinated views and algorithmic analysis that facilitates detection of potentially interesting patterns within combinations of categorical and numerical variables. The generality and usefulness of the quantification process and of the features of the application are demonstrated through a case scenario using a data set from the IEEE VAST 2008 Challenge.

Place, publisher, year, edition, pages
New York: ACM, 2009
Keywords
Information Visualization, Visual Analysis, Categorical Data, Quantification
Identifiers
urn:nbn:se:liu:diva-60142 (URN); 10.1145/1809400.1809406 (DOI)
Available from: 2010-10-06 Created: 2010-10-06 Last updated: 2025-02-18
4. A Task Based Performance Evaluation of Visualization Approaches for Categorical Data Analysis.
2011 (English). In: Proceedings - 15th International Conference on Information Visualisation, Los Alamitos, CA, USA: IEEE Computer Society, 2011, p. 80-89. Conference paper, Published paper (Other academic)
Abstract [en]

Categorical data is common within many areas and efficient methods for analysis are needed. It is, however, often difficult to analyse categorical data since no general measure of similarity exists. One approach is to represent the categories with numerical values (quantification) prior to visualization using methods for numerical data. Another is to use visual representations specifically designed for categorical data. Although both approaches are commonly used, very little guidance is available as to which method may be most useful for different analysis tasks. This paper presents an evaluation comparing the performance of employing quantification prior to visualization and visualization using a method designed for categorical data. It also provides guidance as to which visualization approach is most useful in the context of two basic data analysis tasks: one related to similarity structures and one related to category frequency. The results strongly indicate that the quantification approach is most efficient for the similarity related task, whereas the visual representation designed for categorical data is most efficient for the task related to category frequency.

Place, publisher, year, edition, pages
Los Alamitos, CA, USA: IEEE Computer Society, 2011
Keywords
Categorical Data, Quantitative Evaluation, Usability Studies, Parallel Sets, Quantification
Identifiers
urn:nbn:se:liu:diva-70855 (URN); 10.1109/IV.2011.92 (DOI); 978-1-4577-0868-8 (ISBN)
Conference
15th International Conference on Information Visualisation (IV), 2011, 13-15 July, London, UK
Available from: 2011-09-20 Created: 2011-09-20 Last updated: 2025-02-18
5. Interactive Dimensionality Reduction Through User-defined Combinations of Quality Metrics
2009 (English). In: IEEE Transactions on Visualization and Computer Graphics, ISSN 1077-2626, E-ISSN 1941-0506, Vol. 15, no 6, p. 993-1000. Article in journal (Refereed) Published
Abstract [en]

Multivariate data sets including hundreds of variables are increasingly common in many application areas. Most multivariate visualization techniques are unable to display such data effectively, and a common approach is to employ dimensionality reduction prior to visualization. Most existing dimensionality reduction systems focus on preserving one or a few significant structures in data. For many analysis tasks, however, several types of structures can be of high significance and the importance of a certain structure compared to the importance of another is often task-dependent. This paper introduces a system for dimensionality reduction by combining user-defined quality metrics using weight functions to preserve as many important structures as possible. The system aims at effective visualization and exploration of structures within large multivariate data sets and provides enhancement of diverse structures by supplying a range of automatic variable orderings. Furthermore it enables a quality-guided reduction of variables through an interactive display facilitating investigation of trade-offs between loss of structure and the number of variables to keep. The generality and interactivity of the system is demonstrated through a case scenario.
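The idea of combining quality metrics with user-defined weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the paper: the metric names, the per-variable scores, the constant weights (the paper speaks of weight functions) and the helper names `combine_metrics` and `keep_top` are all hypothetical.

```python
def combine_metrics(metric_scores, weights):
    """Weighted sum of per-variable quality metric values.

    metric_scores: {metric_name: {variable: score}}
    weights:       {metric_name: user-chosen weight}
    """
    variables = next(iter(metric_scores.values())).keys()
    return {
        var: sum(weights[m] * metric_scores[m][var] for m in metric_scores)
        for var in variables
    }


def keep_top(combined, k):
    """Return the k variables with the highest combined score."""
    return sorted(combined, key=combined.get, reverse=True)[:k]


# Hypothetical scores in [0, 1] for four variables and two metrics.
metric_scores = {
    "correlation": {"a": 0.5, "b": 0.25, "c": 1.0, "d": 0.0},
    "cluster": {"a": 0.5, "b": 0.5, "c": 0.0, "d": 0.25},
}
# The analyst considers correlation twice as important as clustering.
weights = {"correlation": 2, "cluster": 1}

combined = combine_metrics(metric_scores, weights)
subset = keep_top(combined, 2)
```

Changing the weights or `k` and recomputing is one simple way to explore the trade-off between the number of variables kept and the structure retained.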

Keywords
Dimensionality reduction, interactivity, quality metrics, variable ordering
Identifiers
urn:nbn:se:liu:diva-53092 (URN); 10.1109/TVCG.2009.153 (DOI); 19834164 (PubMedID)
Available from: 2010-01-15 Created: 2010-01-15 Last updated: 2025-02-18. Bibliographically approved
6. Visual Exploration of Microbial Populations
2011 (English). In: IEEE Symposium on Biological Data Visualization, 2011, p. 127-134. Conference paper, Published paper (Other academic)
Abstract [en]

Studies of the ecology of microbial populations are increasingly common within many research areas as the field of microbiomics develops rapidly. The study of the ecology in sampled microbial populations generates high dimensional data sets. Although many analysis methods are available for examination of such data, a tailored tool was required to fulfill the need of interactivity and flexibility for microbiologists. In this paper, MicrobiVis is presented. It is a tool for visual exploration and interactive analysis of microbiomic populations. MicrobiVis has been designed in close collaboration with end users. It extends previous interactive systems for explorative dimensionality reduction by including a range of domain relevant features. It contributes a flexible and explorative dimensionality reduction as well as a visual and interactive environment for examination of data subsets, combining information visualization and methods based on analytic tasks common in microbiology as a means for gaining new and relevant insights. The utility of MicrobiVis is demonstrated through a use case describing how a microbiologist may use the system for a visual analysis of a microbial data set. Its usability and potential are indicated through positive feedback from the current end users.

Keywords
Dimensionality reduction, information visualization, explorative analysis, microbiomics, bacterial population
Identifiers
urn:nbn:se:liu:diva-70852 (URN); 10.1109/BioVis.2011.6094057 (DOI); 978-1-4673-0003-2 (ISBN)
Conference
BioVis 2011 - the 1st IEEE Symposium on Biological Data Visualization, 23-24 October 2011, Providence, RI, USA
Available from: 2011-09-20 Created: 2011-09-20 Last updated: 2025-02-18. Bibliographically approved
7. Quality Based Guidance for Exploratory Dimensionality Reduction
2013 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 12, no 1, p. 44-64. Article in journal (Refereed) Published
Abstract [en]

High dimensional data sets containing hundreds of variables are difficult to explore, since traditional visualization methods often are unable to represent such data effectively. Dimensionality reduction is commonly employed prior to visualization to address this difficulty, and numerous dimensionality reduction methods are available. However, few dimensionality reduction approaches take the importance of several structures into account and few provide an overview of structures existing in the full high dimensional data set. For exploratory analysis, as well as for many other tasks, several structures may be of interest and exploration of the full high dimensional data set without reduction may also be desirable. This paper presents methods for exploratory analysis and interactive dimensionality reduction, where automated methods are employed to analyse and rank the variables using a range of quality metrics, providing one or more measures of ‘interestingness’ for individual variables. Through ranking, a single value of interestingness is obtained based on several quality metrics which is usable as a threshold for the most interesting variables. An interactive environment is presented where the user is provided many possibilities to explore and gain understanding of the structures within the high dimensional data set, all based on quality metrics and ranking. Guided by this, the analyst can explore the high dimensional data set and select interactively a subset of the potentially most interesting variables, employing various interactive methods for dimensionality reduction. The effectiveness and usefulness of the system is demonstrated through a use-case analysing data from a DNA sequence-based study of bacterial populations.
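One simple way to turn several quality metrics into a single rank-based value that can be thresholded is to sum each variable's rank position across the metrics. The sketch below is only an assumption about how such a ranking could look, not the procedure used in the paper; the metric names, scores and function names are hypothetical.

```python
def rank_positions(scores):
    """Rank variables by descending score; the best variable gets rank 1."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {var: pos for pos, var in enumerate(ordered, start=1)}


def interestingness(metric_scores):
    """Sum of per-metric ranks; a lower total means more interesting."""
    ranks = [rank_positions(s) for s in metric_scores.values()]
    return {var: sum(r[var] for r in ranks) for var in ranks[0]}


def select(metric_scores, threshold):
    """Keep the variables whose rank total does not exceed the threshold."""
    total = interestingness(metric_scores)
    return sorted(v for v, t in total.items() if t <= threshold)


# Hypothetical metric values for three variables.
metric_scores = {
    "correlation": {"v1": 0.9, "v2": 0.2, "v3": 0.6},
    "clustering": {"v1": 0.7, "v2": 0.1, "v3": 0.8},
}

totals = interestingness(metric_scores)
chosen = select(metric_scores, threshold=3)
```

Using ranks rather than raw scores avoids having to put differently scaled metrics on a common scale, at the cost of discarding how large the differences between variables are.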

Place, publisher, year, edition, pages
Palgrave Macmillan / SAGE Publications (UK and US), 2013
Keywords
High-dimensional data, dimensionality reduction, quality metrics, visual exploration, interactive visual analysis
Identifiers
urn:nbn:se:liu:diva-70859 (URN); 10.1177/1473871612460526 (DOI); 000315073700003 ()
Note

Funding agencies: Unilever Discover Port Sunlight; Swedish Research Council in the Linnaeus Centre CADICS; Visualization Programme

Available from: 2011-09-20 Created: 2011-09-20 Last updated: 2017-12-08. Bibliographically approved

Open Access in DiVA

Algorithmically Guided Information Visualization: Explorative Approaches for High Dimensional, Mixed and Categorical Data (4043 kB), 2669 downloads
File information
File name: FULLTEXT01.pdf. File size: 4043 kB. Checksum SHA-512:
7b9e0c108ff4129bc16048078753912c02540af0b2209ea8d29a3d7ceb5c3f6abdba124ea1b87b2ae2c40286fd334ed91aa2c991f52188145c7c351e2eea802b
Type: fulltext. Mimetype: application/pdf
Cover (2983 kB), 187 downloads
File information
File name: COVER01.pdf. File size: 2983 kB. Checksum SHA-512:
b67be1af35370f06229ab73c3f8d9e36877775ddf2e85057f4a730a2090b150e9bd73d6e54bb5a77f2a4d31e5b2edaab1088e30c929b67d5922d40df77a8cf72
Type: cover. Mimetype: application/pdf

Person

Johansson Fernstad, Sara
