Everyday mining: Exploring sequences in event-based data
Linköping University, Department of Science and Technology, Visual Information Technology and Applications (VITA). Linköping University, The Institute of Technology. ORCID iD: 0000-0003-4761-8601
2010 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Utforskning av sekvenser i händelsebaserade data (Swedish)
Abstract [en]

Event-based data are encountered daily in many disciplines and are used for various purposes. They are collections of ordered sequences of events where each event has a start time and a duration. Examples of such data include medical records, internet surfing records, transaction records, industrial process or system control records, and activity diary data.

This thesis is concerned with the exploration of event-based data, and in particular the identification and analysis of sequences within them. Sequences are interesting in this context since they enable the understanding of the evolving character of event data records over time. They can reveal trends, relationships and similarities across the data, allow for comparisons to be made within and between the records, and can also help predict forthcoming events. The presented work has researched methods for identifying and exploring such event-sequences which are based on modern visualization, interaction and data mining techniques.

An interactive visualization environment that facilitates analysis and exploration of event-based data has been designed and developed, which permits a user to freely explore different aspects of this data and visually identify interesting features and trends. Visual data mining methods have been developed within this environment, that facilitate the automatic identification and exploration of interesting sequences as patterns. The first method makes use of a sequence mining algorithm that identifies sequences of events as patterns, in an iterative fashion, according to certain user-defined constraints. The resulting patterns can then be displayed and interactively explored by the user. The second method has been inspired by web-mining algorithms and the use of graph similarity. A tree-inspired visual exploration environment has been developed that allows a user to systematically and interactively explore interesting event-sequences. Having identified interesting sequences as patterns it becomes interesting to further explore how these are incorporated across the data and classify the records based on the similarities in the way these sequences are manifested within them. In the final method developed in this work, a set of similarity metrics has been identified for characterizing event-sequences, which are then used within a clustering algorithm in order to find similarly behaving groups. The resulting clusters, as well as attributes of the clustering parameters and data records, are displayed in a set of linked views allowing the user to interactively explore relationships within these.

The research has been focused on the exploration of activity diary data for the study of individuals' time-use and has resulted in a powerful research tool facilitating understanding and thorough analysis of the complexity of everyday life.

Place, publisher, year, edition, pages
Norrköping: Linköping University Electronic Press, 2010. 76 p.
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1331
Keyword [en]
Event-based data, activity diary data, event-sequences, interactive exploration, sequence identification, visual data mining
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-58311
ISBN: 978-91-7393-343-8 (print)
OAI: oai:DiVA.org:liu-58311
DiVA: diva2:338152
Public defence
2010-09-10, Domteater, Norrköpings Visualiseringscenter C, Kungsgatan 54, 602 33 Norrköping, 09:15 (English)
Opponent
Supervisors
Available from: 2010-09-01 Created: 2010-08-10 Last updated: 2018-01-12 Bibliographically approved
List of papers
1. Capturing patterns of everyday life: presentation of the visualization method VISUAL-TimePAcTS
2006 (English) In: International Association for Time Use Research Annual Conference, Copenhagen, Denmark: Danish National Institute of Social Research, 2006. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Copenhagen, Denmark: Danish National Institute of Social Research, 2006
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-35802 (URN), 28606 (Local ID), 28606 (Archive number), 28606 (OAI)
Conference
International Association for Time-Use Research Conference, Copenhagen, Denmark
Note
CD proceedings hence no page numbers.
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2015-06-02
2. 2D and 3D Representations for Feature Recognition in Time Geographical Diary Data
2010 (English) In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 9, no 4, 263-276 p. Article in journal (Refereed) Published
Abstract [en]

Time geographical representations are becoming a common approach to analysing spatio-temporal data. Such representations appear intuitive in the process of identifying patterns and features as paths of populations form tracks through the 3D space, which can be seen converging and diverging over time. In this article, we compare 2D and 3D representations within a time geographical visual analysis tool for activity diary data. We identify a representative task and evaluate task performance between the two representations. The results show that the 3D representation has benefits over the 2D representation for feature identification but also indicate that these benefits can be lost if the 3D representation is not carefully constructed to help the user to see them.

Place, publisher, year, edition, pages
Palgrave Macmillan Ltd, 2010
Keyword
evaluation, 2D and 3D representation, time geography, time cube, time diary data
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-52360 (URN), 10.1057/ivs.2009.30 (DOI), 000284461700004 ()
Available from: 2009-12-17 Created: 2009-12-17 Last updated: 2017-12-12
3. Everyday Life Discoveries: Mining and Visualizing Activity Patterns in Social Science Diary Data
2007 (English) In: IEEE International Conference on Information Visualisation, 2007: ISSN 1550-6037 / [ed] Ebad Banissi, Remo Aslak Burkhard, Georges Grinstein, Urska Cvek, Marjan Trutschl, Liz Stuart, Theodor G Wyeld, Gennady Andrienko, Jason Dykes, Mikael Jern, Dennis Groth and Anna Ursyn, Los Alamitos, CA, USA: IEEE, 2007, 130-138 p. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to identify and examine patterns of activities is a key tool for social and behavioural science. In the past this has been done by statistical or purely visual methods but automated sequential pattern analysis through sophisticated data mining and visualization tools for pattern location and evaluation can open up new possibilities for interactive exploration of the data. This paper describes the addition of a sequential pattern identification method to the visual activity-analysis tool, VISUAL-TimePAcTS, and its effectiveness in the process of pattern analysis in social science diary data. The results have shown that the method correctly identifies patterns and conveys them effectively to the social scientist in a manner that allows them quick and easy understanding of the significance of the patterns.

Place, publisher, year, edition, pages
Los Alamitos, CA, USA: IEEE, 2007
Series
Information Visualization, ISSN 1550-6037
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-40111 (URN), 10.1109/IV.2007.48 (DOI), 52273 (Local ID), 0-7695-2900-3 (ISBN), 52273 (Archive number), 52273 (OAI)
Conference
11th International Conference in Information Visualization, 4-6 July, Zürich, Switzerland
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2015-06-02 Bibliographically approved
4. Exploring time diaries using semi-automated activity pattern extraction
2009 (English) In: electronic International Journal of Time Use Research (eIJTUR), Vol. 6, no 1, 1-25 p. Article in journal (Refereed) Published
Abstract [en]

Identifying patterns of activities in time diaries in order to understand the variety of daily life in terms of combinations of activities performed by individuals in different groups is of interest in time use research. So far, activity patterns have mostly been identified by visually inspecting representations of activity data or by using sequence comparison methods, such as sequence alignment, in order to cluster similar data and then extract representative patterns from these clusters. Both these methods are sensitive to data size, pure visual methods become too cluttered and sequence comparison methods become too time consuming. Furthermore, the patterns identified by both methods represent mostly general trends of activity in a population, while detail and unexpected features hidden in the data are often never revealed. We have implemented an algorithm that searches the time diaries and automatically extracts all activity patterns meeting user-defined criteria of what constitutes a valid pattern of interest for the user’s research question. Amongst the many criteria which can be applied are a time window containing the pattern, minimum and maximum occurrences of the pattern, and number of people that perform it. The extracted activity patterns can then be interactively filtered, visualized and analyzed to reveal interesting insights. Exploration of the results of each pattern search may result in new hypotheses which can be subsequently explored by altering the search criteria. To demonstrate the value of the presented approach we consider and discuss sequential activity patterns at a population level, from a single day perspective.
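
The kind of criteria-driven search described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a pattern is a run of consecutive diary events, and every name in it (extract_patterns, the event tuple layout, the sample diaries) is hypothetical. It only shows how a time window, minimum and maximum occurrence counts, and a minimum number of people could act as filters.

```python
from collections import defaultdict

# Assumption: an event is (activity, start_minute, duration_minutes) and a
# diary is a time-ordered list of events. These names are illustrative only.


def extract_patterns(diaries, length, window, min_occ, max_occ, min_people):
    """Return {pattern: (occurrences, people)} for patterns meeting the criteria."""
    occurrences = defaultdict(int)
    performers = defaultdict(set)
    for person, events in diaries.items():
        for i in range(len(events) - length + 1):
            run = events[i:i + length]
            first_start = run[0][1]
            last_end = run[-1][1] + run[-1][2]
            if last_end - first_start > window:
                continue  # the run does not fit inside the time window
            pattern = tuple(activity for activity, _, _ in run)
            occurrences[pattern] += 1
            performers[pattern].add(person)
    # Keep only patterns within the occurrence bounds and with enough people.
    return {
        p: (n, len(performers[p]))
        for p, n in occurrences.items()
        if min_occ <= n <= max_occ and len(performers[p]) >= min_people
    }


diaries = {
    "p1": [("sleep", 0, 420), ("cook", 420, 30), ("eat", 450, 20), ("work", 480, 480)],
    "p2": [("sleep", 0, 400), ("cook", 400, 20), ("eat", 420, 30), ("travel", 450, 40)],
    "p3": [("sleep", 0, 480), ("eat", 480, 15), ("work", 500, 450)],
}
result = extract_patterns(
    diaries, length=2, window=60, min_occ=2, max_occ=10, min_people=2
)
print(result)  # {('cook', 'eat'): (2, 2)}
```

Altering the criteria, for example widening the window or lowering min_people, and rerunning the search mirrors the iterative exploration the abstract describes.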

Keyword
Time-geography, diaries, everyday life, activity patterns, visualization, data mining, sequential pattern mining
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-51231 (URN)
Available from: 2009-10-22 Created: 2009-10-22 Last updated: 2015-03-20
5. Seeing Beyond Statistics: Visual Exploration of Productivity on a Construction Site
2008 (English) In: Vis 2008, Visualisation: Visualisation in Built and Rural Environments, Los Alamitos, CA, USA: IEEE Computer Society, 2008, 37-42 p. Conference paper, Published paper (Refereed)
Abstract [en]

Work on the construction site is known to be inefficient due to workers spending much time waiting for materials, transporting materials and from frequent interruptions of tasks. Studies on the construction site typically use statistical measures to analyse the sampled data about work and such measures, while very useful, can overlook important features of the data. In this paper we apply a previously developed approach, derived from Time Geographical methods, to visually represent the sampled construction productivity data and show that this method may enable the analyst to better understand the distribution of activities, and how they are interrelated and dependent upon each other. 

Place, publisher, year, edition, pages
Los Alamitos, CA, USA: IEEE Computer Society, 2008
Keyword
construction productivity, visual exploration, visualization, work sampling
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-43044 (URN), 10.1109/VIS.2008.27 (DOI), 000258052600006 (), 71159 (Local ID), 978-0-7695-3271-4 (ISBN), 71159 (Archive number), 71159 (OAI)
Conference
International Conference Visualisation (VIS 2008), London, UK, 9-11 July 2008
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2015-09-22 Bibliographically approved
6. ActiviTree: Interactive Visual Exploration of Sequences in Event-Based Data Using Graph Similarity
2009 (English) In: IEEE Transactions on Visualization and Computer Graphics, ISSN 1077-2626, E-ISSN 1941-0506, Vol. 15, no 6, 945-952 p. Article in journal (Refereed) Published
Abstract [en]

The identification of significant sequences in large and complex event-based temporal data is a challenging problem with applications in many areas of today's information-intensive society. Pure visual representations can be used for the analysis, but are constrained to small data sets. Algorithmic search mechanisms used for larger data sets become expensive as the data size increases and typically focus on frequency of occurrence to reduce the computational complexity, often overlooking important infrequent sequences and outliers. In this paper we introduce an interactive visual data mining approach based on an adaptation of techniques developed for web searching, combined with an intuitive visual interface, to facilitate user-centred exploration of the data and identification of sequences significant to that user. The search algorithm used in the exploration executes in negligible time, even for large data, and so no pre-processing of the selected data is required, making this a completely interactive experience for the user. Our particular application area is social science diary data but the technique is applicable across many other disciplines.

Keyword
Interactive visual exploration, event-based data, sequence identification, graph similarity, node similarity
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-51476 (URN), 10.1109/TVCG.2009.117 (DOI)
Available from: 2009-11-04 Created: 2009-11-04 Last updated: 2017-12-12
7. Behaviour-driven clustering based on event-sequence similarity metrics
2010 (English) Manuscript (preprint) (Other academic)
Abstract [en]

When analysing event data two key objectives are to first identify interesting subsequences in the data records and then to retrieve groups of records that exhibit similar behaviour. This is especially true when the focus of the exploration is the human, for example when using activity diaries to reveal sub-populations with similar behaviour, medical records to identify groups with similar medical conditions, or web sessions to find groups with similar web-surfing habits. In this paper we propose a visual exploration approach, based on sequence similarity metrics and clustering techniques, that will allow an analyst to interactively explore the distribution of sequences along event data records as well as group the results according to user-selected similarity preferences. We have identified a set of similarity metrics that are specific to event-sequences which we use as input into a clustering algorithm. The user can choose which metrics to use and assign weighting factors to them, which results in groupings that exhibit similar behaviour according to their definition of similarity and interestingness. The resulting clusters can be interactively explored in a multiple linked-view environment showing the clusters, the cluster quality, the similarity metrics and meta (background) information describing the clustered individuals in order to make comparisons within and between groups. Using such an interactive approach that considers user preferences and takes advantage of background knowledge gives a basis for enhanced analytical reasoning by providing a more complete understanding of the retrieved groupings and can lead to a more thorough analysis and accurate assessments.
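
As a rough illustration of how user-chosen weights on similarity metrics can change the resulting groups, the sketch below combines per-record metric values into a weighted Euclidean distance and groups records by simple single-linkage thresholding. The abstract does not name the metrics or the clustering algorithm, so the metric names (occurrences, mean_start, mean_duration), the functions and the sample values are all assumptions made for this example.

```python
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two records' metric values."""
    return sum(w * (a[m] - b[m]) ** 2 for m, w in weights.items()) ** 0.5


def cluster(records, weights, threshold):
    """Single-linkage grouping: records closer than `threshold` are merged."""
    parent = {name: name for name in records}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = sorted(records)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if weighted_distance(records[a], records[b], weights) < threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical metric values describing how one sequence appears in each record.
records = {
    "p1": {"occurrences": 3, "mean_start": 8.0, "mean_duration": 0.5},
    "p2": {"occurrences": 3, "mean_start": 8.5, "mean_duration": 0.5},
    "p3": {"occurrences": 1, "mean_start": 18.0, "mean_duration": 1.0},
    "p4": {"occurrences": 1, "mean_start": 17.5, "mean_duration": 1.5},
}

# Weighting occurrence count and start time, ignoring duration.
weights = {"occurrences": 1.0, "mean_start": 1.0, "mean_duration": 0.0}
clusters = cluster(records, weights, threshold=1.0)
print(clusters)  # [['p1', 'p2'], ['p3', 'p4']]
```

Setting a weight to zero removes that metric from the notion of similarity, which is one simple way to let the user's definition of interestingness drive the grouping.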

Keyword
Event-based data, activity diary data, event sequences, similarity metrics, clustering, interactive exploration
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-58310 (URN)
Available from: 2010-08-10 Created: 2010-08-10 Last updated: 2015-09-22

Open Access in DiVA

Everyday mining: Exploring sequences in event-based data (854 kB), 5047 downloads
File information
File name: FULLTEXT01.pdf
File size: 854 kB
Checksum (SHA-512): 99d9dfb5ae1a010975d96c514cecd2f1cb189e524fd622b689f14954073b89f3159a7c7f12b048e2330b877a47611bc4820a7c50a6eb61c771d6f38a0b815c5e
Type: fulltext
Mimetype: application/pdf
Cover (572 kB), 34 downloads
File information
File name: COVER01.pdf
File size: 572 kB
Checksum (SHA-512): 85069355884f97fd1ea23a462afb930e71b32b211e26886648bbc95a780c5cc9a9ad715f62e201666882b0908aa173f73b7c7c343865a3baf3d0c879f686e294
Type: cover
Mimetype: application/pdf

Authority records BETA

Vrotsou, Katerina

Total: 5047 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
Total: 3954 hits