Shape Grammar Extraction for Efficient Query-by-Sketch Pattern Matching in Long Time Series
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. (Information Visualization)
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-4761-8601
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
2016 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Long time-series, involving thousands or even millions of time steps, are common in many application domains but remain very difficult to explore interactively. Often the analytical task in such data is to identify specific patterns, but this is a very complex and computationally difficult problem and so focusing the search in order to only identify interesting patterns is a common solution. We propose an efficient method for exploring user-sketched patterns, incorporating the domain expert’s knowledge, in time series data through a shape grammar based approach. The shape grammar is extracted from the time series by considering the data as a combination of basic elementary shapes positioned across different amplitudes. We represent these basic shapes using a ratio value, perform binning on ratio values and apply a symbolic approximation. Our proposed method for pattern matching is amplitude-, scale- and translation-invariant and, since the pattern search and pattern constraint relaxation happen at the symbolic level, is very efficient permitting its use in a real-time/online system. We demonstrate the effectiveness of our method in a case study on stock market data although it is applicable to any numeric time series data.
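The pipeline outlined in the abstract (ratio value, binning, symbolic approximation, symbolic search) can be illustrated with a minimal sketch. The abstract does not define the ratio, the bins, or the alphabet, so everything specific here is an assumption made for illustration: the ratio is taken as each step's change divided by the range of the series, and `BIN_EDGES`, `SYMBOLS`, `symbolize` and `find_pattern` are hypothetical names, not the authors' implementation.

```python
import re
from bisect import bisect_right

# Hypothetical bin edges on the ratio value, and one symbol per bin.
BIN_EDGES = [-0.15, -0.02, 0.02, 0.15]
SYMBOLS = "DdfuU"  # steep fall, fall, flat, rise, steep rise


def symbolize(series):
    """Map each step of a numeric series to a symbol via a binned ratio."""
    span = max(series) - min(series)
    if span == 0:
        # A constant series has only flat steps.
        return SYMBOLS[2] * (len(series) - 1)
    out = []
    for prev, cur in zip(series, series[1:]):
        ratio = (cur - prev) / span  # assumed ratio: step relative to range
        out.append(SYMBOLS[bisect_right(BIN_EDGES, ratio)])
    return "".join(out)


def find_pattern(series, regex):
    """Return (start, end) positions of regex matches in the symbol string."""
    symbols = symbolize(series)
    return [(m.start(), m.end()) for m in re.finditer(regex, symbols)]


series = [10, 10, 12, 15, 15, 13, 10, 10]
print(symbolize(series))                       # fUUfDDf
print(find_pattern(series, r"[uU]+f*[dD]+"))   # [(1, 6)]
```

In this toy version the symbol string is unchanged when the series is rescaled or offset in amplitude, because the ratio divides by the range. Relaxing a query amounts to editing the regular expression, for example widening a character class from `[U]` to `[uU]`, without touching the raw data.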

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016. p. 121-130 (10 pages)
Series
IEEE Conference on Visual Analytics Science and Technology, ISSN 2325-9442
Keywords [en]
User-queries, Sketching, Time Series, Symbolic approximation, Regular Expression, Shape Grammar
National Category
Engineering and Technology; Computer Sciences; Computer Systems; Computer Vision and Robotics (Autonomous Systems); Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-134334
DOI: 10.1109/VAST.2016.7883518
ISI: 000402056500013
ISBN: 978-1-5090-5661-3 (print)
OAI: oai:DiVA.org:liu-134334
DiVA, id: diva2:1071346
Conference
2016 IEEE Conference on Visual Analytics Science and Technology (VAST), October 23-28, Baltimore, USA
Funder
Swedish Research Council, 2013-4939
Available from: 2017-02-03 Created: 2017-02-03 Last updated: 2019-11-25 Bibliographically approved
In thesis
1. Data Abstraction and Pattern Identification in Time-series Data
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Data sources such as simulations and sensor networks, across many application domains, generate large volumes of time-series data whose characteristics evolve over time. Visual data analysis methods can help in exploring and understanding the underlying patterns in time-series data but, as the data grow ever larger, the visual analysis process can become complex. Large data sets can be handled using data abstraction techniques that transform the raw data into a simpler format while preserving the features that are important to the user. When dealing with time-series data, abstraction techniques should also take the underlying temporal characteristics into account.

This thesis focuses on data abstraction and pattern identification methods, particularly for large 1D time-series and for 2D spatio-temporal time-series data that exhibit spatio-temporal discontinuity. Based on the dimensionality and characteristics of the data, it proposes a variety of efficient data-adaptive and user-controlled data abstraction methods that transform the raw data into a symbol sequence. The resulting symbol sequence can act as input to sequence analysis methods from the data mining and machine learning communities to identify interesting patterns of user behavior.

For very long 1D time-series, locally adaptive and user-controlled data approximation methods were presented that simplify the data while retaining the perceptually important features. The simplified data were converted into a symbol sequence, and sketch-based pattern identification was then used to find patterns in the symbolic data through regular-expression-based pattern matching. The method was applied to financial time-series, and patterns such as head-and-shoulders, double-top and triple-top were identified interactively from hand-drawn sketches. Through data smoothing, the data approximation step also enables visualization of inherent patterns in the time-series representation while retaining perceptually important points.
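As a rough illustration of sketch-based matching with regular expressions, the following sketch turns a hand-drawn query into a regular expression and searches a symbolized series with it. It is a simplified stand-in rather than the method of the thesis: steps are encoded only by direction, peak heights are not compared (so a real double-top check would need more), and `directions`, `sketch_to_regex` and the sample values are hypothetical.

```python
import re
from itertools import groupby


def directions(values):
    """Encode each step as 'u' (up), 'd' (down) or 'f' (flat)."""
    return "".join(
        "u" if b > a else "d" if b < a else "f"
        for a, b in zip(values, values[1:])
    )


def sketch_to_regex(sketch):
    """Collapse runs of equal symbols into 'symbol+' so run length is free."""
    return "".join(sym + "+" for sym, _ in groupby(directions(sketch)))


# Hypothetical double-top sketch: rise, fall, rise, fall.
sketch = [0, 2, 1, 2, 0]
pattern = sketch_to_regex(sketch)

# Hypothetical price series containing one such shape.
data = [5, 5, 6, 8, 9, 8, 7, 8, 9, 7, 5, 5]
match = re.search(pattern, directions(data))

print(pattern)        # u+d+u+d+
print(match.span())   # (1, 10)
```

Using `+` after each symbol lets every leg of the sketch stretch over any number of time steps, which is one simple way to make the query independent of time scale.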

Very long 2D spatio-temporal eye-tracking data sets that exhibit spatio-temporal discontinuity were transformed into symbolic data using scalable clustering and hierarchical cluster merging processes, each of which can be parallelized. The raw data are transformed into a symbol sequence in which each symbol represents a region of interest in the eye gaze data. The identified regions of interest can also be displayed in a Space-Time Cube (STC) that captures both the temporal and contextual information. Through interactive filtering, zooming and geometric transformation, the STC representation along with linked views enables interactive data exploration. Using different sequence analysis methods, the symbol sequences are analyzed further to identify temporal patterns in the data set. Data collected from air traffic control officers were used as application examples to demonstrate the results.
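The idea of turning gaze samples into a sequence of region-of-interest symbols can be shown with a small sketch. Note that it swaps in a fixed grid for the scalable clustering and hierarchical cluster merging described above, purely to keep the example short; `gaze_to_symbols`, the `cell` size and the sample points are all hypothetical.

```python
from itertools import groupby
from string import ascii_uppercase


def gaze_to_symbols(points, cell=100.0):
    """Label each (x, y) gaze sample by its grid cell, then merge repeats."""
    labels = {}
    raw = []
    for x, y in points:
        key = (int(x // cell), int(y // cell))
        if key not in labels:
            # One letter per newly visited cell (this toy supports 26 cells).
            labels[key] = ascii_uppercase[len(labels)]
        raw.append(labels[key])
    # Consecutive samples in the same region collapse into one symbol.
    return "".join(sym for sym, _ in groupby(raw))


points = [(10, 20), (30, 40), (250, 60), (260, 80), (20, 10), (150, 320)]
print(gaze_to_symbols(points))  # ABAC
```

The resulting string can then be passed to ordinary sequence analysis tools, for instance a regular expression such as `AB.*A` to find returns to the first region.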

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2019. p. 58
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2030
National Category
Media Engineering
Identifiers
URN: urn:nbn:se:liu:diva-162220
DOI: 10.3384/diss.diva-162220
ISBN: 9789179299651
Public defence
2019-12-13, Domteatern, Visualiseringscenter C, Kungsgatan 54, 60233 Norrköping, 09:15 (English)
Available from: 2019-11-25 Created: 2019-11-25 Last updated: 2019-11-25 Bibliographically approved

Open Access in DiVA

Shape Grammar Extraction for Efficient Query-by-Sketch Pattern Matching in Long Time Series (1700 kB), 176 downloads
File information
File name: FULLTEXT02.pdf
File size: 1700 kB
Checksum (SHA-512): 68107c8e4ed8e101b0abfc19eb73c658931ec5de4ed3dd3cbe6c0ce8a4ccff3188187a1d547b661e8a1122c53ea9fa48b00041948aa0eaa6143b4943a1f6e6f8
Type: fulltext
Mimetype: application/pdf

By author/editor
Muthumanickam, Prithiviraj; Vrotsou, Katerina; Cooper, Matthew; Johansson, Jimmy