liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Semi-automatic Annotation of Objects in Visual-Thermal Video
Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Systemteknik AB, Linköping, Sweden.ORCID iD: 0000-0002-6591-9400
Linköping University, Department of Electrical Engineering, Computer Vision. Zenuity AB, Göteborg, Sweden.
Linköping University, Department of Electrical Engineering, Computer Vision. Grenoble INP, France.
Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Systemteknik AB, Linköping, Sweden.ORCID iD: 0000-0002-6763-5487
Show others and affiliations
2019 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities human annotationis hard to justify. In such cases, semi-automatic annotation provides an acceptable option.

In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizesa state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.

The method is evaluated on a subset of the RGBT-234 visual-thermal dataset reducing the workload for a human annotator with approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.

Place, publisher, year, edition, pages
2019.
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-161076OAI: oai:DiVA.org:liu-161076DiVA, id: diva2:1362582
Conference
IEEE International Conference on Computer Vision Workshop (ICCVW)
Funder
Swedish Research Council, 2013-5703Swedish Foundation for Strategic Research Wallenberg AI, Autonomous Systems and Software Program (WASP)Vinnova, VS1810-QAvailable from: 2019-10-21 Created: 2019-10-21 Last updated: 2019-10-23
In thesis
1. Learning to Analyze what is Beyond the Visible Spectrum
Open this publication in new window or tab >>Learning to Analyze what is Beyond the Visible Spectrum
2019 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations of processing required due to the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All these are research areas that have been and currently are subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications.

The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution also related to benchmarking is a novel, recursive, method for semi-automatic annotation of multi-modal video sequences. Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames and difficult parts in need for additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations.

The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation to latent space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution is the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heat leakages for the purpose of false alarm reduction.

Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function the method becomes robust to image pair misalignments. The method exploits the lower acuity for color differences than for luminance possessed by the human visual system, separating the loss into a luminance and a chrominance part.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2019. p. 94
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2024
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-161077 (URN)10.3384/diss.diva-161077 (DOI)9789179299811 (ISBN)
Public defence
2019-12-18, Ada Lovelace, B-huset, Campus Valla, Linköping, 13:15 (English)
Opponent
Supervisors
Funder
Swedish Research Council, D0570301
Available from: 2019-11-13 Created: 2019-10-23 Last updated: 2019-12-12Bibliographically approved

Open Access in DiVA

No full text in DiVA

Search in DiVA

By author/editor
Berg, AmandaJohnander, JoakimAhlberg, JörgenFelsberg, Michael
By organisation
Computer Vision
Computer Vision and Robotics (Autonomous Systems)

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 49 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf