liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Coloring Channel Representations for Visual Tracking
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-6096-3648
2015 (English)In: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen, Kim S. Pedersen, Springer, 2015, Vol. 9127, p. 117-129Conference paper, Published paper (Refereed)
Abstract [en]

Visual object tracking is a classical, but still open research problem in computer vision, with many real world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically-inspired channel representations \cite{felsberg06PAMI} have shown to provide promising results in many applications ranging from autonomous driving to visual tracking.

This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking.

Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color feature together with an optimal fusion strategy, significantly outperforms the standard luminance based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.

Place, publisher, year, edition, pages
Springer, 2015. Vol. 9127, p. 117-129
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9127
Keywords [en]
Visual tracking, channel coding, color names
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-121003DOI: 10.1007/978-3-319-19665-7_10ISBN: 978-3-319-19664-0 (print)ISBN: 978-3-319-19665-7 (print)OAI: oai:DiVA.org:liu-121003DiVA, id: diva2:850742
Conference
Scandinavian Conference on Image Analysis
Available from: 2015-09-02 Created: 2015-09-02 Last updated: 2018-04-25Bibliographically approved
In thesis
1. Learning Convolution Operators for Visual Tracking
Open this publication in new window or tab >>Learning Convolution Operators for Visual Tracking
2018 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Visual tracking is one of the fundamental problems in computer vision. Its numerous applications include robotics, autonomous driving, augmented reality and 3D reconstruction. In essence, visual tracking can be described as the problem of estimating the trajectory of a target in a sequence of images. The target can be any image region or object of interest. While humans excel at this task, requiring little effort to perform accurate and robust visual tracking, it has proven difficult to automate. It has therefore remained one of the most active research topics in computer vision.

In its most general form, no prior knowledge about the object of interest or environment is given, except for the initial target location. This general form of tracking is known as generic visual tracking. The unconstrained nature of this problem makes it particularly difficult, yet applicable to a wider range of scenarios. As no prior knowledge is given, the tracker must learn an appearance model of the target on-the-fly. Cast as a machine learning problem, it imposes several major challenges which are addressed in this thesis.

The main purpose of this thesis is the study and advancement of the, so called, Discriminative Correlation Filter (DCF) framework, as it has shown to be particularly suitable for the tracking application. By utilizing properties of the Fourier transform, a correlation filter is discriminatively learned by efficiently minimizing a least-squares objective. The resulting filter is then applied to a new image in order to estimate the target location.

This thesis contributes to the advancement of the DCF methodology in several aspects. The main contribution regards the learning of the appearance model: First, the problem of updating the appearance model with new training samples is covered. Efficient update rules and numerical solvers are investigated for this task. Second, the periodic assumption induced by the circular convolution in DCF is countered by proposing a spatial regularization component. Third, an adaptive model of the training set is proposed to alleviate the impact of corrupted or mislabeled training samples. Fourth, a continuous-space formulation of the DCF is introduced, enabling the fusion of multiresolution features and sub-pixel accurate predictions. Finally, the problems of computational complexity and overfitting are addressed by investigating dimensionality reduction techniques.

As a second contribution, different feature representations for tracking are investigated. A particular focus is put on the analysis of color features, which had been largely overlooked in prior tracking research. This thesis also studies the use of deep features in DCF-based tracking. While many vision problems have greatly benefited from the advent of deep learning, it has proven difficult to harvest the power of such representations for tracking. In this thesis it is shown that both shallow and deep layers contribute positively. Furthermore, the problem of fusing their complementary properties is investigated.

The final major contribution of this thesis regards the prediction of the target scale. In many applications, it is essential to track the scale, or size, of the target since it is strongly related to the relative distance. A thorough analysis of how to integrate scale estimation into the DCF framework is performed. A one-dimensional scale filter is proposed, enabling efficient and accurate scale estimation.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2018. p. 71
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1926
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-147543 (URN)10.3384/diss.diva-147543 (DOI)9789176853320 (ISBN)
Public defence
2018-06-11, Ada Lovelace, B-huset, Campus Valla, Linköping, 13:00 (English)
Opponent
Supervisors
Available from: 2018-05-03 Created: 2018-04-25 Last updated: 2018-09-19Bibliographically approved

Open Access in DiVA

fulltext(1009 kB)137 downloads
File information
File name FULLTEXT02.pdfFile size 1009 kBChecksum SHA-512
4eeb1cdc6c6529da78c7969b8810ad674bb4c80812fed04edc50e9604e328ab8ba594ec6a5c14b38160516ee17102be81be1c50ba4c1df650c0dc7f014050423
Type fulltextMimetype application/pdf

Other links

Publisher's full text

Authority records BETA

Häger, GustavKhan, Fahad ShahbazFelsberg, Michael

Search in DiVA

By author/editor
Häger, GustavKhan, Fahad ShahbazFelsberg, Michael
By organisation
Computer VisionFaculty of Science & Engineering
Computer Vision and Robotics (Autonomous Systems)

Search outside of DiVA

GoogleGoogle Scholar
Total: 137 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 1094 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf