Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
Danelljan, Martin (Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering)
Häger, Gustav (Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering)
Khan, Fahad (Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering)
Felsberg, Michael (Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering) ORCID iD: 0000-0002-6096-3648
2016 (English). In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 1430-1438. Conference paper (Refereed)
Abstract [en]

Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
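As a rough illustration of the unified formulation described above (the exact loss and solver in the paper differ; joint_train, mu and lam below are hypothetical names), a minimal Python sketch of the idea is to alternate between re-fitting a linear appearance model with weighted least squares and re-estimating per-sample quality weights from the residuals, so that likely corrupted samples are down-weighted while correct ones gain influence:

    import numpy as np

    def joint_train(samples, labels, mu=2.0, lam=0.01, num_iters=5):
        """Alternating-minimization sketch: jointly estimate a linear
        appearance model w and per-sample quality weights alpha.
        samples: list of (n_k, d) feature matrices, one per stored frame
        labels:  list of (n_k,) desired detection scores per frame
        """
        t = len(samples)
        d = samples[0].shape[1]
        alpha = np.full(t, 1.0 / t)   # quality weights, kept summing to one
        w = np.zeros(d)               # appearance model parameters

        for _ in range(num_iters):
            # (1) Fix alpha: weighted, regularized least squares for w.
            A = lam * np.eye(d)
            b = np.zeros(d)
            for k in range(t):
                A += alpha[k] * samples[k].T @ samples[k]
                b += alpha[k] * samples[k].T @ labels[k]
            w = np.linalg.solve(A, b)

            # (2) Fix w: re-estimate the weights from per-sample residuals.
            # High-residual (likely corrupted) samples get smaller weight;
            # normalization keeps the weights on the simplex.
            res = np.array([np.mean((samples[k] @ w - labels[k]) ** 2)
                            for k in range(t)])
            scores = np.exp(-mu * (res - res.min()))
            alpha = scores / scores.sum()

        return w, alpha

In a tracking-by-detection setting, each stored frame would contribute one feature/label pair, and the learned weights would replace hand-crafted heuristics for pruning the training set.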

Place, publisher, year, edition, pages
IEEE, 2016, pp. 1430-1438.
Series
IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-137882
DOI: 10.1109/CVPR.2016.159
ISI: 000400012301051
ISBN: 978-1-4673-8851-1
OAI: oai:DiVA.org:liu-137882
DiVA: diva2:1104732
Conference
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Note

Funding Agencies: SSF (CUAS); VR (EMC2); VR (ELLIIT); Wallenberg Autonomous Systems Program; NSC; Nvidia

Available from: 2017-06-01 Created: 2017-06-01 Last updated: 2017-06-01

Open Access in DiVA

No full text

Other links

Publisher's full text

Search in DiVA

By author/editor
Danelljan, Martin; Häger, Gustav; Khan, Fahad; Felsberg, Michael
By organisation
Computer Vision; Faculty of Science & Engineering
Computer Vision and Robotics (Autonomous Systems)
