Publications (10 of 14)
Häger, G. (2021). Learning visual perception for autonomous systems. (Doctoral dissertation). Linköping: Linköping University Electronic Press
Learning visual perception for autonomous systems
2021 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In the last decade, developments in hardware, sensors and software have made it possible to create increasingly autonomous systems. These systems can be as simple as driver-assistance software for lane following in cars, or limited collision-warning systems for otherwise manually piloted drones. At the other end of the spectrum are fully autonomous cars, boats and helicopters. As the ability to function autonomously increases, so do the demands to operate with minimal human supervision in unstructured environments.

Common to most, if not all, autonomous systems is that they require an accurate model of the surrounding world. While a large number of sensors useful for creating such models is currently available, cameras are among the most versatile. From a sensing perspective, cameras have several advantages over other sensors: they require no external infrastructure, are relatively cheap, and can be used to extract information such as the relative positions of other objects and their movements over time, to create accurate maps, and to locate the autonomous system within these maps.

Using cameras to produce a model of the surroundings requires solving a number of technical problems. Often these problems have a basis in recognizing that an object or region of interest is the same across time or from novel viewpoints. In visual tracking, this type of recognition is required to follow an object of interest through a sequence of images. In geometric problems, it is often necessary to recognize corresponding image regions in order to perform 3D reconstruction or localization.

The first set of contributions in this thesis is related to the improvement of a class of online-learned visual object trackers based on discriminative correlation filters. In visual tracking, estimation of the object's size is important for reliable tracking; the first contribution in this part of the thesis investigates this problem. The performance of discriminative correlation filters is highly dependent on the feature representation used by the filter. The second tracking contribution investigates the performance impact of different features derived from a deep neural network.

A second set of contributions relates to the evaluation of visual object trackers. The first of these is the Visual Object Tracking challenge, a yearly comparison of state-of-the-art visual tracking algorithms. A second contribution is an investigation into possible issues when using bounding-box representations for ground-truth data.

In real-world settings, tracking typically occurs over longer time sequences than is common in benchmarking datasets. In such settings, the model updates of many tracking algorithms often cause the tracker to fail silently. For this reason it is important to have an estimate of the tracker's performance even in cases where no ground-truth annotations exist. The first of the final three contributions investigates this problem in a robotics setting, by fusing information from a pre-trained object detector in a state-estimation framework. An additional contribution describes how to dynamically re-weight the data used for the appearance model of a tracker. A final contribution investigates how to estimate the certainty of detections in a setting where geometric limitations can be imposed on the search region. The proposed solution learns to accurately predict stereo disparities along with accurate assessments of each prediction's certainty.

Abstract [sv]

The increasingly rapid development of computing hardware, sensors and software techniques in recent years has made it possible to create ever more autonomous systems. These can vary in degree of autonomy from an anti-skid system for an otherwise manually controlled car, to collision-avoidance systems for a manually controlled drone, to a fully autonomous car or other vehicle. With an increasing ability to operate independently without human supervision, the range of situations the systems are expected to handle also grows.

Common to many, if not all, autonomous systems is that they need a correct and up-to-date picture of their surroundings in order to act intelligently. A wide range of sensors that make this possible is available, of which cameras are among the most versatile. Compared to other types of sensors, cameras have a number of advantages: they are relatively cheap, passive, and can be used without requiring external infrastructure. The visual data that cameras generate can be used to follow external objects, determine the position of the camera itself, or compute distances.

Successfully exploiting the possibilities in this information, however, requires solving a wide range of technical problems. Many of these problems are rooted in recognizing that two image regions from different points in time or viewing angles depict the same thing.

A typical example of such a problem is the visual tracking problem, where the goal is to determine an object's position and size in every image of a sequence. In general, the object's appearance is not known to the algorithm in advance; instead, an appearance model must be built up successively using machine learning.

Similar problems occur in many other areas of computer vision, especially in geometry. Many geometric problems, for example, require finding corresponding points in several images.

The first collection of contributions in this thesis addresses the visual tracking problem. The proposed methods are based on an adaptive appearance model called discriminative correlation filters. In the first contribution to such methods, the framework is extended to estimate an object's size as well as its position. A second contribution investigates how correlation-filter-based methods can be extended to also exploit visual features produced by machine learning.

A second collection of contributions concerns the evaluation of visual tracking methods, partly within the yearly Visual Object Tracking challenge. A further contribution to evaluation methodology in visual tracking aims to avoid pitfalls that easily arise when methods are tuned too well to the measures used to evaluate them.

A third collection of contributions relates to ways of handling situations where the learning process in the tracking methods described above introduces erroneous data into the model. A first contribution does this in a robotic system for following people in an unstructured environment. A second contribution is based on dynamic reweighting of previously collected data, down-weighting data points that do not represent the tracked object well. A final contribution investigates how a prediction's uncertainty can be estimated along with the prediction itself.
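The discriminative correlation filters that underlie the trackers in the first set of contributions can be illustrated with a minimal single-channel sketch in the style of a MOSSE filter. This is a simplification for illustration only: the thesis methods additionally use multi-channel features, scale estimation and online model updates.

```python
import numpy as np

def train_filter(patches, sigma=2.0, lam=1e-3):
    """Learn a single-channel correlation filter in the Fourier domain.

    patches: list of equally sized grayscale patches centred on the target.
    Returns the filter in the Fourier domain, ready for detection.
    """
    h, w = patches[0].shape
    # Desired response: a Gaussian peaked at the target location.
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(np.fft.ifftshift(g))   # peak moved to the origin
    A = np.zeros((h, w), dtype=complex)
    B = np.zeros((h, w), dtype=complex)
    for p in patches:
        F = np.fft.fft2(p)
        A += G * np.conj(F)                # correlation numerator
        B += F * np.conj(F)                # energy denominator
    return A / (B + lam)                   # regularised closed-form filter

def detect(H, patch):
    """Apply the filter; the response peak gives the target translation."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(response), response.shape)
```

Because both training and detection are element-wise operations in the Fourier domain, the cost is dominated by the FFTs, which is what makes this family of trackers fast enough for real-time use.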

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2021. p. 49
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2138
Keywords
computer vision, visual object tracking, tracking, machine learning, deep learning
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-175177 (URN)10.3384/diss.diva-175177 (DOI)9789179296711 (ISBN)
Public defence
2021-06-04, Ada Lovelace, B-Building, Campus Valla, Linköping, 09:15 (English)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2021-05-04 Created: 2021-04-20 Last updated: 2025-02-07. Bibliographically approved
Persson, M., Häger, G., Ovrén, H. & Forssén, P.-E. (2021). Practical Pose Trajectory Splines With Explicit Regularization. In: 2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021): . Paper presented at 9th International Conference on 3D Vision (3DV), ELECTR NETWORK, dec 01-03, 2021 (pp. 156-165). Institute of Electrical and Electronics Engineers (IEEE)
Practical Pose Trajectory Splines With Explicit Regularization
2021 (English)In: 2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 156-165Conference paper, Published paper (Refereed)
Abstract [en]

We investigate spline-based continuous-time pose trajectory estimation using non-linear explicit motion priors. Current regularization priors either linearize the orientation, rely on the implicit regularization obtained from the used spline basis function, or use sampling-based regularization schemes. The latter is a special case of a Riemann sum approximation, and we demonstrate when and why this can fail, and propose a way to avoid these issues. In addition, we provide a number of novel, practically useful theoretical contributions, including requirements on knot spacing for orientation splines, new basis functions for constant velocity extrapolation, and a generalization of the popular P-Spline penalty to orientation. We analyze the properties of the proposed approach using synthetic data. We validate our system using the standard task of visual-inertial calibration, and apply it to stereo visual odometry where we demonstrate real-time performance on KITTI.
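The failure mode of sampled regularization can be illustrated with a generic motion prior (a simplified stand-in, not the paper's exact functional): the continuous energy and its Riemann-sum approximation are

```latex
E_{\mathrm{reg}} = \int_{t_0}^{t_1} \lVert \ddot{p}(t) \rVert^2 \, \mathrm{d}t
\;\approx\; \sum_{i=0}^{N-1} \Delta t \, \lVert \ddot{p}(t_i) \rVert^2,
\qquad \Delta t = \frac{t_1 - t_0}{N}.
```

When the sample spacing Δt is large relative to the spline's knot spacing, the sum can miss oscillations of p(t) between sample points, so the approximation under-estimates the true energy; this aliasing is one way a sampling-based scheme can fail.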

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
International Conference on 3D Vision, ISSN 2378-3826, E-ISSN 2475-7888
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-182729 (URN)10.1109/3DV53792.2021.00026 (DOI)000786496000016 ()2-s2.0-85125011120 (Scopus ID)9781665426886 (ISBN)9781665426893 (ISBN)
Conference
9th International Conference on 3D Vision (3DV), ELECTR NETWORK, dec 01-03, 2021
Funder
Vinnova
Note

Funding: Vinnova through the Visual Sweden network [Dnr 2019-02261]

Available from: 2022-02-07 Created: 2022-02-07 Last updated: 2026-03-16. Bibliographically approved
Häger, G., Persson, M. & Felsberg, M. (2021). Predicting Disparity Distributions. In: 2021 IEEE International Conference on Robotics and Automation (ICRA): . Paper presented at IEEE Conference on robotics and automation 2021,Xi'an, China, 30 May-5 June 2021. IEEE
Predicting Disparity Distributions
2021 (English)In: 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021Conference paper, Published paper (Refereed)
Abstract [en]

We investigate a novel deep-learning-based approach to estimating uncertainty in stereo disparity prediction networks. Current state-of-the-art methods often formulate disparity prediction as a regression problem with a single scalar output in each pixel. This can be problematic in practical applications, as in many cases there may not exist a single well-defined disparity, for example at occlusions or depth boundaries. While current neural-network-based disparity estimation approaches obtain good performance on benchmarks, the disparity prediction is treated as a black box at inference time. In this paper we show that by formulating the learning problem as regression with a distribution target, we obtain a robust estimate of the uncertainty in each pixel, while maintaining the performance of the original method. The proposed method is evaluated both on a large-scale standard benchmark and on our own data. We also show that the uncertainty estimate improves significantly when the uncertainty is maximized during learning in those pixels that have no well-defined disparity.
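The distribution-target idea can be sketched in a few lines of numpy. This is an illustrative reconstruction, not the paper's network or training loss; the integer disparity bins and the use of entropy as the uncertainty measure are assumptions made for the example.

```python
import numpy as np

def disparity_stats(logits):
    """Per-pixel disparity distribution from raw network outputs.

    logits: (H, W, D) scores over D integer disparity bins.
    Returns the expected disparity and an entropy map that is high
    wherever the distribution is spread out (e.g. at occlusions).
    """
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    p = e / e.sum(axis=-1, keepdims=True)
    bins = np.arange(logits.shape[-1])
    disparity = (p * bins).sum(axis=-1)              # expected value
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)  # per-pixel uncertainty
    return disparity, entropy
```

Training against a distribution also gives a natural way to express the trick mentioned in the abstract: pixels with no well-defined disparity can be supervised towards a maximally uncertain (flat) target distribution.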

Place, publisher, year, edition, pages
IEEE, 2021
Series
IEEE International Conference on Robotics and Automation (ICRA), ISSN 1050-4729, E-ISSN 2577-087X
Keywords
Uncertainty, Automation, Conferences, Estimation, Benchmark testing, Standards
National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-179770 (URN)10.1109/ICRA48506.2021.9561617 (DOI)000765738803062 ()2-s2.0-85125504242 (Scopus ID)978-1-7281-9077-8 (ISBN)978-1-7281-9078-5 (ISBN)
Conference
IEEE Conference on robotics and automation 2021,Xi'an, China, 30 May-5 June 2021
Available from: 2021-10-01 Created: 2021-10-01 Last updated: 2025-02-05
Häger, G., Felsberg, M. & Khan, F. S. (2018). Countering bias in tracking evaluations. In: Francisco Imai, Alain Tremeau and Jose Braz (Ed.), Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: . Paper presented at 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, January 27-29, Funchal, Madeira (pp. 581-587). Science and Technology Publications, Lda, 5
Countering bias in tracking evaluations
2018 (English)In: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications / [ed] Francisco Imai, Alain Tremeau and Jose Braz, Science and Technology Publications, Lda , 2018, Vol. 5, p. 581-587Conference paper, Published paper (Refereed)
Abstract [en]

Recent years have witnessed a significant leap in visual object tracking performance, mainly due to powerful features, sophisticated learning methods and the introduction of benchmark datasets. Despite this significant improvement, the evaluation of state-of-the-art object trackers still relies on the classical intersection over union (IoU) score. In this work, we argue that object tracking evaluations based on the classical IoU score are sub-optimal. As our first contribution, we theoretically prove that the IoU score is biased in the case of large target objects and favors over-estimated target prediction sizes. As our second contribution, we propose a new score that is unbiased with respect to target prediction size. We systematically evaluate our proposed approach on benchmark tracking data with variations in relative target size. Our empirical results clearly suggest that the proposed score is unbiased in general.
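The size bias is easy to reproduce numerically. In this toy example (hypothetical boxes, not the paper's data or its proposed unbiased score), two predictions make the same 10-pixel centre error, but the over-sized one receives the higher IoU:

```python
def iou(a, b):
    """Intersection over union of axis-aligned boxes (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def box(cx, cy, side):
    """Square box of a given side length centred at (cx, cy)."""
    return (cx - side / 2, cy - side / 2, cx + side / 2, cy + side / 2)

gt = box(50, 50, 100)
iou_over = iou(gt, box(60, 50, 120))   # 20% too large -> ~0.69
iou_under = iou(gt, box(60, 50, 80))   # 20% too small -> 0.64
```

The over-sized box can swallow the whole ground truth, so its intersection saturates at the ground-truth area, while the under-sized box is penalised on the intersection directly; averaged over a dataset, this systematically rewards over-estimated sizes.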

Place, publisher, year, edition, pages
Science and Technology Publications, Lda, 2018
National Category
Signal Processing
Identifiers
urn:nbn:se:liu:diva-151306 (URN)10.5220/0006714805810587 (DOI)000576679800066 ()9789897582905 (ISBN)
Conference
13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, January 27-29, Funchal, Madeira
Available from: 2018-09-17 Created: 2018-09-17 Last updated: 2021-07-15. Bibliographically approved
Danelljan, M., Häger, G., Khan, F. S. & Felsberg, M. (2017). Discriminative Scale Space Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(8), 1561-1575
Discriminative Scale Space Tracking
2017 (English)In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 39, no 8, p. 1561-1575Article in journal (Refereed) Published
Abstract [en]

Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale-adaptive tracking approach that learns separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate than the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.
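The scale-sampling step can be sketched as follows. The geometric scale grid matches the idea described above, while the patch sizes, number of scales and nearest-neighbour resampling are illustrative choices, not the paper's exact parameters:

```python
import numpy as np

def scale_samples(image, center, base_hw, n_scales=17, step=1.04, out=(32, 32)):
    """Crop the target at a geometric set of scales around its current size
    and resample each crop to a common size. The resulting stack is the
    training data for a separate 1-D scale filter."""
    cy, cx = center
    samples = []
    for k in range(n_scales):
        s = step ** (k - n_scales // 2)        # ..., 1/step, 1, step, ...
        h = max(1, int(round(base_hw[0] * s)))
        w = max(1, int(round(base_hw[1] * s)))
        y0 = max(0, cy - h // 2)
        x0 = max(0, cx - w // 2)
        patch = image[y0:y0 + h, x0:x0 + w]
        # Nearest-neighbour resize keeps the sketch dependency-free.
        ys = (np.arange(out[0]) * patch.shape[0] / out[0]).astype(int)
        xs = (np.arange(out[1]) * patch.shape[1] / out[1]).astype(int)
        samples.append(patch[np.ix_(ys, xs)])
    return np.stack(samples)                   # (n_scales, out_h, out_w)
```

Because the scale filter operates over this one-dimensional stack, only a handful of scales need to be evaluated per frame instead of a full grid of translations at every scale, which is where the efficiency of a separate scale filter comes from.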

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2017
Keywords
Visual tracking; scale estimation; correlation filters
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-139382 (URN)10.1109/TPAMI.2016.2609928 (DOI)000404606300006 ()27654137 (PubMedID)
Note

Funding Agencies|Swedish Foundation for Strategic Research; Swedish Research Council; Strategic Vehicle Research and Innovation (FFI); Wallenberg Autonomous Systems Program; National Supercomputer Centre; Nvidia

Available from: 2017-08-07 Created: 2017-08-07 Last updated: 2025-02-07. Bibliographically approved
Danelljan, M., Häger, G., Khan, F. S. & Felsberg, M. (2016). Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): . Paper presented at 29th IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June 2016, Las Vegas, NV, USA (pp. 1430-1438). Institute of Electrical and Electronics Engineers (IEEE)
Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
2016 (English)In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1430-1438Conference paper, Published paper (Refereed)
Abstract [en]

Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
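A minimal version of such a joint formulation can be sketched for a linear least-squares appearance model. The alternating updates and the quadratic weight penalty below are illustrative simplifications, not the paper's exact loss:

```python
import numpy as np

def decontaminate(X, y, n_iter=5, lam=1e-2, mu=5.0):
    """Jointly fit a linear model w and sample quality weights alpha by
    alternating minimisation of one loss:
        L(w, a) = sum_k a_k * (x_k . w - y_k)^2 + lam*||w||^2 + mu*||a||^2
        subject to a >= 0 and sum(a) = 1.
    X: (K, D) one feature row per stored training sample, y: (K,) targets.
    """
    K, D = X.shape
    alpha = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # w-step: weighted ridge regression, closed form.
        w = np.linalg.solve(X.T @ (alpha[:, None] * X) + lam * np.eye(D),
                            X.T @ (alpha * y))
        # alpha-step: from the KKT conditions of the simplex-constrained
        # quadratic, alpha_k = (nu - r_k) / (2*mu); clipping at zero and
        # renormalising approximates the exact projection.
        r = (X @ w - y) ** 2
        nu = (2.0 * mu + r.sum()) / K
        alpha = np.maximum(0.0, (nu - r) / (2.0 * mu))
        alpha /= alpha.sum()
    return w, alpha
```

Samples with a large residual under the current model, typically occlusions or misaligned crops, end up with weight close to zero, so the next w-step is fitted almost entirely on clean data.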

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Series
IEEE Conference on Computer Vision and Pattern Recognition, E-ISSN 1063-6919 ; 2016
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-137882 (URN)10.1109/CVPR.2016.159 (DOI)000400012301051 ()9781467388511 (ISBN)9781467388528 (ISBN)
Conference
29th IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June 2016, Las Vegas, NV, USA
Note

Funding Agencies|SSF (CUAS); VR (EMC2); VR (ELLIIT); Wallenberg Autonomous Systems Program; NSC; Nvidia

Available from: 2017-06-01 Created: 2017-06-01 Last updated: 2025-02-07. Bibliographically approved
Berg, A., Felsberg, M., Häger, G. & Ahlberg, J. (2016). An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge. In: : . Paper presented at Swedish Symposium on Image Analysis.
An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge
2016 (English)Conference paper, Oral presentation only (Other academic)
Abstract [en]

The Thermal Infrared Visual Object Tracking (VOT-TIR2015) Challenge was organized in conjunction with ICCV 2015. It was the first benchmark on short-term, single-target tracking in thermal infrared (TIR) sequences. The challenge aimed at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. It was based on the VOT2013 Challenge, but introduced the following novelties: (i) the utilization of the LTIR (Linköping TIR) dataset, (ii) adaptation of the VOT2013 attributes to thermal data, (iii) an evaluation similar to that of VOT2015. This paper provides an overview of the VOT-TIR2015 Challenge as well as the results of the 24 participating trackers.

Series
Svenska sällskapet för automatiserad bildanalys (SSBA)
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-127598 (URN)
Conference
Swedish Symposium on Image Analysis
Available from: 2016-05-03 Created: 2016-05-03 Last updated: 2025-02-07. Bibliographically approved
Häger, G., Bhat, G., Danelljan, M., Khan, F. S., Felsberg, M., Rudol, P. & Doherty, P. (2016). Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV. In: Proceedings of the 12th International Symposium on Advances in Visual Computing: . Paper presented at International Symposium on Advances in Visual Computing. Springer
Open this publication in new window or tab >>Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV
Show others...
2016 (English)In: Proceedings of the 12th International Symposium on Advances in Visual Computing, Springer, 2016Conference paper, Published paper (Refereed)
Abstract [en]

Visual object tracking performance has improved significantly in recent years. Most trackers are based on either of two paradigms: online learning of an appearance model or the use of a pre-trained object detector. Methods based on online learning provide high accuracy, but are prone to model drift. Model drift occurs when the tracker fails to correctly estimate the tracked object's position. Methods based on a detector, on the other hand, typically have good long-term robustness but reduced accuracy compared to online methods.

Despite the complementarity of the aforementioned approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion of an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting that includes scenarios which are typically not well represented in standard visual tracking datasets.
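A minimal 1-D version of such a fusion can be written as successive Kalman measurement updates. The positions and noise levels below are hypothetical, and the actual state-estimation framework is of course richer than this scalar sketch:

```python
def kalman_update(x, P, z, R):
    """Scalar Kalman measurement update: prior mean x and variance P,
    measurement z with noise variance R."""
    K = P / (P + R)                  # Kalman gain
    return x + K * (z - x), (1.0 - K) * P

# Predicted target position from the motion model, then fuse a frequent
# but drift-prone tracker output with a sparse but unbiased detection.
x, P = 10.0, 4.0
x, P = kalman_update(x, P, z=12.0, R=0.5)   # tracker: low noise
x, P = kalman_update(x, P, z=9.0, R=2.0)    # detector: higher noise
```

When the detector repeatedly disagrees with the tracker, the growing innovation (z - x) can serve as a signal that the online-learned tracker has drifted and should be re-initialised.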

Place, publisher, year, edition, pages
Springer, 2016
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-137897 (URN)10.1007/978-3-319-50835-1_50 (DOI)2-s2.0-85007039301 (Scopus ID)978-3-319-50834-4 (ISBN)978-3-319-50835-1 (ISBN)
Conference
International Symposium on Advances in Visual Computing
Available from: 2017-05-31 Created: 2017-05-31 Last updated: 2025-02-07. Bibliographically approved
Danelljan, M., Häger, G., Khan, F. S. & Felsberg, M. (2015). Coloring Channel Representations for Visual Tracking. In: Rasmus R. Paulsen, Kim S. Pedersen (Ed.), 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings: . Paper presented at Scandinavian Conference on Image Analysis (pp. 117-129). Springer, 9127
Coloring Channel Representations for Visual Tracking
2015 (English)In: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen, Kim S. Pedersen, Springer, 2015, Vol. 9127, p. 117-129Conference paper, Published paper (Refereed)
Abstract [en]

Visual object tracking is a classical, but still open, research problem in computer vision, with many real-world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically inspired channel representations [Felsberg, 2006] have been shown to provide promising results in many applications ranging from autonomous driving to visual tracking.

This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking.

Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color features, together with an optimal fusion strategy, significantly outperforms the standard luminance-based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.
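Channel coding is a soft-histogram encoding. The sketch below uses a linear (triangular) basis and ten channels as illustrative choices, and shows the two coloring strategies compared above, concatenation and product, for a pair of color values:

```python
import numpy as np

def channel_encode(values, n_channels=10):
    """Encode values in [0, 1] with overlapping triangular basis functions.
    Each value activates its two nearest channels, so the encoding is a
    smooth, robust alternative to a hard histogram."""
    centers = np.linspace(0.0, 1.0, n_channels)
    width = centers[1] - centers[0]
    d = np.abs(np.asarray(values)[..., None] - centers) / width
    return np.maximum(0.0, 1.0 - d)

r = np.array([0.2])                 # e.g. red chromaticity of a pixel
g = np.array([0.7])                 # green chromaticity
concat = np.concatenate([channel_encode(r), channel_encode(g)], axis=-1)
prod = (channel_encode(r)[..., :, None]
        * channel_encode(g)[..., None, :]).reshape(r.shape[0], -1)
```

Concatenation keeps the dimensionality low (2 x 10 here) but discards joint color statistics, while the product encoding (10 x 10) captures them at a higher cost, which is the trade-off the two strategies embody.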

Place, publisher, year, edition, pages
Springer, 2015
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9127
Keywords
Visual tracking, channel coding, color names
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-121003 (URN)10.1007/978-3-319-19665-7_10 (DOI)978-3-319-19664-0 (ISBN)978-3-319-19665-7 (ISBN)
Conference
Scandinavian Conference on Image Analysis
Available from: 2015-09-02 Created: 2015-09-02 Last updated: 2025-02-07. Bibliographically approved
Danelljan, M., Häger, G., Khan, F. S. & Felsberg, M. (2015). Convolutional Features for Correlation Filter Based Visual Tracking. In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW): . Paper presented at 15th IEEE International Conference on Computer Vision Workshops, ICCVW 2015, 7-13 December 2015, Santiago, Chile (pp. 621-629). IEEE conference proceedings
Convolutional Features for Correlation Filter Based Visual Tracking
2015 (English)In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), IEEE conference proceedings, 2015, p. 621-629Conference paper, Published paper (Refereed)
Abstract [en]

Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features on the visual tracking problem. We propose to use activations from the convolutional layer of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need for task-specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, and in contrast to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.
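The key property of the first convolutional layer, dense feature maps at full spatial resolution, can be illustrated without any deep learning framework. The random filter bank below stands in for learned conv-1 weights, an assumption made purely for the sketch:

```python
import numpy as np

def conv_features(image, filters):
    """First-layer-style features: one rectified response map per filter,
    at the same spatial resolution as the input (circular convolution
    computed via the FFT)."""
    F_img = np.fft.fft2(image)
    maps = [np.real(np.fft.ifft2(F_img * np.fft.fft2(f, s=image.shape)))
            for f in filters]
    return np.maximum(0.0, np.stack(maps, axis=-1))   # ReLU
```

A fully connected layer would collapse the (H, W) grid into a single vector, destroying the spatial structure a correlation filter needs; shallow convolutional maps preserve it, which is consistent with the first-layer result reported above.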

Place, publisher, year, edition, pages
IEEE conference proceedings, 2015
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-128869 (URN)10.1109/ICCVW.2015.84 (DOI)000380434700075 ()9781467397117 (ISBN)9781467397100 (ISBN)
Conference
15th IEEE International Conference on Computer Vision Workshops, ICCVW 2015, 7-13 December 2015, Santiago, Chile
Available from: 2016-06-02 Created: 2016-06-02 Last updated: 2025-02-07. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-6199-9362
