Bhat, Goutam
Publications (7 of 7)
Danelljan, M., Bhat, G., Gladh, S., Khan, F. S. & Felsberg, M. (2019). Deep motion and appearance cues for visual tracking. Pattern Recognition Letters, 124, 74-81
Deep motion and appearance cues for visual tracking
2019 (English) In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 124, p. 74-81. Article in journal (Refereed) Published
Abstract [en]

Generic visual tracking is a challenging computer vision problem with numerous applications. Most existing approaches rely on appearance information, employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle when appearance information is ambiguous, leading to tracking failure. In such cases, we argue that the motion cue provides discriminative and complementary information that can improve tracking performance. In contrast to visual tracking, deep motion features have been applied successfully to action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled video. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
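
As an illustration of the fusion scheme the abstract describes, the following minimal sketch combines per-cue detection score maps by a weighted sum. It is not the authors' implementation; the cue names, map sizes, and weights are hypothetical placeholders.

import numpy as np

def fuse_score_maps(score_maps, weights):
    """Fuse per-cue detection score maps by a weighted sum.

    score_maps: list of HxW confidence maps, one per feature type
                (hand-crafted, deep RGB, deep motion).
    weights:    one scalar weight per map.
    """
    fused = np.zeros_like(score_maps[0])
    for scores, w in zip(score_maps, weights):
        fused += w * scores
    return fused

# Example: three 50x50 score maps, one per cue type (random stand-ins).
hog_scores = np.random.rand(50, 50)    # hand-crafted appearance (e.g. HOG)
rgb_scores = np.random.rand(50, 50)    # deep RGB features
flow_scores = np.random.rand(50, 50)   # deep motion features (flow CNN)

fused = fuse_score_maps([hog_scores, rgb_scores, flow_scores],
                        weights=[0.3, 0.4, 0.3])
# The estimated target position is the peak of the fused map.
row, col = np.unravel_index(np.argmax(fused), fused.shape)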

Place, publisher, year, edition, pages
Elsevier, 2019
Keywords
Visual tracking, Deep learning, Optical flow, Discriminative correlation filters
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:liu:diva-148015 (URN)
10.1016/j.patrec.2018.03.009 (DOI)
000469427700008 (ISI)
2-s2.0-85044328745 (Scopus ID)
Note

Funding agencies: Swedish Foundation for Strategic Research; Swedish Research Council [2016-05543]; Wallenberg Autonomous Systems Program; Swedish National Infrastructure for Computing (SNIC); Nvidia

Available from: 2018-05-24 Created: 2018-05-24 Last updated: 2019-06-24. Bibliographically approved
Bhat, G., Danelljan, M., Khan, F. S. & Felsberg, M. (2018). Combining Local and Global Models for Robust Re-detection. In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018. Paper presented at 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 27-30 November 2018, Auckland, New Zealand (pp. 25-30). Institute of Electrical and Electronics Engineers (IEEE)
Combining Local and Global Models for Robust Re-detection
2018 (English) In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 25-30. Conference paper, Published paper (Refereed)
Abstract [en]

Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.
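
To make the three-step pipeline concrete, here is a hedged sketch of the re-detection logic: a confidence test flags target loss, after which the globally trained model contributes to re-detection. The threshold and the equal blending weights are assumptions for illustration, not the paper's exact formulation.

import numpy as np

LOSS_THRESHOLD = 0.25  # assumed confidence cutoff, for illustration only

def track_step(local_scores, global_scores):
    """One tracking step; both inputs are HxW detection score maps."""
    confidence = local_scores.max()   # step 1: tracking confidence measure
    if confidence < LOSS_THRESHOLD:
        # Steps 2-3: the target is presumed lost, so the global model,
        # trained on hard-negative background distractors, is combined
        # with the local DCF prediction to re-detect the target.
        combined = 0.5 * local_scores + 0.5 * global_scores
    else:
        combined = local_scores
    position = np.unravel_index(np.argmax(combined), combined.shape)
    return position, confidence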

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2018
National Category
Computer Vision and Robotics (Autonomous Systems); Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-158403 (URN)
10.1109/AVSS.2018.8639159 (DOI)
000468081400005 (ISI)
9781538692943 (ISBN)
9781538692936 (ISBN)
9781538692950 (ISBN)
Conference
15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 27-30 November 2018, Auckland, New Zealand
Note

Funding agencies: SSF (SymbiCloud); VR (EMC2) [2016-05543]; CENIIT grant [18.14]; SNIC; WASP

Available from: 2019-06-28 Created: 2019-06-28 Last updated: 2019-10-30. Bibliographically approved
Johnander, J., Bhat, G., Danelljan, M., Khan, F. S. & Felsberg, M. (2018). On the Optimization of Advanced DCF-Trackers. In: Laura Leal-Taixé and Stefan Roth (Eds.), Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part I. Paper presented at European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8-14 September, 2018 (pp. 54-69). Cham: Springer Publishing Company
On the Optimization of Advanced DCF-Trackers
2018 (English) In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part I / [ed] Laura Leal-Taixé and Stefan Roth, Cham: Springer Publishing Company, 2018, p. 54-69. Conference paper, Published paper (Refereed)
Abstract [en]

Trackers based on discriminative correlation filters (DCF) have recently seen widespread success, and in this work we dive into their numerical core. DCF-based trackers interleave learning of the target detector and target state inference based on this detector. Whereas the original formulation includes a closed-form solution for the filter learning, recently introduced improvements to the framework no longer have known closed-form solutions. Instead, a large-scale linear least-squares problem must be solved each time the detector is updated. We analyze the procedure used to optimize the detector, letting the popular scheme introduced with ECO serve as a baseline. The ECO implementation is revisited in detail and several of its mechanisms are provided with alternatives. With comprehensive experiments we show which configurations are superior in terms of tracking capabilities and optimization performance.
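
Since the improved formulations lack a closed-form solution, each detector update amounts to minimizing ||Af - b||^2 iteratively. Below is a minimal numpy sketch of the Conjugate Gradient scheme applied to the normal equations A^T A f = A^T b, the kind of solver the ECO baseline builds on; in the actual trackers A is never formed explicitly, the products being carried out with Fourier-domain convolutions.

import numpy as np

def conjugate_gradient(A, b, n_iter=20, tol=1e-8):
    """Solve the normal equations A^T A x = A^T b for the filter x."""
    x = np.zeros(A.shape[1])
    r = A.T @ b - A.T @ (A @ x)       # residual of the normal equations
    p = r.copy()                      # initial search direction
    rs_old = r @ r
    for _ in range(n_iter):
        Ap = A.T @ (A @ p)
        alpha = rs_old / (p @ Ap)     # optimal step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:     # converged
            break
        p = r + (rs_new / rs_old) * p # conjugate update of the direction
        rs_old = rs_new
    return x

# Example: an overdetermined 200x50 least-squares problem.
A = np.random.randn(200, 50)
b = np.random.randn(200)
f = conjugate_gradient(A, b)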

Place, publisher, year, edition, pages
Cham: Springer Publishing Company, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11129
National Category
Engineering and Technology; Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-161036 (URN)
10.1007/978-3-030-11009-3_2 (DOI)
9783030110086 (ISBN)
9783030110093 (ISBN)
Conference
European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8-14 September, 2018
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2019-10-17 Created: 2019-10-17 Last updated: 2019-10-30. Bibliographically approved
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Zajc, L. C., . . . He, Z. (2018). The Sixth Visual Object Tracking VOT2018 Challenge Results. In: Laura Leal-Taixé and Stefan Roth (Eds.), Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8–14, 2018, Proceedings, Part I. Paper presented at Computer Vision – ECCV 2018 Workshops, Munich, Germany, September 8–14, 2018 (pp. 3-53). Cham: Springer Publishing Company
The Sixth Visual Object Tracking VOT2018 Challenge Results
2018 (English) In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8–14, 2018, Proceedings, Part I / [ed] Laura Leal-Taixé and Stefan Roth, Cham: Springer Publishing Company, 2018, p. 3-53. Conference paper, Published paper (Refereed)
Abstract [en]

The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis, as well as a "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking sub-challenge has been introduced to the set of standard VOT sub-challenges. The new sub-challenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled, and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking sub-challenges. Performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).
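
As a schematic illustration of such a real-time experiment, the sketch below feeds frames at a fixed sensor rate and scores a tracker that overruns its frame budget with its previous prediction. The frame rate and the tracker interface (init/update) are assumptions for illustration, not the VOT toolkit's actual protocol or API.

import time

FRAME_INTERVAL = 1.0 / 20.0   # assumed sensor rate of 20 fps

def run_realtime(tracker, frames):
    """Evaluate a tracker as if frames came from a live sensor."""
    last_box = tracker.init(frames[0])      # hypothetical interface
    boxes = [last_box]
    for frame in frames[1:]:
        start = time.perf_counter()
        box = tracker.update(frame)         # hypothetical interface
        elapsed = time.perf_counter() - start
        # If the update overran the frame budget, the sensor has already
        # moved on: the stale prediction is kept for this frame.
        last_box = box if elapsed <= FRAME_INTERVAL else last_box
        boxes.append(last_box)
    return boxes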

Place, publisher, year, edition, pages
Cham: Springer Publishing Company, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11129
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Sciences
Identifiers
urn:nbn:se:liu:diva-161343 (URN)
10.1007/978-3-030-11009-3_1 (DOI)
9783030110086 (ISBN)
9783030110093 (ISBN)
Conference
Computer Vision – ECCV 2018 Workshops, Munich, Germany, September 8–14, 2018
Available from: 2019-10-30 Created: 2019-10-30 Last updated: 2020-01-22. Bibliographically approved
Bhat, G., Johnander, J., Danelljan, M., Khan, F. S. & Felsberg, M. (2018). Unveiling the power of deep tracking. In: Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu and Yair Weiss (Eds.), Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part II. Paper presented at 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8-14 September, 2018 (pp. 493-509). Cham: Springer Publishing Company
Unveiling the power of deep tracking
2018 (English) In: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part II / [ed] Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu and Yair Weiss, Cham: Springer Publishing Company, 2018, p. 493-509. Conference paper, Published paper (Refereed)
Abstract [en]

In the field of generic object tracking, numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on hand-crafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify limited training data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top-performing tracker from the challenge with a relative gain of >17% in EAO.
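
A hedged sketch of what such an adaptive fusion can look like: each model's score map is weighted by a quality measure before blending, so that the sharper, more confident prediction dominates. The peak-sharpness measure used here is an illustrative stand-in, not the paper's actual criterion.

import numpy as np

def map_quality(scores):
    """Crude quality measure: higher for a single sharp peak."""
    return (scores.max() - scores.mean()) / (scores.std() + 1e-8)

def adaptive_fuse(shallow_scores, deep_scores):
    """Blend score maps with weights derived from their quality."""
    q_shallow = map_quality(shallow_scores)  # shallow: accurate localization
    q_deep = map_quality(deep_scores)        # deep: robust to appearance change
    w = q_shallow / (q_shallow + q_deep)
    return w * shallow_scores + (1.0 - w) * deep_scores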

Place, publisher, year, edition, pages
Cham: Springer Publishing Company, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11206
National Category
Computer Vision and Robotics (Autonomous Systems); Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-161032 (URN)
10.1007/978-3-030-01216-8_30 (DOI)
9783030012151 (ISBN)
9783030012168 (ISBN)
Conference
15th European Conference on Computer Vision (ECCV), Munich, Germany, 8-14 September, 2018
Available from: 2019-10-17 Created: 2019-10-17 Last updated: 2019-10-30. Bibliographically approved
Järemo-Lawin, F., Danelljan, M., Tosteberg, P., Bhat, G., Khan, F. S. & Felsberg, M. (2017). Deep Projective 3D Semantic Segmentation. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Eds.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. Paper presented at 17th International Conference on Computer Analysis of Images and Patterns (CAIP 2017), Ystad, Sweden, August 22-24, 2017 (pp. 95-107). Springer
Deep Projective 3D Semantic Segmentation
2017 (English) In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, p. 95-107. Conference paper, Published paper (Refereed)
Abstract [en]

Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results. Such methods require voxelization of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets.
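
To make the memory argument concrete, a back-of-the-envelope example (generic numbers, not taken from the paper): the memory of a dense voxel grid grows cubically with its resolution.

# A 256^3 occupancy grid with one float32 value per voxel:
resolution = 256
voxels = resolution ** 3       # 16,777,216 cells
bytes_total = voxels * 4       # 4 bytes per float32
print(bytes_total / 2**20)     # ~64 MiB for a single channel

Doubling the resolution to 512 multiplies this by eight.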

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
Keywords
Point clouds, Semantic segmentation, Deep learning, Multi-stream deep networks
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145374 (URN)
10.1007/978-3-319-64689-3_8 (DOI)
000432085900008 (ISI)
2-s2.0-85028506569 (Scopus ID)
9783319646886 (ISBN)
9783319646893 (ISBN)
Conference
17th International Conference on Computer Analysis of Images and Patterns (CAIP 2017), Ystad, Sweden, August 22-24, 2017
Note

Funding agencies: EU [644839]; Swedish Research Council [2014-6227]; Swedish Foundation for Strategic Research [RIT 15-0097]; VR starting grant [2016-05543]

Available from: 2018-02-26 Created: 2018-02-26 Last updated: 2018-10-10. Bibliographically approved
Danelljan, M., Bhat, G., Khan, F. S. & Felsberg, M. (2017). ECO: Efficient Convolution Operators for Tracking. In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Paper presented at 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July 2017, Honolulu, HI, USA (pp. 6931-6939). Institute of Electrical and Electronics Engineers (IEEE)
ECO: Efficient Convolution Operators for Tracking
2017 (English) In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 6931-6939. Conference paper, Published paper (Refereed)
Abstract [en]

In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever-increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, which significantly reduces memory and time complexity while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top-ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.
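
A schematic sketch of the factorized convolution in (i): instead of learning one filter per feature channel (D filters), the model learns a D x C projection matrix plus only C basis filters, with C much smaller than D. The shapes below, and the elementwise product used in place of a full convolution, are simplifications for illustration, not ECO's actual code.

import numpy as np

D, C = 512, 64        # feature channels vs. basis filters
H, W = 16, 16         # spatial filter size

# Parameter count: full model vs. factorized model.
full_params = D * H * W                 # 131,072
factorized_params = D * C + C * H * W   # 49,152

def factorized_response(features, P, basis_filters):
    """Detection scores from the factorized operator.

    features:      D x H x W feature map of the search region.
    P:             D x C projection matrix (learned jointly).
    basis_filters: C x H x W learned basis filters.
    """
    # Project the D feature channels down to C channels...
    projected = np.einsum('dc,dhw->chw', P, features)
    # ...then apply one filter per projected channel and sum the
    # responses (elementwise product stands in for convolution here).
    return np.sum(projected * basis_filters, axis=0)

scores = factorized_response(np.random.randn(D, H, W),
                             np.random.randn(D, C),
                             np.random.randn(C, H, W))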

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
Series
IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919 ; 2017
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-144284 (URN)
10.1109/CVPR.2017.733 (DOI)
000418371407004 (ISI)
9781538604571 (ISBN)
9781538604588 (ISBN)
Conference
30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July 2017, Honolulu, HI, USA
Note

Funding agencies: SSF (SymbiCloud); VR (EMC2) [2016-05543]; SNIC; WASP; Visual Sweden; Nvidia

Available from: 2018-01-12 Created: 2018-01-12 Last updated: 2019-06-26. Bibliographically approved