Khan, Fahad Shahbaz
Publications (10 of 33)
Danelljan, M., Bhat, G., Gladh, S., Khan, F. S. & Felsberg, M. (2019). Deep motion and appearance cues for visual tracking. Pattern Recognition Letters, 124, 74-81
Deep motion and appearance cues for visual tracking
2019 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 124, p. 74-81. Article in journal (Refereed). Published.
Abstract [en]

Generic visual tracking is a challenging computer vision problem, with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that motion cue provides discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
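
As an illustration of the fusion described above, the sketch below builds a feature stack from hand-crafted, deep RGB and deep motion channels, with the motion channels computed from an optical-flow image. It is a minimal sketch, not the authors' code: appearance_cnn, motion_cnn and hog are hypothetical feature extractors standing in for the pretrained networks and hand-crafted features used in the paper, and the flow-image encoding shown is just one common choice.

```python
# Illustrative sketch only; `appearance_cnn`, `motion_cnn` and `hog` are
# hypothetical callables standing in for pretrained networks / hand-crafted features.
import numpy as np
import cv2  # OpenCV, used here only for Farneback optical flow

def deep_motion_features(prev_gray, curr_gray, motion_cnn):
    """Compute an optical-flow image and pass it through a motion CNN."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.sqrt(flow[..., 0] ** 2 + flow[..., 1] ** 2)
    # Encode flow as a 3-channel image (x-flow, y-flow, magnitude); one common choice.
    flow_img = np.dstack([flow[..., 0], flow[..., 1], mag])
    return motion_cnn(flow_img)

def fused_features(rgb_patch, prev_gray, curr_gray, appearance_cnn, motion_cnn, hog):
    """Concatenate hand-crafted, deep RGB and deep motion feature channels."""
    f_hog = hog(rgb_patch)                              # hand-crafted features
    f_rgb = appearance_cnn(rgb_patch)                   # deep appearance features
    f_mot = deep_motion_features(prev_gray, curr_gray, motion_cnn)
    return np.concatenate([f_hog, f_rgb, f_mot], axis=-1)  # channel-wise fusion
```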

Place, publisher, year, edition, pages
Elsevier, 2019
Keywords
Visual tracking, Deep learning, Optical flow, Discriminative correlation filters
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:liu:diva-148015 (URN); 10.1016/j.patrec.2018.03.009 (DOI); 000469427700008 (ISI); 2-s2.0-85044328745 (Scopus ID)
Note

Funding agencies: Swedish Foundation for Strategic Research; Swedish Research Council [2016-05543]; Wallenberg Autonomous Systems Program; Swedish National Infrastructure for Computing (SNIC); Nvidia

Available from: 2018-05-24. Created: 2018-05-24. Last updated: 2019-06-24. Bibliographically approved.
Häger, G., Felsberg, M. & Khan, F. S. (2018). Countering bias in tracking evaluations. In: Francisco Imai, Alain Tremeau and Jose Braz (Ed.), Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: . Paper presented at 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, January 27-29, Funchal, Madeira (pp. 581-587). Science and Technology Publications, Lda, 5
Countering bias in tracking evaluations
2018 (English). In: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications / [ed] Francisco Imai, Alain Tremeau and Jose Braz, Science and Technology Publications, Lda, 2018, Vol. 5, p. 581-587. Conference paper, Published paper (Refereed).
Abstract [en]

Recent years have witnessed a significant leap in visual object tracking performance mainly due to powerful features, sophisticated learning methods and the introduction of benchmark datasets. Despite this significant improvement, the evaluation of state-of-the-art object trackers still relies on the classical intersection over union (IoU) score. In this work, we argue that the object tracking evaluations based on classical IoU score are sub-optimal. As our first contribution, we theoretically prove that the IoU score is biased in the case of large target objects and favors over-estimated target prediction sizes. As our second contribution, we propose a new score that is unbiased with respect to target prediction size. We systematically evaluate our proposed approach on benchmark tracking data with variations in relative target size. Our empirical results clearly suggest that the proposed score is unbiased in general.
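
The asymmetry can be seen already in a toy example: for a ground-truth box and two concentric predictions whose side lengths are over- and under-estimated by the same relative amount, plain IoU scores the larger prediction higher. The snippet below is only this toy check, not the paper's formal proof or its proposed unbiased score.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

gt    = (0, 0, 100, 100)
over  = (-10, -10, 110, 110)   # side length 20% too large
under = (10, 10, 90, 90)       # side length 20% too small
print(iou(gt, over), iou(gt, under))   # ~0.694 vs 0.640: over-estimation is favored
```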

Place, publisher, year, edition, pages
Science and Technology Publications, Lda, 2018
National Category
Signal Processing
Identifiers
urn:nbn:se:liu:diva-151306 (URN); 10.5220/0006714805810587 (DOI); 9789897582905 (ISBN)
Conference
13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, January 27-29, Funchal, Madeira
Available from: 2018-09-17. Created: 2018-09-17. Last updated: 2019-06-26. Bibliographically approved.
Järemo Lawin, F., Danelljan, M., Khan, F. S., Forssén, P.-E. & Felsberg, M. (2018). Density Adaptive Point Set Registration. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition: . Paper presented at The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, United States, 18-22 June, 2018 (pp. 3829-3837). IEEE
Density Adaptive Point Set Registration
2018 (English). In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018, p. 3829-3837. Conference paper, Published paper (Refereed).
Abstract [en]

Probabilistic methods for point set registration have demonstrated competitive results in recent years. These techniques estimate a probability distribution model of the point clouds. While such a representation has shown promise, it is highly sensitive to variations in the density of 3D points. This fundamental problem is primarily caused by changes in the sensor location across point sets.    We revisit the foundations of the probabilistic registration paradigm. Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes. Both the probabilistic model of the scene and the registration parameters are inferred by minimizing the Kullback-Leibler divergence in an Expectation Maximization based framework. Our density-adaptive registration successfully handles severe density variations commonly encountered in terrestrial Lidar applications. We perform extensive experiments on several challenging real-world Lidar datasets. The results demonstrate that our approach outperforms state-of-the-art probabilistic methods for multi-view registration, without the need of re-sampling.
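
To make the EM-based registration paradigm concrete, the sketch below registers a moving point set to a fixed Gaussian mixture: the E-step computes component responsibilities and the M-step solves a weighted rigid alignment (Kabsch) to the induced targets. It is a schematic sketch with a fixed isotropic variance; it does not implement the density-adaptive latent scene model that is the paper's contribution.

```python
import numpy as np

def em_rigid_register(X, mu, pi, sigma2=0.01, iters=20):
    """X: (N,3) moving points; mu: (K,3) mixture means; pi: (K,) mixture weights."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        Xt = X @ R.T + t
        # E-step: responsibilities of each mixture component for each point
        d2 = ((Xt[:, None, :] - mu[None, :, :]) ** 2).sum(-1)        # (N, K)
        logp = np.log(pi)[None, :] - d2 / (2.0 * sigma2)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted rigid alignment (Kabsch) to the induced per-point targets
        w = r.sum(axis=1)                                            # (N,)
        Y = (r @ mu) / w[:, None]
        xm = (w[:, None] * X).sum(0) / w.sum()
        ym = (w[:, None] * Y).sum(0) / w.sum()
        H = (w[:, None] * (X - xm)).T @ (Y - ym)
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = ym - R @ xm
    return R, t
```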

Place, publisher, year, edition, pages
IEEE, 2018
Series
IEEE Conference on Computer Vision and Pattern Recognition
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-149774 (URN); 10.1109/CVPR.2018.00403 (DOI); 000457843603101 (ISI); 978-1-5386-6420-9 (ISBN)
Conference
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, United States, 18-22 June, 2018
Note

Funding Agencies|EUs Horizon 2020 Programme [644839]; CENIIT grant [18.14]; VR grant: EMC2 [2014-6227]; VR grant [2016-05543]; VR grant: LCMM [2014-5928]

Available from: 2018-07-18. Created: 2018-07-18. Last updated: 2019-06-19. Bibliographically approved.
Eldesokey, A., Felsberg, M. & Khan, F. S. (2018). Propagating Confidences through CNNs for Sparse Data Regression. In: : . Paper presented at The 29th British Machine Vision Conference (BMVC), Northumbria University, Newcastle upon Tyne, England, UK, 3-6 September, 2018.
Propagating Confidences through CNNs for Sparse Data Regression
2018 (English). Conference paper, Oral presentation with published abstract (Refereed).
Abstract [en]

In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance. To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods. Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.
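
The core mechanism can be illustrated with a plain normalized convolution over a sparse map and its confidence: the data are weighted by confidence before filtering, the result is renormalized by the filtered confidence, and a new confidence map is propagated to the next layer. This is a minimal sketch with a fixed non-negative kernel; the published method additionally learns the (constrained) filter weights end to end and trains with the confidence-aware objective described above.

```python
import numpy as np
from scipy.signal import convolve2d

def confidence_conv(x, c, w, eps=1e-8):
    """x: sparse data map; c: confidence map in [0, 1]; w: non-negative kernel."""
    num = convolve2d(x * c, w, mode="same")   # confidence-weighted data
    den = convolve2d(c, w, mode="same")       # accumulated confidence
    y = num / (den + eps)                     # normalized output values
    c_out = den / w.sum()                     # propagated confidence for the next layer
    return y, c_out

# Example: a 5x5 depth map with only two valid samples
x = np.zeros((5, 5)); c = np.zeros((5, 5))
x[1, 1], c[1, 1] = 2.0, 1.0
x[3, 3], c[3, 3] = 4.0, 1.0
y, c_out = confidence_conv(x, c, np.ones((3, 3)))
```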

National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-149648 (URN)
Conference
The 29th British Machine Vision Conference (BMVC), Northumbria University, Newcastle upon Tyne, England, UK, 3-6 September, 2018
Available from: 2018-07-13. Created: 2018-07-13. Last updated: 2018-10-09. Bibliographically approved.
Johnander, J., Danelljan, M., Khan, F. S. & Felsberg, M. (2017). DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Ed.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I (pp. 55-67). Springer, 10424
DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, p. 55-67. Conference paper, Published paper (Refereed).
Abstract [en]

Discriminative Correlation Filter (DCF) based methods have shown competitive performance on tracking benchmarks in recent years. Generally, DCF based trackers learn a rigid appearance model of the target. However, this reliance on a single rigid appearance model is insufficient in situations where the target undergoes non-rigid transformations. In this paper, we propose a unified formulation for learning a deformable convolution filter. In our framework, the deformable filter is represented as a linear combination of sub-filters. Both the sub-filter coefficients and their relative locations are inferred jointly in our formulation. Experiments are performed on three challenging tracking benchmarks: OTB-2015, TempleColor and VOT2016. Our approach improves the baseline method, leading to performance comparable to state-of-the-art.
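
The deformable filter described above can be written as a sum of sub-filter responses placed at their respective offsets. The sketch below evaluates such a response for a single feature channel with given coefficients and integer offsets; in the paper both the coefficients and the sub-filter locations are inferred jointly, which the sketch does not attempt.

```python
import numpy as np

def deformable_response(x, sub_filters, offsets, coeffs):
    """x: (H, W) feature channel; sub_filters: list of (H, W) filters;
    offsets: list of (dy, dx) integer shifts; coeffs: list of scalars."""
    X = np.fft.fft2(x)
    resp = np.zeros(x.shape)
    for f, (dy, dx), a in zip(sub_filters, offsets, coeffs):
        r = np.real(np.fft.ifft2(X * np.conj(np.fft.fft2(f))))  # circular correlation
        resp += a * np.roll(r, shift=(dy, dx), axis=(0, 1))     # place at the offset
    return resp
```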

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145373 (URN); 10.1007/978-3-319-64689-3_5 (DOI); 000432085900005 (ISI); 9783319646886 (ISBN); 9783319646893 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I
Note

Funding agencies: SSF (SymbiCloud); VR (EMC2) [2016-05543]; SNIC; WASP; Nvidia

Available from: 2018-02-26. Created: 2018-02-26. Last updated: 2018-10-16. Bibliographically approved.
Järemo-Lawin, F., Danelljan, M., Tosteberg, P., Bhat, G., Khan, F. S. & Felsberg, M. (2017). Deep Projective 3D Semantic Segmentation. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Ed.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I (pp. 95-107). Springer
Deep Projective 3D Semantic Segmentation
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, p. 95-107. Conference paper, Published paper (Refereed).
Abstract [en]

Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results. Such methods require voxelizations of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets.

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
Keywords
Point clouds, Semantic segmentation, Deep learning, Multi-stream deep networks
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145374 (URN); 10.1007/978-3-319-64689-3_8 (DOI); 000432085900008 (ISI); 2-s2.0-85028506569 (Scopus ID); 9783319646886 (ISBN); 9783319646893 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I
Note

Funding agencies: EU [644839]; Swedish Research Council [2014-6227]; Swedish Foundation for Strategic Research [RIT 15-0097]; VR starting grant [2016-05543]

Available from: 2018-02-26. Created: 2018-02-26. Last updated: 2018-10-10. Bibliographically approved.
Danelljan, M., Bhat, G., Khan, F. S. & Felsberg, M. (2017). ECO: Efficient Convolution Operators for Tracking. In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): . Paper presented at 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July 2017, Honolulu, HI, USA (pp. 6931-6939). Institute of Electrical and Electronics Engineers (IEEE)
ECO: Efficient Convolution Operators for Tracking
2017 (English). In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 6931-6939. Conference paper, Published paper (Refereed).
Abstract [en]

In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.
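
The factorized convolution operator in (i) can be pictured as a learned projection that compresses the D feature channels to C << D channels before correlation, so far fewer filter coefficients are needed. The sketch below only evaluates such a factorized response for a fixed projection P and fixed filters; in ECO the projection, the filters, the generative sample model and the update strategy are optimized jointly, which is not shown.

```python
import numpy as np

def factorized_response(x, P, filters):
    """x: (H, W, D) feature map; P: (D, C) projection matrix; filters: (H, W, C)."""
    z = x @ P                                  # compressed (H, W, C) features
    resp = np.zeros(x.shape[:2])
    for c in range(z.shape[-1]):
        Zc = np.fft.fft2(z[..., c])
        Fc = np.fft.fft2(filters[..., c])
        resp += np.real(np.fft.ifft2(Zc * np.conj(Fc)))   # sum of per-channel correlations
    return resp
```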

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
Series
IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919 ; 2017
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-144284 (URN); 10.1109/CVPR.2017.733 (DOI); 000418371407004 (ISI); 9781538604571 (ISBN); 9781538604588 (ISBN)
Conference
30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July 2017, Honolulu, HI, USA
Note

Funding Agencies|SSF (SymbiCloud); VR (EMC2) [2016-05543]; SNIC; WASP; Visual Sweden; Nvidia

Available from: 2018-01-12. Created: 2018-01-12. Last updated: 2019-06-26. Bibliographically approved.
Eldesokey, A., Felsberg, M. & Khan, F. S. (2017). Ellipse Detection for Visual Cyclists Analysis “In the Wild”. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Ed.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I (pp. 319-331). Springer, 10424
Ellipse Detection for Visual Cyclists Analysis “In the Wild”
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, p. 319-331. Conference paper, Published paper (Refereed).
Abstract [en]

Autonomous driving safety is becoming a paramount issue due to the emergence of many autonomous vehicle prototypes. The safety measures ensure that autonomous vehicles are safe to operate among pedestrians, cyclists and conventional vehicles. While safety measures for pedestrians have been widely studied in literature, little attention has been paid to safety measures for cyclists. Visual cyclists analysis is a challenging problem due to the complex structure and dynamic nature of the cyclists. The dynamic model used for cyclists analysis heavily relies on the wheels. In this paper, we investigate the problem of ellipse detection for visual cyclists analysis in the wild. Our first contribution is the introduction of a new challenging annotated dataset for bicycle wheels, collected in real-world urban environment. Our second contribution is a method that combines reliable arcs selection and grouping strategies for ellipse detection. The reliable selection and grouping mechanism leads to robust ellipse detections when combined with the standard least square ellipse fitting approach. Our experiments clearly demonstrate that our method provides improved results, both in terms of accuracy and robustness in challenging urban environment settings.
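
The final fitting step mentioned above is the standard least-squares algebraic fit of a conic to edge points; a minimal version is sketched below. It fits A x^2 + B xy + C y^2 + D x + E y = 1 without any ellipse-specific constraint, and it omits the arc selection and grouping strategies that are the paper's actual contribution.

```python
import numpy as np

def fit_conic_lsq(pts):
    """pts: (N, 2) array of (x, y) edge points assumed to belong to one ellipse."""
    x, y = pts[:, 0], pts[:, 1]
    M = np.column_stack([x * x, x * y, y * y, x, y])
    coeffs, *_ = np.linalg.lstsq(M, np.ones(len(pts)), rcond=None)
    return coeffs   # (A, B, C, D, E) of the implicit conic
```

Without the constraint B^2 - 4AC < 0, this plain fit can return a non-ellipse conic on noisy or short arcs, which is one reason reliable arc selection and grouping before fitting matter for robustness.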

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145372 (URN); 10.1007/978-3-319-64689-3_26 (DOI); 000432085900026 (ISI); 9783319646886 (ISBN); 9783319646893 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I
Note

Funding agencies: VR (EMC2, ELLIIT, starting grant) [2016-05543]; Vinnova (Cykla)

Available from: 2018-02-26. Created: 2018-02-26. Last updated: 2018-10-17. Bibliographically approved.
Danelljan, M., Meneghetti, G., Khan, F. S. & Felsberg, M. (2016). A Probabilistic Framework for Color-Based Point Set Registration. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): . Paper presented at 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27-30 June 2016, Las Vegas, NV, USA (pp. 1818-1826). Institute of Electrical and Electronics Engineers (IEEE)
A Probabilistic Framework for Color-Based Point Set Registration
2016 (English). In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1818-1826. Conference paper, Published paper (Refereed).
Abstract [en]

In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.
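
A compact way to picture the joint model is to give each mixture component a spatial Gaussian and a categorical distribution over quantized colors, so both likelihoods multiply in the E-step. The sketch below computes such joint responsibilities for one transformed point set; this parametrization is a plausible stand-in for illustration, not necessarily the paper's exact model.

```python
import numpy as np

def joint_responsibilities(Xt, col_bins, mu, pi, theta, sigma2):
    """Xt: (N, 3) transformed points; col_bins: (N,) quantized color indices;
    mu: (K, 3) means; pi: (K,) weights; theta: (K, B) per-component color histograms."""
    d2 = ((Xt[:, None, :] - mu[None, :, :]) ** 2).sum(-1)          # (N, K) squared distances
    logp = (np.log(pi)[None, :] - d2 / (2.0 * sigma2)
            + np.log(theta[:, col_bins]).T)                        # add color log-likelihood
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    return r / r.sum(axis=1, keepdims=True)
```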

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Series
IEEE Conference on Computer Vision and Pattern Recognition, E-ISSN 1063-6919 ; 2016
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-137883 (URN); 10.1109/CVPR.2016.201 (DOI); 000400012301093 (ISI); 9781467388511 (ISBN); 9781467388528 (ISBN)
Conference
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27-30 June 2016, Las Vegas, NV, USA
Note

Funding Agencies|SSF (VPS); VR (EMC2); Vinnova (iQMatic); EUs Horizon 2020 RI program grant [644839]; Wallenberg Autonomous Systems Program; NSC; Nvidia

Available from: 2017-06-01. Created: 2017-06-01. Last updated: 2019-06-26. Bibliographically approved.
Danelljan, M., Häger, G., Khan, F. S. & Felsberg, M. (2016). Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): . Paper presented at 29th IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June 2016, Las Vegas, NV, USA (pp. 1430-1438). Institute of Electrical and Electronics Engineers (IEEE)
Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
2016 (English). In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1430-1438. Conference paper, Published paper (Refereed).
Abstract [en]

Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
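
The joint formulation can be pictured as an alternating minimization: with the sample weights fixed, the appearance model is a weighted (ridge) regression; with the model fixed, the weights are re-estimated from per-sample residuals so corrupted samples are down-weighted. The sketch below uses a simple truncated-linear weight update as a stand-in; the paper minimizes a single joint loss with its own weight regularizer, which is not reproduced here.

```python
import numpy as np

def decontaminate(X, y, lam=1.0, tau=5.0, iters=10):
    """X: (m, n, d) per-sample feature matrices; y: (m, n) desired responses."""
    m = X.shape[0]
    alpha = np.full(m, 1.0 / m)                       # sample quality weights
    for _ in range(iters):
        # (1) weighted ridge regression for the appearance model w
        A = sum(a * Xj.T @ Xj for a, Xj in zip(alpha, X))
        b = sum(a * Xj.T @ yj for a, Xj, yj in zip(alpha, X, y))
        w = np.linalg.solve(A + lam * np.eye(A.shape[0]), b)
        # (2) re-estimate weights from residuals: low error -> high weight
        r = np.array([np.sum((Xj @ w - yj) ** 2) for Xj, yj in zip(X, y)])
        alpha = np.maximum(1.0 - r / tau, 0.0)
        alpha = alpha / alpha.sum() if alpha.sum() > 0 else np.full(m, 1.0 / m)
    return w, alpha
```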

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Series
IEEE Conference on Computer Vision and Pattern Recognition, E-ISSN 1063-6919 ; 2016
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-137882 (URN); 10.1109/CVPR.2016.159 (DOI); 000400012301051 (ISI); 9781467388511 (ISBN); 9781467388528 (ISBN)
Conference
29th IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June 2016, Las Vegas, NV, USA
Note

Funding Agencies|SSF (CUAS); VR (EMC2); VR (ELLIIT); Wallenberg Autonomous Systems Program; NSC; Nvidia

Available from: 2017-06-01. Created: 2017-06-01. Last updated: 2019-06-27. Bibliographically approved.