liu.se: Search publications in DiVA
  • 1.
    Anwer, Rao Muhammad
    Aalto Univ, Finland.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Two-Stream Part-based Deep Representation for Human Attribute Recognition. 2018. In: 2018 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), IEEE, 2018, pp. 90-97. Conference paper (Refereed)
    Abstract [en]

    Recognizing human attributes in unconstrained environments is a challenging computer vision problem. State-of-the-art approaches to human attribute recognition are based on convolutional neural networks (CNNs). The de facto practice when training these CNNs on a large labeled image dataset is to take RGB pixel values of an image as input to the network. In this work, we propose a two-stream part-based deep representation for human attribute classification. Besides the standard RGB stream, we train a deep network by using mapped coded images with explicit texture information, which complements the standard RGB deep model. To integrate human body part knowledge, we employ deformable part-based models together with our two-stream deep model. Experiments are performed on the challenging Human Attributes (HAT-27) Dataset consisting of 27 different human attributes. Our results clearly show that (a) the two-stream deep network provides a consistent gain in performance over the standard RGB model and (b) the attribute classification results are further improved with our two-stream part-based deep representations, leading to state-of-the-art results.

  • 2.
    Anwer, Rao Muhammad
    Aalto Univ, Finland.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Molinier, Matthieu
    VTT Tech Res Ctr Finland Ltd, Finland.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification. 2018. In: ISPRS journal of photogrammetry and remote sensing (Print), ISSN 0924-2716, E-ISSN 1872-8235, Vol. 138, pp. 74-85. Journal article (Refereed)
    Abstract [en]

    Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification. 
(C) 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
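
The Local Binary Patterns (LBP) coding that the abstract above builds on can be illustrated with a minimal sketch. It assumes the basic 3x3, 8-neighbour LBP operator on a grayscale array; the function name `lbp_8neighbour` and the bit ordering are illustrative choices, and the step that maps the codes into the images fed to TEX-Nets is not reproduced here.

```python
import numpy as np


def lbp_8neighbour(gray):
    """Return the basic 3x3 LBP code for every interior pixel of a 2-D array.

    Hypothetical helper for illustration only: each of the 8 neighbours
    contributes one bit, set when the neighbour is >= the centre pixel.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
    # This ordering is an assumption; other conventions only permute the bits.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.uint8) << np.uint8(bit)
    return codes
```

On a constant image every neighbour equals the centre, so all eight bits are set and each interior pixel receives code 255; an isolated bright pixel receives code 0. The output is two pixels smaller than the input in each dimension because border pixels lack a full neighbourhood.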

  • 3.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Incept Inst Artificial Intelligence, U Arab Emirates.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Combining Local and Global Models for Robust Re-detection. 2018. In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, pp. 25-30. Conference paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples, used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.

  • 4.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Johnander, Joakim
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Unveiling the power of deep tracking. 2018. In: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part II / [ed] Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu and Yair Weiss, Cham: Springer Publishing Company, 2018, pp. 493-509. Conference paper (Refereed)
    Abstract [en]

    In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of >17% in EAO.

  • 5.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Gladh, Susanna
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Deep motion and appearance cues for visual tracking. 2019. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 124, pp. 74-81. Journal article (Refereed)
    Abstract [en]

    Generic visual tracking is a challenging computer vision problem, with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.

  • 6.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    ECO: Efficient Convolution Operators for Tracking. 2017. In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2017, pp. 6931-6939. Conference paper (Refereed)
    Abstract [en]

    In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, which significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.

  • 7.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Accurate Scale Estimation for Robust Visual Tracking. 2014. In: Proceedings of the British Machine Vision Conference 2014 / [ed] Michel Valstar, Andrew French and Tony Pridmore, BMVA Press, 2014. Conference paper (Refereed)
    Abstract [en]

    Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel approach for robust scale estimation in a tracking-by-detection framework. The proposed approach works by learning discriminative correlation filters based on a scale pyramid representation. We learn separate filters for translation and scale estimation, and show that this improves the performance compared to an exhaustive scale search. Our scale estimation approach is generic as it can be incorporated into any tracking method with no inherent scale estimation.

    Experiments are performed on 28 benchmark sequences with significant scale variations. Our results show that the proposed approach significantly improves the performance by 18.8 % in median distance precision compared to our baseline. Finally, we provide both quantitative and qualitative comparison of our approach with state-of-the-art trackers in the literature. The proposed method is shown to outperform the best existing tracker by 16.6 % in median distance precision, while operating in real time.
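
A discriminative correlation filter of the kind the abstract above refers to can be sketched in the Fourier domain. This is a simplified single-channel, translation-only filter in the style of MOSSE (a swapped-in, simpler technique), not the paper's scale-pyramid method; the names `gaussian_label`, `train_filter`, and `detect`, and the values of `sigma` and `lam`, are assumptions made for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired filter response: a Gaussian peaked at index (0, 0), wrapped circularly."""
    h, w = shape
    ys = np.arange(h)
    xs = np.arange(w)
    # Circular distance to index 0 along each axis.
    ys = np.minimum(ys, h - ys)
    xs = np.minimum(xs, w - xs)
    return np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2.0 * sigma ** 2))


def train_filter(patch, desired, lam=1e-2):
    """Closed-form ridge-regression filter, solved independently per frequency."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(desired)
    # lam regularises frequencies where the training patch has little energy.
    return (G * np.conj(F)) / (F * np.conj(F) + lam)


def detect(filter_hat, patch):
    """Apply the filter to a new patch and return the (row, col) of the response peak."""
    Z = np.fft.fft2(patch)
    response = np.real(np.fft.ifft2(filter_hat * Z))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)
```

Because the filter is trained to produce a peak at index (0, 0) on the training patch, the peak location on a circularly shifted copy of that patch equals the shift. A scale filter in the same spirit would apply the same closed-form solution along a scale dimension instead of the two spatial dimensions.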

  • 8.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Discriminative Scale Space Tracking. 2017. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 39, no. 8, pp. 1561-1575. Journal article (Refereed)
    Abstract [en]

    Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when faced with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.

  • 9.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking. 2016. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, pp. 1430-1438. Conference paper (Refereed)
    Abstract [en]

    Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.

  • 10.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Coloring Channel Representations for Visual Tracking. 2015. In: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen, Kim S. Pedersen, Springer, 2015, Vol. 9127, pp. 117-129. Conference paper (Refereed)
    Abstract [en]

    Visual object tracking is a classical, but still open research problem in computer vision, with many real world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically-inspired channel representations [felsberg06PAMI] have been shown to provide promising results in many applications ranging from autonomous driving to visual tracking.

    This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking.

    Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color features, together with an optimal fusion strategy, significantly outperforms the standard luminance based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.

  • 11.
    Danelljan, Martin
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Häger, Gustav
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Khan, Fahad Shahbaz
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Felsberg, Michael
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Convolutional Features for Correlation Filter Based Visual Tracking. 2015. In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), IEEE conference proceedings, 2015, pp. 621-629. Conference paper (Refereed)
    Abstract [en]

    Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features for the visual tracking problem. We propose to use activations from the convolutional layer of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need for task specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, in contrast to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.

  • 12.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Learning Spatially Regularized Correlation Filters for Visual Tracking. 2015. In: Proceedings of the International Conference in Computer Vision (ICCV), 2015, IEEE Computer Society, 2015, pp. 4310-4318. Conference paper (Refereed)
    Abstract [en]

    Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model.

    We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.

  • 13.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Granström, Karl
    Linköpings universitet, Institutionen för systemteknik, Reglerteknik. Linköpings universitet, Tekniska högskolan.
    Heintz, Fredrik
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska högskolan.
    Rudol, Piotr
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska högskolan.
    Wzorek, Mariusz
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska högskolan.
    Kvarnström, Jonas
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska högskolan.
    Doherty, Patrick
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska högskolan.
    A Low-Level Active Vision Framework for Collaborative Unmanned Aircraft Systems. 2015. In: COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I / [ed] Lourdes Agapito, Michael M. Bronstein and Carsten Rother, Springer Publishing Company, 2015, Vol. 8925, pp. 223-237. Conference paper (Refereed)
    Abstract [en]

    Micro unmanned aerial vehicles are becoming increasingly interesting for aiding and collaborating with human agents in myriads of applications, but in particular they are useful for monitoring inaccessible or dangerous areas. In order to interact with and monitor humans, these systems need robust and real-time computer vision subsystems that allow them to detect and follow persons.

    In this work, we propose a low-level active vision framework to accomplish these challenging tasks. Based on the LinkQuad platform, we present a system study that implements the detection and tracking of people under fully autonomous flight conditions, keeping the vehicle within a certain distance of a person. The framework integrates state-of-the-art methods from visual detection and tracking, Bayesian filtering, and AI-based control. The results from our experiments clearly suggest that the proposed framework performs real-time detection and tracking of persons in complex scenarios.

  • 14.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Meneghetti, Giulia
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    A Probabilistic Framework for Color-Based Point Set Registration. 2016. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, pp. 1818-1826. Conference paper (Refereed)
    Abstract [en]

    In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.

  • 15.
    Danelljan, Martin
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Meneghetti, Giulia
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Aligning the Dissimilar: A Probabilistic Feature-Based Point Set Registration Approach. 2016. In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR) 2016, IEEE, 2016, pp. 247-252. Conference paper (Refereed)
    Abstract [en]

    3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D-points, and only recently has color information been integrated into such a framework, significantly improving registration accuracy. Other than local color information, high-dimensional 3D shape features have been successfully employed in many applications such as action recognition and 3D object recognition. In this paper, we propose a probabilistic framework to integrate high-dimensional 3D shape features with color information for point set registration. The 3D shape features are distinctive and provide complementary information beneficial for robust registration. We validate our proposed framework by performing comprehensive experiments on the challenging Stanford Lounge dataset, acquired by an RGB-D sensor, and an outdoor dataset captured by a Lidar sensor. The results clearly demonstrate that our approach provides superior results both in terms of robustness and accuracy compared to state-of-the-art probabilistic methods.

  • 16.
    Danelljan, Martin
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Robinson, Andreas
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking. 2016. In: Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part V / [ed] Bastian Leibe, Jiri Matas, Nicu Sebe and Max Welling, Cham: Springer, 2016, pp. 472-488. Conference paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.

  • 17.
    Danelljan, Martin
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Shahbaz Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    van de Weijer, Joost
    Computer Vision Center, CS Dept. Universitat Autonoma de Barcelona, Spain.
    Adaptive Color Attributes for Real-Time Visual Tracking. 2014. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014, IEEE Computer Society, 2014, pp. 1090-1097. Conference paper (Refereed)
    Abstract [en]

    Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient and possess a certain amount of photometric invariance while maintaining high discriminative power.

    This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.

  • 18.
    Eldesokey, Abdelrahman
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Confidence Propagation through CNNs for Guided Sparse Depth Regression. 2019. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828. Article in journal (Refereed)
    Abstract [en]

    Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open research problem with numerous applications in autonomous driving, robotics, and surveillance. In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. We also propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. To integrate structural information, we also investigate fusion strategies to combine depth and RGB information in our normalized convolution network framework. In addition, we introduce the use of output confidence as auxiliary information to improve the results. The capabilities of our normalized convolution network framework are demonstrated for the problem of scene depth completion. Comprehensive experiments are performed on the KITTI-Depth and the NYU-Depth-v2 datasets. The results clearly demonstrate that the proposed approach achieves superior performance while requiring only about 1-5% of the number of parameters compared to the state-of-the-art methods.
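
    The normalized convolution idea named in this abstract can be illustrated with a minimal 1D NumPy sketch. It is not the authors' layer: the function name `normalized_conv1d`, the `eps` term, and the rule used for the propagated confidence (filtered confidence divided by the sum of the weights) are assumptions made for illustration, and the non-negativity check only stands in for the algebraic constraint mentioned above.

```python
import numpy as np

def normalized_conv1d(signal, confidence, weights, eps=1e-8):
    """Hypothetical 1D normalized convolution for sparse input.

    signal:     measured values (arbitrary where confidence is 0)
    confidence: per-sample confidence in [0, 1]
    weights:    non-negative filter weights (assumed constraint)
    """
    signal = np.asarray(signal, dtype=float)
    confidence = np.asarray(confidence, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")

    # Filter the confidence-weighted signal and the confidence itself.
    # np.convolve flips the kernel, which is irrelevant for symmetric weights.
    numerator = np.convolve(signal * confidence, weights, mode="same")
    denominator = np.convolve(confidence, weights, mode="same")

    # Normalizing by the filtered confidence ignores missing samples.
    output = numerator / (denominator + eps)

    # Assumed propagation rule: fraction of filter mass that saw valid data.
    confidence_out = denominator / weights.sum()
    return output, confidence_out

# Sparse input: only every third sample carries a measurement.
sparse = np.array([5.0, 0.0, 0.0, 5.0, 0.0, 0.0, 5.0])
conf = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0])
dense, conf_out = normalized_conv1d(sparse, conf, np.ones(3))
```

    In this toy input each output position sees exactly one valid neighbour, so the missing samples are filled from the nearest measurement while the propagated confidence drops to one third.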

  • 19.
    Eldesokey, Abdelrahman
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Ellipse Detection for Visual Cyclists Analysis “In the Wild”. 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, pp. 319-331. Conference paper (Refereed)
    Abstract [en]

    Autonomous driving safety is becoming a paramount issue due to the emergence of many autonomous vehicle prototypes. The safety measures ensure that autonomous vehicles are safe to operate among pedestrians, cyclists and conventional vehicles. While safety measures for pedestrians have been widely studied in the literature, little attention has been paid to safety measures for cyclists. Visual cyclists analysis is a challenging problem due to the complex structure and dynamic nature of the cyclists. The dynamic model used for cyclists analysis relies heavily on the wheels. In this paper, we investigate the problem of ellipse detection for visual cyclists analysis in the wild. Our first contribution is the introduction of a new challenging annotated dataset for bicycle wheels, collected in a real-world urban environment. Our second contribution is a method that combines reliable arc selection and grouping strategies for ellipse detection. The reliable selection and grouping mechanism leads to robust ellipse detections when combined with the standard least-squares ellipse fitting approach. Our experiments clearly demonstrate that our method provides improved results, both in terms of accuracy and robustness, in challenging urban environment settings.
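
    As a rough illustration of the least-squares fitting step mentioned in this abstract, the sketch below fits a general conic to 2D points with a plain algebraic least-squares formulation solved by SVD. This is an assumption for illustration and not necessarily the variant used in the paper; the helper names `fit_conic_least_squares` and `conic_center` are hypothetical, and the arc selection and grouping stages are not shown.

```python
import numpy as np

def fit_conic_least_squares(x, y):
    """Fit A x^2 + B xy + C y^2 + D x + E y + F = 0 to points.

    Returns the unit-norm coefficient vector (A, B, C, D, E, F) that
    minimizes the algebraic residual in the least-squares sense.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    # The right singular vector of the smallest singular value minimizes
    # ||design @ coeffs|| subject to ||coeffs|| = 1.
    _, _, vt = np.linalg.svd(design)
    return vt[-1]

def conic_center(coeffs):
    """Center of a central conic: the point where the gradient vanishes."""
    A, B, C, D, E, _ = coeffs
    return np.linalg.solve([[2 * A, B], [B, 2 * C]], [-D, -E])

# Synthetic wheel-like ellipse: center (4, -1), semi-axes 3 and 2.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
x = 4.0 + 3.0 * np.cos(t)
y = -1.0 + 2.0 * np.sin(t)

coeffs = fit_conic_least_squares(x, y)
center = conic_center(coeffs)
# A conic is an ellipse when its discriminant B^2 - 4AC is negative.
is_ellipse = coeffs[1] ** 2 - 4.0 * coeffs[0] * coeffs[2] < 0
```

    On noisy or partial arcs this unconstrained fit can return a hyperbola or parabola, which is one reason a careful choice of input arcs matters.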

  • 20.
    Eldesokey, Abdelrahman
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Inception Institute of Artificial Intelligence Abu Dhabi, UAE.
    Propagating Confidences through CNNs for Sparse Data Regression. 2019. In: British Machine Vision Conference 2018, BMVC 2018, BMVA Press, 2019. Conference paper (Refereed)
    Abstract [en]

    In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance. To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods. Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.

  • 21.
    Felsberg, Michael
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Berg, Amanda
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Termisk Systemteknik AB, Linköping, Sweden.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Ahlberg, Jörgen
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Termisk Systemteknik AB, Linköping, Sweden.
    Kristan, Matej
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, United Kingdom.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute of Technology, Austria.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Nebehay, Georg
    Austrian Institute of Technology, Austria.
    Pflugfelder, Roman
    Austrian Institute of Technology, Austria.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Garcia-Martin, Alvaro
    Universidad Autonoma de Madrid, Spain.
    Saffari, Amir
    Affectv, United Kingdom.
    Li, Ang
    Xi’an Jiaotong University.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Zhao, Baojun
    Beijing Institute of Technology, China.
    Schmid, Cordelia
    INRIA Grenoble Rhône-Alpes, France.
    Chen, Dapeng
    Xi’an Jiaotong University.
    Du, Dawei
    University at Albany, USA.
    Shahbaz Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Porikli, Fatih
    Australian National University, Australia.
    Zhu, Gao
    Australian National University, Australia.
    Zhu, Guibo
    NLPR, Chinese Academy of Sciences, China.
    Lu, Hanqing
    NLPR, Chinese Academy of Sciences, China.
    Kieritz, Hilke
    Fraunhofer IOSB, Germany.
    Li, Hongdong
    Australian National University, Australia.
    Qi, Honggang
    University at Albany, USA.
    Jeong, Jae-chan
    Electronics and Telecommunications Research Institute, Korea.
    Cho, Jae-il
    Electronics and Telecommunications Research Institute, Korea.
    Lee, Jae-Yeong
    Electronics and Telecommunications Research Institute, Korea.
    Zhu, Jianke
    Zhejiang University, China.
    Li, Jiatong
    University of Technology, Australia.
    Feng, Jiayi
    Institute of Automation, Chinese Academy of Sciences, China.
    Wang, Jinqiao
    NLPR, Chinese Academy of Sciences, China.
    Kim, Ji-Wan
    Electronics and Telecommunications Research Institute, Korea.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    Universidad Autónoma de Madrid, Spain.
    Xue, Kai
    INRIA Grenoble Rhône-Alpes, France.
    Alahari, Karteek
    INRIA Grenoble Rhône-Alpes, France.
    Ma, Liang
    Harbin Engineering University, China.
    Ke, Lipeng
    University at Albany, USA.
    Wen, Longyin
    University at Albany, USA.
    Bertinetto, Luca
    Oxford University, United Kingdom.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Tang, Ming
    Institute of Automation, Chinese Academy of Sciences, China.
    Chang, Ming-Ching
    University at Albany, USA.
    Miksik, Ondrej
    Oxford University, United Kingdom.
    Torr, Philip H S
    Oxford University, United Kingdom.
    Martin-Nieto, Rafael
    Universidad Autónoma de Madrid, Spain.
    Laganiere, Robert
    University of Ottawa, Canada.
    Hare, Sam
    Obvious Engineering, United Kingdom.
    Lyu, Siwei
    University at Albany, USA.
    Zhu, Song-Chun
    University of California, USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Hicks, Stephen L
    Oxford University, United Kingdom.
    Golodetz, Stuart
    Oxford University, United Kingdom.
    Choi, Sunglok
    Electronics and Telecommunications Research Institute, Korea.
    Wu, Tianfu
    University of California, USA.
    Hubner, Wolfgang
    Fraunhofer IOSB, Germany.
    Zhao, Xu
    Institute of Automation, Chinese Academy of Sciences, China.
    Hua, Yang
    INRIA Grenoble Rhône-Alpes, France.
    Li, Yang
    Zhejiang University, China.
    Lu, Yang
    University of California, USA.
    Li, Yuezun
    University at Albany, USA.
    Yuan, Zejian
    Xi’an Jiaotong University.
    Hong, Zhibin
    University of Technology, Australia.
    The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results. 2015. In: Proceedings of the IEEE International Conference on Computer Vision, Institute of Electrical and Electronics Engineers (IEEE), 2015, pp. 639-651. Conference paper (Refereed)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.

  • 22.
    Felsberg, Michael
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Kristan, Matej
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, England.
    Pflugfelder, Roman
    Austrian Institute Technology, Austria.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Berg, Amanda
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för systemteknik, Datorseende. Termisk Syst Tekn AB, Linkoping, Sweden.
    Eldesokey, Abdelrahman
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Ahlberg, Jörgen
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Termisk Syst Tekn AB, Linkoping, Sweden.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute Technology, Austria.
    Petrosino, Alfredo
    Parthenope University of Naples, Italy.
    Garcia-Martin, Alvaro
    University of Autonoma Madrid, Spain.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Varfolomieiev, Anton
    Kyiv Polytech Institute, Ukraine.
    Erdem, Aykut
    Hacettepe University, Turkey.
    Han, Bohyung
    POSTECH, South Korea.
    Chang, Chang-Ming
    University of Albany, GA USA.
    Du, Dawei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Erdem, Erkut
    Hacettepe University, Turkey.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Porikli, Fatih
    ARC Centre Excellence Robot Vis, Australia; CSIRO, Australia.
    Zhao, Fei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bunyak, Filiz
    University of Missouri, MO 65211 USA.
    Battistone, Francesco
    Parthenope University of Naples, Italy.
    Zhu, Gao
    University of Missouri, Columbia, USA.
    Seetharaman, Guna
    US Navy, DC 20375 USA.
    Li, Hongdong
    ARC Centre Excellence Robot Vis, Australia.
    Qi, Honggang
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bischof, Horst
    Graz University of Technology, Austria.
    Possegger, Horst
    Graz University of Technology, Austria.
    Nam, Hyeonseob
    NAVER Corp, South Korea.
    Valmadre, Jack
    University of Oxford, England.
    Zhu, Jianke
    Zhejiang University, Peoples R China.
    Feng, Jiayi
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    University of Autonoma Madrid, Spain.
    Palaniappan, Kannappan
    University of Missouri, MO 65211 USA.
    Lebeda, Karel
    University of Surrey, England.
    Gao, Ke
    University of Missouri, MO 65211 USA.
    Mikolajczyk, Krystian
    Imperial Coll London, England.
    Wen, Longyin
    University of Albany, GA USA.
    Bertinetto, Luca
    University of Oxford, England.
    Poostchi, Mahdieh
    University of Missouri, MO 65211 USA.
    Maresca, Mario
    Parthenope University of Naples, Italy.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Tang, Ming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Baek, Mooyeol
    POSTECH, South Korea.
    Fan, Nana
    Harbin Institute Technology, Peoples R China.
    Al-Shakarji, Noor
    University of Missouri, MO 65211 USA.
    Miksik, Ondrej
    University of Oxford, England.
    Akin, Osman
    Hacettepe University, Turkey.
    Torr, Philip H. S.
    University of Oxford, England.
    Huang, Qingming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Martin-Nieto, Rafael
    University of Autonoma Madrid, Spain.
    Pelapur, Rengarajan
    University of Missouri, MO 65211 USA.
    Bowden, Richard
    University of Surrey, England.
    Laganiere, Robert
    University of Ottawa, Canada.
    Krah, Sebastian B.
    Fraunhofer IOSB, Germany.
    Li, Shengkun
    University of Albany, GA USA.
    Yao, Shizeng
    University of Missouri, MO 65211 USA.
    Hadfield, Simon
    University of Surrey, England.
    Lyu, Siwei
    University of Albany, GA USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Golodetz, Stuart
    University of Oxford, England.
    Hu, Tao
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Mauthner, Thomas
    Graz University of Technology, Austria.
    Santopietro, Vincenzo
    Parthenope University of Naples, Italy.
    Li, Wenbo
    Lehigh University, PA 18015 USA.
    Huebner, Wolfgang
    Fraunhofer IOSB, Germany.
    Li, Xin
    Harbin Institute Technology, Peoples R China.
    Li, Yang
    Zhejiang University, Peoples R China.
    Xu, Zhan
    Zhejiang University, Peoples R China.
    He, Zhenyu
    Harbin Institute Technology, Peoples R China.
    The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results. 2016. In: Computer Vision – ECCV 2016 Workshops. ECCV 2016. / [ed] Hua G., Jégou H., SPRINGER INT PUBLISHING AG, 2016, pp. 824-849. Conference paper (Refereed)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge; the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, the VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensates for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.

  • 23.
    Gladh, Susanna
    et al.
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Danelljan, Martin
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Khan, Fahad Shahbaz
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Felsberg, Michael
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Deep motion features for visual tracking. 2016. In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR), 2016, Institute of Electrical and Electronics Engineers (IEEE), 2016, pp. 1243-1248. Conference paper (Refereed)
    Abstract [en]

    Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.

  • 24.
    Grelsson, Bertil
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Robinson, Andreas
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    HorizonNet for visual terrain navigation. 2018. In: Proceedings of 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS), Institute of Electrical and Electronics Engineers (IEEE), 2018, pp. 149-155. Conference paper (Refereed)
    Abstract [en]

    This paper investigates the problem of position estimation of unmanned surface vessels (USVs) operating in coastal areas or in the archipelago. We propose a position estimation method where the horizon line is extracted in a 360 degree panoramic image around the USV. We design a CNN architecture to determine an approximate horizon line in the image and implicitly determine the camera orientation (the pitch and roll angles). The panoramic image is warped to compensate for the camera orientation and to generate an image from an approximately level camera. A second CNN architecture is designed to extract the pixelwise horizon line in the warped image. The extracted horizon line is correlated with digital elevation model (DEM) data in the Fourier domain using a MOSSE correlation filter. Finally, we determine the location of the maximum correlation score over the search area to estimate the position of the USV. Comprehensive experiments are performed in a field trial in the archipelago. Our approach provides promising results by achieving position estimates with GPS-level accuracy.
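
    The Fourier-domain correlation step in this abstract can be illustrated with a toy 1D example. The sketch below uses plain circular cross-correlation via the FFT, not the MOSSE filter or the DEM rendering used in the paper; the random "horizon profile" and all variable names are hypothetical stand-ins.

```python
import numpy as np

def circular_cross_correlation(reference, observed):
    """Return corr[k] = sum_n reference[n] * observed[(n + k) % N].

    Computed in the Fourier domain, which costs O(N log N) instead of
    O(N^2) for evaluating every shift directly.
    """
    ref_f = np.fft.fft(reference)
    obs_f = np.fft.fft(observed)
    return np.real(np.fft.ifft(np.conj(ref_f) * obs_f))

# Hypothetical horizon elevation profile, one sample per degree of heading.
rng = np.random.default_rng(0)
reference_profile = rng.standard_normal(360)

# Simulate an observation that is the same profile rotated by 17 degrees.
true_shift = 17
observed_profile = np.roll(reference_profile, true_shift)

scores = circular_cross_correlation(reference_profile, observed_profile)
estimated_shift = int(np.argmax(scores))
```

    The location of the correlation maximum recovers the simulated rotation; in the setting described above the maximum is instead sought over candidate positions in a search area.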

  • 25.
    Häger, Gustav
    et al.
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Bhat, Goutam
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Danelljan, Martin
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Khan, Fahad Shahbaz
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Felsberg, Michael
    Linköpings universitet, Tekniska högskolan. Linköpings universitet, Institutionen för systemteknik, Datorseende.
    Rudol, Piotr
    Linköpings universitet, Tekniska högskolan.
    Doherty, Patrick
    Linköpings universitet, Tekniska högskolan.
    Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV. 2016. In: Proceedings of the 12th International Symposium on Advances in Visual Computing, 2016. Conference paper (Refereed)
    Abstract [en]

    Visual object tracking performance has improved significantly in recent years. Most trackers are based on either of two paradigms: online learning of an appearance model or the use of a pre-trained object detector. Methods based on online learning provide high accuracy, but are prone to model drift. The model drift occurs when the tracker fails to correctly estimate the tracked object’s position. Methods based on a detector, on the other hand, typically have good long-term robustness, but reduced accuracy compared to online methods.

    Despite the complementarity of the aforementioned approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion between an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting that includes scenarios that are typically not well represented in standard visual tracking datasets.

  • 26.
    Häger, Gustav
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Countering bias in tracking evaluations. 2018. In: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications / [ed] Francisco Imai, Alain Tremeau and Jose Braz, Science and Technology Publications, Lda, 2018, Vol. 5, pp. 581-587. Conference paper (Refereed)
    Abstract [en]

    Recent years have witnessed a significant leap in visual object tracking performance mainly due to powerful features, sophisticated learning methods and the introduction of benchmark datasets. Despite this significant improvement, the evaluation of state-of-the-art object trackers still relies on the classical intersection over union (IoU) score. In this work, we argue that the object tracking evaluations based on classical IoU score are sub-optimal. As our first contribution, we theoretically prove that the IoU score is biased in the case of large target objects and favors over-estimated target prediction sizes. As our second contribution, we propose a new score that is unbiased with respect to target prediction size. We systematically evaluate our proposed approach on benchmark tracking data with variations in relative target size. Our empirical results clearly suggest that the proposed score is unbiased in general.
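
    For reference, the classical IoU score discussed in this abstract can be computed as in the sketch below. The concentric-square example is a hypothetical illustration of the arithmetic only: it shows one case where over-estimating the box side by a fixed margin scores higher than under-estimating it by the same margin, and it reproduces neither the paper's proof nor its proposed score.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def centered_square(side):
    """Hypothetical helper: a square of the given side centered at the origin."""
    half = side / 2.0
    return (-half, -half, half, half)

ground_truth = centered_square(10.0)
# Prediction too large by 2 units per side: IoU = 100 / 144.
iou_over = iou(ground_truth, centered_square(12.0))
# Prediction too small by 2 units per side: IoU = 64 / 100.
iou_under = iou(ground_truth, centered_square(8.0))
```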

  • 27.
    Johnander, Joakim
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    On the Optimization of Advanced DCF-Trackers. 2018. In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part I / [ed] Laura Leal-Taixé, Stefan Roth, Cham: Springer Publishing Company, 2018, pp. 54-69. Conference paper (Refereed)
    Abstract [en]

    Trackers based on discriminative correlation filters (DCF) have recently seen widespread success and in this work we dive into their numerical core. DCF-based trackers interleave learning of the target detector and target state inference based on this detector. Whereas the original formulation includes a closed-form solution for the filter learning, recently introduced improvements to the framework no longer have known closed-form solutions. Instead a large-scale linear least squares problem must be solved each time the detector is updated. We analyze the procedure used to optimize the detector and let the popular scheme introduced with ECO serve as a baseline. The ECO implementation is revisited in detail and several mechanisms are provided with alternatives. With comprehensive experiments we show which configurations are superior in terms of tracking capabilities and optimization performance.
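
    The closed-form filter learning that the abstract attributes to the original DCF formulation can be sketched as below. This is a minimal sketch of the classic single-channel, single-sample ridge-regression filter solved in the Fourier domain, not the ECO optimizer analyzed in the paper; the function names, the Gaussian label width and the regularization weight `lam` are assumptions made for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at index (0, 0), wrapped circularly."""
    h, w = shape
    dy = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    dx = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(dy ** 2 + dx ** 2) / (2.0 * sigma ** 2))


def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge-regression filter, computed per Fourier coefficient."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    return np.conj(X) * Y / (np.conj(X) * X + lam)


def detect(filt_hat, patch):
    """Apply the filter to a new patch and return the location of the response peak."""
    Z = np.fft.fft2(patch)
    response = np.real(np.fft.ifft2(filt_hat * Z))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)


rng = np.random.default_rng(0)
patch = rng.standard_normal((32, 32))            # stand-in for a feature map
filt_hat = train_filter(patch, gaussian_label(patch.shape))
shifted = np.roll(patch, shift=(3, 5), axis=(0, 1))
print(detect(filt_hat, shifted))                 # peak follows the circular shift: (3, 5)
```

    Each update here costs only a few FFTs. According to the abstract, the later improvements to the framework lose this closed form, so a large-scale linear least squares problem has to be solved iteratively instead.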

  • 28.
    Johnander, Joakim
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking. 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, pp. 55-67. Conference paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filter (DCF) based methods have shown competitive performance on tracking benchmarks in recent years. Generally, DCF based trackers learn a rigid appearance model of the target. However, this reliance on a single rigid appearance model is insufficient in situations where the target undergoes non-rigid transformations. In this paper, we propose a unified formulation for learning a deformable convolution filter. In our framework, the deformable filter is represented as a linear combination of sub-filters. Both the sub-filter coefficients and their relative locations are inferred jointly in our formulation. Experiments are performed on three challenging tracking benchmarks: OTB-2015, TempleColor and VOT2016. Our approach improves the baseline method, leading to performance comparable to state-of-the-art.

  • 29.
    Järemo Lawin, Felix
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Forssén, Per-Erik
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Density Adaptive Point Set Registration. 2018. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018, pp. 3829-3837. Conference paper (Refereed)
    Abstract [en]

    Probabilistic methods for point set registration have demonstrated competitive results in recent years. These techniques estimate a probability distribution model of the point clouds. While such a representation has shown promise, it is highly sensitive to variations in the density of 3D points. This fundamental problem is primarily caused by changes in the sensor location across point sets. We revisit the foundations of the probabilistic registration paradigm. Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes. Both the probabilistic model of the scene and the registration parameters are inferred by minimizing the Kullback-Leibler divergence in an Expectation Maximization based framework. Our density-adaptive registration successfully handles severe density variations commonly encountered in terrestrial Lidar applications. We perform extensive experiments on several challenging real-world Lidar datasets. The results demonstrate that our approach outperforms state-of-the-art probabilistic methods for multi-view registration, without the need for re-sampling.

  • 30.
    Järemo-Lawin, Felix
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Tosteberg, Patrik
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Deep Projective 3D Semantic Segmentation. 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, pp. 95-107. Conference paper (Refereed)
    Abstract [en]

    Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results. Such methods require voxelizations of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets.

  • 31.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Anwer, Rao Muhammad
    Universitat Autonoma de Barcelona, Spain.
    van de Weijer, Joost
    Universitat Autonoma de Barcelona, Spain.
    Bagdanov, Andrew D.
    Universitat Autonoma de Barcelona, Spain.
    Vanrell, Maria
    Universitat Autonoma de Barcelona, Spain.
    Lopez, Antonio M.
    Universitat Autonoma de Barcelona, Spain.
    Color Attributes for Object Detection. 2012. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2012, IEEE, 2012, pp. 3306-3313. Conference paper (Refereed)
    Abstract [en]

    State-of-the-art object detectors typically use shape information as a low level feature representation to capture the local structure of an object. This paper shows that early fusion of shape and color, as is popular in image classification, leads to a significant drop in performance for object detection. Moreover, such approaches also yield suboptimal results for object categories with varying importance of color and shape. In this paper we propose the use of color attributes as an explicit color representation for object detection. Color attributes are compact, computationally efficient, and when combined with traditional shape features provide state-of-the-art results for object detection. Our method is tested on the PASCAL VOC 2007 and 2009 datasets and results clearly show that our method improves over state-of-the-art techniques despite its simplicity. We also introduce a new dataset consisting of cartoon character images in which color plays a pivotal role. On this dataset, our approach yields a significant gain of 14% in mean AP over conventional state-of-the-art methods.

  • 32.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Beigpour, Shida
    Norwegian Colour and Visual Computing Laboratory, Gjovik University College, Gjøvik, Norway.
    van de Weijer, Joost
    Computer Vision Center, CS Dept. Universitat Autonoma de Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Painting-91: a large scale database for computational painting categorization. 2014. In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 25, no. 6, pp. 1385-1397. Article in journal (Refereed)
    Abstract [en]

    Computer analysis of visual art, especially paintings, is an interesting cross-disciplinary research domain. Most of the research in the analysis of paintings involves small to medium-sized datasets with their own specific settings. Interestingly, significant progress has been made in the field of object and scene recognition lately. A key factor in this success is the introduction and availability of benchmark datasets for evaluation. Surprisingly, such a benchmark setup is still missing in the area of computational painting categorization. In this work, we propose a novel large scale dataset of digital paintings. The dataset consists of paintings from 91 different painters. We further show three applications of our dataset, namely artist categorization, style classification and saliency detection. We investigate how local and global features popular in image classification perform for the tasks of artist and style categorization. For both categorization tasks, our experimental results suggest that combining multiple features significantly improves the final performance. We show that state-of-the-art computer vision methods can correctly attribute 50% of unseen paintings to their painter in a large dataset and correctly identify their artistic style in over 60% of the cases. Additionally, we explore the task of saliency detection on paintings and show experimental findings using state-of-the-art saliency estimation algorithms.

  • 33.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Muhammad Anwer, Rao
    Department of Information and Computer Science, Aalto University School of Science, Finland.
    van de Weijer, Joost
    Computer Vision Center, CS Dept. Universitat Autonoma de Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV. Linköpings universitet, Tekniska högskolan.
    Laaksonen, Jorma
    Department of Information and Computer Science, Aalto University School of Science, Finland.
    Compact color–texture description for texture classification. 2015. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 51, pp. 16-22. Article in journal (Refereed)
    Abstract [en]

    Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning to each texture image the label of the class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exist in the literature. However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This gain in performance is achieved at the cost of a high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets, namely KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperform other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7.8%, 4.3% and 5.0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.

  • 34.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Rao, Muhammad Anwer
    Computer vision Center Barcelona, Universitat Autonoma de Barcelona, Spain.
    van de Weijer, Joost
    Computer vision Center Barcelona, Universitat Autonoma de Barcelona, Spain.
    Bagdanov, Andrew
    Media Integration and Communication Center, University of Florence, Florence, Italy.
    Lopez, Antonio
    Computer vision Center Barcelona, Universitat Autonoma de Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Coloring Action Recognition in Still Images. 2013. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 105, no. 3, pp. 205-221. Article in journal (Refereed)
    Abstract [en]

    In this article we investigate the problem of human action recognition in static images. By action recognition we mean a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well for object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. They further demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.

  • 35.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Rao, Muhammad Anwer
    Department of Information and Computer Science, Aalto University School of Science, Aalto, Finland.
    van de Weijer, Joost
    Computer Vision Center, CS Department, Universitet Autonoma de Barcelona, Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Laaksonen, Jorma
    Department of Information and Computer Science, Aalto University School of Science, Aalto, Finland.
    Deep Semantic Pyramids for Human Attributes and Action Recognition. 2015. In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Paulsen, Rasmus R., Pedersen, Kim S., Springer, 2015, Vol. 9127, pp. 341-353. Conference paper (Refereed)
    Abstract [en]

    Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, the semantic pyramids approach [1] for pose normalization has been shown to provide excellent results for gender and action recognition. The performance of the semantic pyramids approach relies on robust image description and is therefore limited by the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have been shown to improve the performance over the conventional shallow features.

    We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attribute classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to the best methods in the literature.

  • 36.
    Khan, Fahad Shahbaz
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Van de Weijer, Joost
    Universitat Autonoma de Barcelona, Spain.
    Ali, Sadiq
    Universitat Autonoma de Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Evaluating the Impact of Color on Texture Recognition. 2013. In: Computer Analysis of Images and Patterns: 15th International Conference, CAIP 2013, York, UK, August 27-29, 2013, Proceedings, Part I / [ed] Richard Wilson, Edwin Hancock, Adrian Bors, William Smith, Springer Berlin/Heidelberg, 2013, pp. 154-162. Conference paper (Refereed)
    Abstract [en]

    State-of-the-art texture descriptors typically operate on grey scale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for the texture categorisation task.

    In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition, we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on the challenging scenes and 10 class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance. Late fusion is the best strategy to combine color and texture. Selecting the best color descriptor with the optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on the scenes and texture datasets.

  • 37.
    Khan, Fahad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    van de Weijer, J.
    Computer Vision Center, CS Department, Universitat Autonoma de Barcelona, Spain.
    Bagdanov, A.D.
    Computer Vision Center, CS Department, Universitat Autonoma de Barcelona, Spain.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Scale coding bag-of-words for action recognition. 2014. In: Pattern Recognition (ICPR), 2014 22nd International Conference on, Institute of Electrical and Electronics Engineers Inc., 2014, no. 6976979, pp. 1514-1519. Conference paper (Refereed)
    Abstract [en]

    Recognizing human actions in still images is a challenging problem in computer vision due to significant scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
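
    The idea of keeping separate small, medium and large scale histograms and concatenating them can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the function name, the threshold values in `cuts`, the use of a single reference size and the L1 normalization are all hypothetical choices.

```python
import numpy as np


def scale_coded_histogram(word_ids, feature_scales, vocab_size,
                          ref_size=None, cuts=(0.1, 0.2)):
    """Concatenate separate bag-of-words histograms for small, medium and large features.

    word_ids       -- visual-word index assigned to each local feature
    feature_scales -- size (e.g. patch width in pixels) of each local feature
    ref_size       -- optional reference size; if given, scales are divided by it,
                      e.g. by the person bounding-box size for the relative variant
    cuts           -- assumed thresholds separating small / medium / large
    """
    word_ids = np.asarray(word_ids, dtype=int)
    scales = np.asarray(feature_scales, dtype=float)
    if ref_size is not None:
        scales = scales / float(ref_size)
    # 0 = small, 1 = medium, 2 = large
    group = np.digitize(scales, cuts)
    hists = [np.bincount(word_ids[group == g], minlength=vocab_size)
             for g in range(3)]
    coded = np.concatenate(hists).astype(float)
    total = coded.sum()
    return coded / total if total > 0 else coded


# Four features from a toy 3-word vocabulary inside a 100-pixel bounding box.
h = scale_coded_histogram([0, 1, 1, 2], [8, 16, 16, 40], vocab_size=3, ref_size=100)
print(h.reshape(3, 3))  # rows: small, medium, large
```

    Summing the three rows recovers the ordinary scale-invariant histogram, so the concatenated form keeps that information and additionally records at which scale each visual word occurred.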

  • 38.
    Khan, Fahad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    van de Weijer, Joost
    University of Autonoma Barcelona, Spain.
    Muhammad Anwer, Rao
    Aalto University, Finland.
    Bagdanov, Andrew D.
    University of Florence, Italy.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Laaksonen, Jorma
    Aalto University, Finland.
    Scale coding bag of deep features for human attribute and action recognition. 2018. In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 29, no. 1, pp. 55-71. Article in journal (Refereed)
    Abstract [en]

    Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.

  • 39.
    Khan, Fahad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    van de Weijer, Joost
    Comp Vis Centre, Spain.
    Muhammad Anwer, Rao
    Aalto University, Finland.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Gatta, Carlo
    Comp Vis Centre, Spain.
    Semantic Pyramids for Gender and Action Recognition. 2014. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no. 8, pp. 3633-3645. Article in journal (Refereed)
    Abstract [en]

    Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.

  • 40.
    Khan, Fahad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Xu, Jiaolong
    Comp Vis Centre Barcelona, Spain.
    van de Weijer, Joost
    Comp Vis Centre Barcelona, Spain.
    Bagdanov, Andrew D.
    Comp Vis Centre Barcelona, Spain.
    Muhammad Anwer, Rao
    Aalto University, Finland.
    Lopez, Antonio M.
    Comp Vis Centre Barcelona, Spain.
    Recognizing Actions Through Action-Specific Person Detection. 2015. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 24, no. 11, pp. 4422-4432. Article in journal (Refereed)
    Abstract [en]

    Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). 
    For this task, our approach, without using any ground-truth person localization at test time, outperforms, on both data sets, state-of-the-art methods that do use person locations.

  • 41.
    Khan, Rahat
    et al.
    Université de Saint-Étienne, France.
    Van de Weijer, Joost
    Computer Vision Center, Barcelona, Spain.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska högskolan.
    Muselet, Damien
    Université de Saint-Étienne, France.
    Ducottet, Christophe
    Université de Saint-Étienne, France.
    Barat, Cecile
    Université de Saint-Étienne, France.
    Discriminative Color Descriptors. 2013. In: Computer Vision and Pattern Recognition (CVPR), 2013, IEEE Computer Society, 2013, pp. 2866-2873. Conference paper (Refereed)
    Abstract [en]

    Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop in the discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective of minimizing the drop in mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competitive performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.

  • 42.
    Kristan, Matej
    et al.
    University of Ljubljana, Slovenia.
    Leonardis, Ales
    University of Birmingham, England.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Pflugfelder, Roman
    Austrian Institute Technology, Austria.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute Technology, Austria.
    Gupta, Abhinav
    Carnegie Mellon University, PA 15213 USA.
    Petrosino, Alfredo
    Parthenope University of Naples, Italy.
    Memarmoghadam, Alireza
    University of Isfahan, Iran.
    Garcia-Martin, Alvaro
    University of Autonoma Madrid, Spain.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Vedaldi, Andrea
    University of Oxford, England.
    Robinson, Andreas
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Ma, Andy J.
    Hong Kong Baptist University, Peoples R China.
    Varfolomieiev, Anton
    Kyiv Polytech Institute, Ukraine.
    Alatan, Aydin
    Middle East Technical University, Çankaya, Turkey.
    Erdem, Aykut
    Hacettepe University, Turkey.
    Ghanem, Bernard
    KAUST, Saudi Arabia.
    Liu, Bin
    Moshanghua Technology Co, Peoples R China.
    Han, Bohyung
    POSTECH, South Korea.
    Martinez, Brais
    University of Nottingham, England.
    Chang, Chang-Ming
    University of Albany, GA USA.
    Xu, Changsheng
    Chinese Academic Science, Peoples R China.
    Sun, Chong
    Dalian University of Technology, Peoples R China.
    Kim, Daijin
    POSTECH, South Korea.
    Chen, Dapeng
    Xi An Jiao Tong University, Peoples R China.
    Du, Dawei
    University of Chinese Academic Science, Peoples R China.
    Mishra, Deepak
    Indian Institute Space Science and Technology, India.
    Yeung, Dit-Yan
    Hong Kong University of Science and Technology, Peoples R China.
    Gundogdu, Erhan
    Aselsan Research Centre, Turkey.
    Erdem, Erkut
    Hacettepe University, Turkey.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Porikli, Fatih
    ARC Centre Excellence Robot Vis, Australia; Australian National University, Australia; CSIRO, Australia.
    Zhao, Fei
    Chinese Academic Science, Peoples R China.
    Bunyak, Filiz
    University of Missouri, MO 65211 USA.
    Battistone, Francesco
    Parthenope University of Naples, Italy.
    Zhu, Gao
    Australian National University, Australia.
    Roffo, Giorgio
    University of Verona, Italy.
    Sai Subrahmanyam, Gorthi R. K.
    Indian Institute Space Science and Technology, India.
    Bastos, Guilherme
    University of Federal Itajuba, Brazil.
    Seetharaman, Guna
    US Navy, DC 20375 USA.
    Medeiros, Henry
    Marquette University, WI 53233 USA.
    Li, Hongdong
    ARC Centre Excellence Robot Vis, Australia; Australian National University, Australia.
    Qi, Honggang
    University of Chinese Academic Science, Peoples R China.
    Bischof, Horst
    Graz University of Technology, Austria.
    Possegger, Horst
    Graz University of Technology, Austria.
    Lu, Huchuan
    Dalian University of Technology, Peoples R China.
    Lee, Hyemin
    POSTECH, South Korea.
    Nam, Hyeonseob
    NAVER Corp, South Korea.
    Jin Chang, Hyung
    Imperial Coll London, England.
    Drummond, Isabela
    University of Federal Itajuba, Brazil.
    Valmadre, Jack
    University of Oxford, England.
    Jeong, Jae-chan
    ASRI, South Korea; Elect and Telecommun Research Institute, South Korea.
    Cho, Jae-il
    Elect and Telecommun Research Institute, South Korea.
    Lee, Jae-Yeong
    Elect and Telecommun Research Institute, South Korea.
    Zhu, Jianke
    Zhejiang University, Peoples R China.
    Feng, Jiayi
    Chinese Academic Science, Peoples R China.
    Gao, Jin
    Chinese Academic Science, Peoples R China.
    Young Choi, Jin
    ASRI, South Korea.
    Xiao, Jingjing
    University of Birmingham, England.
    Kim, Ji-Wan
    Elect and Telecommun Research Institute, South Korea.
    Jeong, Jiyeoup
    ASRI, South Korea; Elect and Telecommun Research Institute, South Korea.
    Henriques, Joao F.
    University of Oxford, England.
    Lang, Jochen
    University of Ottawa, Canada.
    Choi, Jongwon
    ASRI, South Korea.
    Martinez, Jose M.
    University of Autonoma Madrid, Spain.
    Xing, Junliang
    Chinese Academic Science, Peoples R China.
    Gao, Junyu
    Chinese Academic Science, Peoples R China.
    Palaniappan, Kannappan
    University of Missouri, MO 65211 USA.
    Lebeda, Karel
    University of Surrey, England.
    Gao, Ke
    University of Missouri, MO 65211 USA.
    Mikolajczyk, Krystian
    Imperial Coll London, England.
    Qin, Lei
    Chinese Academic Science, Peoples R China.
    Wang, Lijun
    Dalian University of Technology, Peoples R China.
    Wen, Longyin
    University of Albany, GA USA.
    Bertinetto, Luca
    University of Oxford, England.
    Kumar Rapuru, Madan
    Indian Institute Space Science and Technology, India.
    Poostchi, Mahdieh
    University of Missouri, MO 65211 USA.
    Maresca, Mario
    Parthenope University of Naples, Italy.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Mueller, Matthias
    KAUST, Saudi Arabia.
    Zhang, Mengdan
    Chinese Academic Science, Peoples R China.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Valstar, Michel
    University of Nottingham, England.
    Tang, Ming
    Chinese Academic Science, Peoples R China.
    Baek, Mooyeol
    POSTECH, South Korea.
    Haris Khan, Muhammad
    University of Nottingham, England.
    Wang, Naiyan
    Hong Kong University of Science and Technology, Peoples R China.
    Fan, Nana
    Harbin Institute Technology, Peoples R China.
    Al-Shakarji, Noor
    University of Missouri, MO 65211 USA.
    Miksik, Ondrej
    University of Oxford, England.
    Akin, Osman
    Hacettepe University, Turkey.
    Moallem, Payman
    University of Isfahan, Iran.
    Senna, Pedro
    University of Federal Itajuba, Brazil.
    Torr, Philip H. S.
    University of Oxford, England.
    Yuen, Pong C.
    Hong Kong Baptist University, Peoples R China.
    Huang, Qingming
    Harbin Institute Technology, Peoples R China; University of Chinese Academic Science, Peoples R China.
    Martin-Nieto, Rafael
    University of Autonoma Madrid, Spain.
    Pelapur, Rengarajan
    University of Missouri, MO 65211 USA.
    Bowden, Richard
    University of Surrey, England.
    Laganiere, Robert
    University of Ottawa, Canada.
    Stolkin, Rustam
    University of Birmingham, England.
    Walsh, Ryan
    Marquette University, WI 53233 USA.
    Krah, Sebastian B.
    Fraunhofer IOSB, Germany.
    Li, Shengkun
    Hong Kong University of Science and Technology, Peoples R China; University of Albany, GA USA.
    Zhang, Shengping
    Harbin Institute Technology, Peoples R China.
    Yao, Shizeng
    University of Missouri, MO 65211 USA.
    Hadfield, Simon
    University of Surrey, England.
    Melzi, Simone
    University of Verona, Italy.
    Lyu, Siwei
    University of Albany, GA USA.
    Li, Siyi
    Hong Kong University of Science and Technology, Peoples R China; University of Albany, GA USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Golodetz, Stuart
    University of Oxford, England.
    Kakanuru, Sumithra
    Indian Institute Space Science and Technology, India.
    Choi, Sunglok
    Elect and Telecommun Research Institute, South Korea.
    Hu, Tao
    University of Chinese Academic Science, Peoples R China.
    Mauthner, Thomas
    Graz University of Technology, Austria.
    Zhang, Tianzhu
    Chinese Academic Science, Peoples R China.
    Pridmore, Tony
    University of Nottingham, England.
    Santopietro, Vincenzo
    Parthenope University of Naples, Italy.
    Hu, Weiming
    Chinese Academic Science, Peoples R China.
    Li, Wenbo
    Lehigh University, PA 18015 USA.
    Huebner, Wolfgang
    Fraunhofer IOSB, Germany.
    Lan, Xiangyuan
    Hong Kong Baptist University, Peoples R China.
    Wang, Xiaomeng
    University of Nottingham, England.
    Li, Xin
    Harbin Institute Technology, Peoples R China.
    Li, Yang
    Zhejiang University, Peoples R China.
    Demiris, Yiannis
    Imperial Coll London, England.
    Wang, Yifan
    Dalian University of Technology, Peoples R China.
    Qi, Yuankai
    Harbin Institute Technology, Peoples R China.
    Yuan, Zejian
    Xi An Jiao Tong University, Peoples R China.
    Cai, Zexiong
    Hong Kong Baptist University, Peoples R China.
    Xu, Zhan
    Zhejiang University, Peoples R China.
    He, Zhenyu
    Harbin Institute Technology, Peoples R China.
    Chi, Zhizhen
    Dalian University of Technology, Peoples R China.
    The Visual Object Tracking VOT2016 Challenge Results. 2016. In: COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, SPRINGER INT PUBLISHING AG, 2016, Vol. 9914, pp. 777-823. Conference paper (Refereed)
    Abstract [en]

    The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, a large number of them published at major computer vision conferences and journals in recent years. The number of tested state-of-the-art trackers makes VOT2016 the largest and most challenging benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the Appendix. VOT2016 goes beyond its predecessors by (i) introducing a new semi-automatic ground truth bounding box annotation methodology and (ii) extending the evaluation system with the no-reset experiment.

  • 43.
    Kristan, Matej
    et al.
    Univ Ljubljana, Slovenia.
    Leonardis, Ales
    Univ Birmingham, England.
    Matas, Jiri
    Czech Tech Univ, Czech Republic.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Pflugfelder, Roman
    Austrian Inst Technol, Austria.
    Zajc, Luka Cehovin
    Univ Ljubljana, Slovenia.
    Vojir, Tomas
    Czech Tech Univ, Czech Republic.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Lukezic, Alan
    Univ Ljubljana, Slovenia.
    Eldesokey, Abdelrahman
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Fernandez, Gustavo
    Austrian Inst Technol, Austria.
    Garcia-Martin, Alvaro
    Univ Autonoma Madrid, Spain.
    Muhic, A.
    Univ Ljubljana, Slovenia.
    Petrosino, Alfredo
    Univ Parthenope Naples, Italy.
    Memarmoghadam, Alireza
    Univ Isfahan, Iran.
    Vedaldi, Andrea
    Univ Oxford, England.
    Manzanera, Antoine
    Univ Paris Saclay, France.
    Tran, Antoine
    Univ Paris Saclay, France.
    Alatan, Aydin
    Middle East Tech Univ, Turkey.
    Mocanu, Bogdan
    Univ Politehn Bucuresti, Romania.
    Chen, Boyu
    Dalian Univ Technol, Peoples R China.
    Huang, Chang
    Horizon Robot Inc, Peoples R China.
    Xu, Changsheng
    Chinese Acad Sci, Peoples R China.
    Sun, Chong
    Dalian Univ Technol, Peoples R China.
    Du, Dalong
    Horizon Robot Inc, Peoples R China; Univ Chinese Acad Sci, Peoples R China.
    Zhang, David
    Hong Kong Polytech Univ, Peoples R China.
    Du, Dawei
    Horizon Robot Inc, Peoples R China; Univ Chinese Acad Sci, Peoples R China.
    Mishra, Deepak
    Indian Inst Space Sci and Technol Trivandrum, India.
    Gundogdu, Erhan
    Aselsan Res Ctr, Turkey; Middle East Tech Univ, Turkey.
    Velasco-Salido, Erik
    Univ Autonoma Madrid, Spain.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Battistone, Francesco
    Univ Parthenope Naples, Italy.
    Subrahmanyam, Gorthi R. K. Sai
    Indian Inst Space Sci and Technol Trivandrum, India.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Huang, Guan
    Horizon Robot Inc, Peoples R China.
    Bastos, Guilherme
    Univ Fed Itajuba, Brazil.
    Seetharaman, Guna
    Naval Res Lab, DC 20375 USA.
    Zhang, Hongliang
    Natl Univ Def Technol, Peoples R China.
    Li, Houqiang
    Univ Sci and Technol China, Peoples R China.
    Lu, Huchuan
    Dalian Univ Technol, Peoples R China.
    Drummond, Isabela
    Univ Fed Itajuba, Brazil.
    Valmadre, Jack
    Univ Oxford, England.
    Jeong, Jae-Chan
    ETRI, South Korea.
    Cho, Jae-Il
    ETRI, South Korea.
    Lee, Jae-Yeong
    ETRI, South Korea.
    Noskova, Jana
    Czech Tech Univ, Czech Republic.
    Zhu, Jianke
    Zhejiang Univ, Peoples R China.
    Gao, Jin
    Chinese Acad Sci, Peoples R China.
    Liu, Jingyu
    Chinese Acad Sci, Peoples R China.
    Kim, Ji-Wan
    ETRI, South Korea.
    Henriques, Joao F.
    Univ Oxford, England.
    Martinez, Jose M.
    Univ Autonoma Madrid, Spain.
    Zhuang, Junfei
    Beijing Univ Posts and Telecommun, Peoples R China.
    Xing, Junliang
    Chinese Acad Sci, Peoples R China.
    Gao, Junyu
    Chinese Acad Sci, Peoples R China.
    Chen, Kai
    Huazhong Univ Sci and Technol, Peoples R China.
    Palaniappan, Kannappan
    Univ Missouri Columbia, MO USA.
    Lebeda, Karel
    The Foundry, England.
    Gao, Ke
    Univ Missouri Columbia, MO USA.
    Kitani, Kris M.
    Carnegie Mellon Univ, PA 15213 USA.
    Zhang, Lei
    Hong Kong Polytech Univ, Peoples R China.
    Wang, Lijun
    Dalian Univ Technol, Peoples R China.
    Yang, Lingxiao
    Hong Kong Polytech Univ, Peoples R China.
    Wen, Longyin
    GE Global Res, NY USA.
    Bertinetto, Luca
    Univ Oxford, England.
    Poostchi, Mahdieh
    Univ Missouri Columbia, MO USA.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Mueller, Matthias
    KAUST, Saudi Arabia.
    Zhang, Mengdan
    Chinese Acad Sci, Peoples R China.
    Yang, Ming-Hsuan
    Univ Calif Merced, CA USA.
    Xie, Nianhao
    Natl Univ Def Technol, Peoples R China.
    Wang, Ning
    Univ Sci and Technol China, Peoples R China.
    Miksik, Ondrej
    Univ Oxford, England.
    Moallem, P.
    Univ Isfahan, Iran.
    Venugopal, Pallavi M.
    Indian Inst Space Sci and Technol Trivandrum, India.
    Senna, Pedro
    Univ Fed Itajuba, Brazil.
    Torr, Philip H. S.
    Univ Oxford, England.
    Wang, Qiang
    Chinese Acad Sci, Peoples R China.
    Yu, Qifeng
    Natl Univ Def Technol, Peoples R China.
    Huang, Qingming
    Univ Chinese Acad Sci, Peoples R China.
    Martin-Nieto, Rafael
    Univ Autonoma Madrid, Spain.
    Bowden, Richard
    Univ Surrey, England.
    Liu, Risheng
    Dalian Univ Technol, Peoples R China.
    Tapu, Ruxandra
    Univ Politehn Bucuresti, Romania.
    Hadfield, Simon
    Univ Surrey, England.
    Lyu, Siwei
    SUNY Albany, NY 12222 USA.
    Golodetz, Stuart
    Univ Oxford, England.
    Choi, Sunglok
    ETRI, South Korea.
    Zhang, Tianzhu
    Chinese Acad Sci, Peoples R China.
    Zaharia, Titus
    Inst. Mines-Telecom/ TelecomSudParis, France.
    Santopietro, Vincenzo
    Univ Parthenope Naples, Italy.
    Zou, Wei
    Chinese Acad Sci, Peoples R China.
    Hu, Weiming
    Chinese Acad Sci, Peoples R China.
    Tao, Wenbing
    Huazhong Univ Sci and Technol, Peoples R China.
    Li, Wenbo
    SUNY Albany, NY 12222 USA.
    Zhou, Wengang
    Univ Sci and Technol China, Peoples R China.
    Yu, Xianguo
    Natl Univ Def Technol, Peoples R China.
    Bian, Xiao
    GE Global Res, NY USA.
    Li, Yang
    Zhejiang Univ, Peoples R China.
    Xing, Yifan
    Carnegie Mellon Univ, PA 15213 USA.
    Fan, Yingruo
    Beijing Univ Posts and Telecommun, Peoples R China.
    Zhu, Zheng
    Chinese Acad Sci, Peoples R China; Univ Chinese Acad Sci, Peoples R China.
    Zhang, Zhipeng
    Chinese Acad Sci, Peoples R China.
    He, Zhiqun
    Beijing Univ Posts and Telecommun, Peoples R China.
    The Visual Object Tracking VOT2017 challenge results. 2017. In: 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), IEEE, 2017, pp. 1949-1972. Conference paper (Refereed)
    Abstract [en]

    The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a real-time tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.

  • 44.
    Kristan, Matej
    et al.
    University of Ljubljana, Slovenia.
    Leonardis, Aleš
    University of Birmingham, United Kingdom.
    Matas, Jiří
    Czech Technical University, Czech Republic.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Pflugfelder, Roman
    Austrian Institute of Technology, Austria / TU Wien, Austria.
    Zajc, Luka Cehovin
    University of Ljubljana, Slovenia.
    Vojíř, Tomáš
    Czech Technical University, Czech Republic.
    Bhat, Goutam
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Lukežič, Alan
    University of Ljubljana, Slovenia.
    Eldesokey, Abdelrahman
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Fernández, Gustavo
    García-Martín, Álvaro
    Iglesias-Arias, Álvaro
    Alatan, A. Aydin
    González-García, Abel
    Petrosino, Alfredo
    Memarmoghadam, Alireza
    Vedaldi, Andrea
    Muhič, Andrej
    He, Anfeng
    Smeulders, Arnold
    Perera, Asanka G.
    Li, Bo
    Chen, Boyu
    Kim, Changick
    Xu, Changsheng
    Xiong, Changzhen
    Tian, Cheng
    Luo, Chong
    Sun, Chong
    Hao, Cong
    Kim, Daijin
    Mishra, Deepak
    Chen, Deming
    Wang, Dong
    Wee, Dongyoon
    Gavves, Efstratios
    Gundogdu, Erhan
    Velasco-Salido, Erik
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Yang, Fan
    Zhao, Fei
    Li, Feng
    Battistone, Francesco
    De Ath, George
    Subrahmanyam, Gorthi R. K. S.
    Bastos, Guilherme
    Ling, Haibin
    Galoogahi, Hamed Kiani
    Lee, Hankyeol
    Li, Haojie
    Zhao, Haojie
    Fan, Heng
    Zhang, Honggang
    Possegger, Horst
    Li, Houqiang
    Lu, Huchuan
    Zhi, Hui
    Li, Huiyun
    Lee, Hyemin
    Chang, Hyung Jin
    Drummond, Isabela
    Valmadre, Jack
    Martin, Jaime Spencer
    Chahl, Javaan
    Choi, Jin Young
    Li, Jing
    Wang, Jinqiao
    Qi, Jinqing
    Sung, Jinyoung
    Johnander, Joakim
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Henriques, Joao
    Choi, Jongwon
    van de Weijer, Joost
    Herranz, Jorge Rodríguez
    Martínez, José M.
    Kittler, Josef
    Zhuang, Junfei
    Gao, Junyu
    Grm, Klemen
    Zhang, Lichao
    Wang, Lijun
    Yang, Lingxiao
    Rout, Litu
    Si, Liu
    Bertinetto, Luca
    Chu, Lutao
    Che, Manqiang
    Maresca, Mario Edoardo
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Yang, Ming-Hsuan
    Abdelpakey, Mohamed
    Shehata, Mohamed
    Kang, Myunggu
    Lee, Namhoon
    Wang, Ning
    Miksik, Ondrej
    Moallem, P.
    Vicente-Moñivar, Pablo
    Senna, Pedro
    Li, Peixia
    Torr, Philip
    Raju, Priya Mariam
    Ruihe, Qian
    Wang, Qiang
    Zhou, Qin
    Guo, Qing
    Martín-Nieto, Rafael
    Gorthi, Rama Krishna
    Tao, Ran
    Bowden, Richard
    Everson, Richard
    Wang, Runling
    Yun, Sangdoo
    Choi, Seokeon
    Vivas, Sergio
    Bai, Shuai
    Huang, Shuangping
    Wu, Sihang
    Hadfield, Simon
    Wang, Siwen
    Golodetz, Stuart
    Ming, Tang
    Xu, Tianyang
    Zhang, Tianzhu
    Fischer, Tobias
    Santopietro, Vincenzo
    Štruc, Vitomir
    Wei, Wang
    Zuo, Wangmeng
    Feng, Wei
    Wu, Wei
    Zou, Wei
    Hu, Weiming
    Zhou, Wengang
    Zeng, Wenjun
    Zhang, Xiaofan
    Wu, Xiaohe
    Wu, Xiao-Jun
    Tian, Xinmei
    Li, Yan
    Lu, Yan
    Law, Yee Wei
    Wu, Yi
    Demiris, Yiannis
    Yang, Yicai
    Jiao, Yifan
    Li, Yuhong
    Zhang, Yunhua
    Sun, Yuxuan
    Zhang, Zheng
    Zhu, Zheng
    Feng, Zhen-Hua
    Wang, Zhihui
    He, Zhiqun
    The Sixth Visual Object Tracking VOT2018 Challenge Results. 2019. In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8–14, 2018 Proceedings, Part I / [ed] Laura Leal-Taixé and Stefan Roth, Cham: Springer Publishing Company, 2019, pp. 3-53. Conference paper (Refereed)
    Abstract [en]

    The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis and a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking subchallenge has been introduced to the set of standard VOT subchallenges. The new subchallenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking subchallenges. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).

  • 45.
    Kristan, Matej
    et al.
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, England.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Centrum för medicinsk bildvetenskap och visualisering, CMIV.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute Technology, Austria.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Nebehay, Georg
    Austrian Institute Technology, Austria.
    Pflugfelder, Roman
    Austrian Institute Technology, Austria.
    Gupta, Abhinav
    Carnegie Mellon University, PA 15213 USA.
    Bibi, Adel
    King Abdullah University of Science and Technology, Saudi Arabia.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Garcia-Martin, Alvaro
    University of Autonoma Madrid, Spain.
    Saffari, Amir
    Affectv, England.
    Petrosino, Alfredo
    Parthenope University of Naples, Italy.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Varfolomieiev, Anton
    National Technical University of Ukraine, Ukraine.
    Baskurt, Atilla
    University of Lyon, France.
    Zhao, Baojun
    Beijing Institute Technology, Peoples R China.
    Ghanem, Bernard
    King Abdullah University of Science and Technology, Saudi Arabia.
    Martinez, Brais
    University of Nottingham, England.
    Lee, ByeongJu
    Seoul National University, South Korea.
    Han, Bohyung
    POSTECH, South Korea.
    Wang, Chaohui
    University of Paris Est, France.
    Garcia, Christophe
    LIRIS, France.
    Zhang, Chunyuan
    National University of Def Technology, Peoples R China; National Key Lab Parallel and Distributed Proc, Peoples R China.
    Schmid, Cordelia
    INRIA Grenoble Rhone Alpes, France.
    Tao, Dacheng
    University of Technology Sydney, Australia.
    Kim, Daijin
    POSTECH, South Korea.
    Huang, Dafei
    National University of Def Technology, Peoples R China; National Key Lab Parallel and Distributed Proc, Peoples R China.
    Prokhorov, Danil
    Toyota Research Institute, Japan.
    Du, Dawei
    SUNY Albany, NY USA; Chinese Academic Science, Peoples R China.
    Yeung, Dit-Yan
    Hong Kong University of Science and Technology, Peoples R China.
    Ribeiro, Eraldo
    Florida Institute Technology, FL USA.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Porikli, Fatih
    Australian National University, Australia; NICTA, Australia.
    Bunyak, Filiz
    University of Missouri, MO 65211 USA.
    Zhu, Gao
    Australian National University, Australia.
    Seetharaman, Guna
    Naval Research Lab, DC 20375 USA.
    Kieritz, Hilke
    Fraunhofer IOSB, Germany.
    Tuen Yau, Hing
    Chinese University of Hong Kong, Peoples R China.
    Li, Hongdong
    Australian National University, Australia; ARC Centre Excellence Robot Vis, Australia.
    Qi, Honggang
    SUNY Albany, NY USA; Chinese Academic Science, Peoples R China.
    Bischof, Horst
    Graz University of Technology, Austria.
    Possegger, Horst
    Graz University of Technology, Austria.
    Lee, Hyemin
    POSTECH, South Korea.
    Nam, Hyeonseob
    POSTECH, South Korea.
    Bogun, Ivan
    Florida Institute Technology, FL USA.
    Jeong, Jae-chan
    Elect and Telecommun Research Institute, South Korea.
    Cho, Jae-il
    Elect and Telecommun Research Institute, South Korea.
    Lee, Jae-Young
    Elect and Telecommun Research Institute, South Korea.
    Zhu, Jianke
    Zhejiang University, Peoples R China.
    Shi, Jianping
    CUHK, Peoples R China.
    Li, Jiatong
    Beijing Institute Technology, Peoples R China; University of Technology Sydney, Australia.
    Jia, Jiaya
    CUHK, Peoples R China.
    Feng, Jiayi
    Chinese Academic Science, Peoples R China.
    Gao, Jin
    Chinese Academic Science, Peoples R China.
    Young Choi, Jin
    Seoul National University, South Korea.
    Kim, Ji-Wan
    Elect and Telecommun Research Institute, South Korea.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    University of Autonoma Madrid, Spain.
    Choi, Jongwon
    Seoul National University, South Korea.
    Xing, Junliang
    Chinese Academic Science, Peoples R China.
    Xue, Kai
    Harbin Engn University, Peoples R China.
    Palaniappan, Kannappan
    University of Missouri, MO 65211 USA.
    Lebeda, Karel
    University of Surrey, England.
    Alahari, Karteek
    INRIA Grenoble Rhone Alpes, France.
    Gao, Ke
    University of Missouri, MO 65211 USA.
    Yun, Kimin
    Seoul National University, South Korea.
    Hong Wong, Kin
    Chinese University of Hong Kong, Peoples R China.
    Luo, Lei
    National University of Def Technology, Peoples R China.
    Ma, Liang
    Harbin Engn University, Peoples R China.
    Ke, Lipeng
    SUNY Albany, NY USA; Chinese Academic Science, Peoples R China.
    Wen, Longyin
    SUNY Albany, NY USA.
    Bertinetto, Luca
    University of Oxford, England.
    Poostchi, Mahdieh
    University of Missouri, MO 65211 USA.
    Maresca, Mario
    Parthenope University of Naples, Italy.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Wen, Mei
    National University of Def Technology, Peoples R China; National Key Lab Parallel and Distributed Proc, Peoples R China.
    Zhang, Mengdan
    Chinese Academic Science, Peoples R China.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Valstar, Michel
    University of Nottingham, England.
    Tang, Ming
    Chinese Academic Science, Peoples R China.
    Chang, Ming-Ching
    SUNY Albany, NY USA.
    Haris Khan, Muhammad
    University of Nottingham, England.
    Fan, Nana
    Harbin Institute Technology, Peoples R China.
    Wang, Naiyan
    TuSimple LLC, CA USA; Hong Kong University of Science and Technology, Peoples R China.
    Miksik, Ondrej
    University of Oxford, England.
    Torr, Philip H. S.
    University of Oxford, England.
    Wang, Qiang
    Chinese Academic Science, Peoples R China.
    Martin-Nieto, Rafael
    University of Autonoma Madrid, Spain.
    Pelapur, Rengarajan
    University of Missouri, MO 65211 USA.
    Bowden, Richard
    University of Surrey, England.
    Laganiere, Robert
    University of Ottawa, Canada.
    Moujtahid, Salma
    University of Lyon, France.
    Hare, Sam
    Obvious Engn, England.
    Hadfield, Simon
    University of Surrey, England.
    Lyu, Siwei
    SUNY Albany, NY USA.
    Li, Siyi
    Hong Kong University of Science and Technology, Peoples R China.
    Zhu, Song-Chun
    University of California, USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Duffner, Stefan
    University of Lyon, France; LIRIS, France.
    Hicks, Stephen L.
    University of Oxford, England.
    Golodetz, Stuart
    University of Oxford, England.
    Choi, Sunglok
    Elect and Telecommun Research Institute, South Korea.
    Wu, Tianfu
    University of California, USA.
    Mauthner, Thomas
    Graz University of Technology, Austria.
    Pridmore, Tony
    University of Nottingham, England.
    Hu, Weiming
    Chinese Academic Science, Peoples R China.
    Hubner, Wolfgang
    Fraunhofer IOSB, Germany.
    Wang, Xiaomeng
    University of Nottingham, England.
    Li, Xin
    Harbin Institute Technology, Peoples R China.
    Shi, Xinchu
    Chinese Academic Science, Peoples R China.
    Zhao, Xu
    Chinese Academic Science, Peoples R China.
    Mei, Xue
    Toyota Research Institute, Japan.
    Yao, Shizeng
    University of Missouri, USA.
    Hua, Yang
    INRIA Grenoble Rhône-Alpes, France.
    Li, Yang
    Zhejiang University, Peoples R China.
    Lu, Yang
    University of California, USA.
    Li, Yuezun
    SUNY Albany, NY USA.
    Chen, Zhaoyun
    National University of Def Technology, Peoples R China; National Key Lab Parallel and Distributed Proc, Peoples R China.
    Huang, Zehua
    Carnegie Mellon University, PA 15213 USA.
    Chen, Zhe
    University of Technology Sydney, Australia.
    Zhang, Zhe
    Baidu Corp, Peoples R China.
    He, Zhenyu
    Harbin Institute Technology, Peoples R China.
    Hong, Zhibin
    University of Technology Sydney, Australia.
    The Visual Object Tracking VOT2015 challenge results. 2015. In: Proceedings 2015 IEEE International Conference on Computer Vision Workshops ICCVW 2015, IEEE, 2015, pp. 564-586. Conference paper (Refereed)
    Abstract [en]

    The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2015 challenge that go beyond its VOT2014 predecessor are: (i) a new VOT2015 dataset twice as large as in VOT2014 with full annotation of targets by rotated bounding boxes and per-frame attributes, (ii) extensions of the VOT2014 evaluation methodology by introduction of a new performance measure. The dataset, the evaluation kit as well as the results are publicly available at the challenge website.

  • 46.
    Kristan, Matej
    et al.
    University of Ljubljana, Ljubljana, Slovenia.
    Pflugfelder, Roman P.
    Austrian Institute of Technology, Vienna, Austria.
    Leonardis, Ales
    University of Birmingham, Birmingham, UK.
    Matas, Jiri
    Czech Technical University, Prague, Czech Republic.
    Cehovin, Luka
    University of Ljubljana, Ljubljana, Slovenia.
    Nebehay, Georg
    Austrian Institute of Technology, Vienna, Austria.
    Vojir, Tomas
    Czech Technical University, Prague, Czech Republic.
    Fernandez, Gustavo
    Austrian Institute of Technology, Vienna, Austria.
    Lukezi, Alan
    University of Ljubljana, Ljubljana, Slovenia.
    Dimitriev, Aleksandar
    University of Ljubljana, Ljubljana, Slovenia.
    Petrosino, Alfredo
    Parthenope University of Naples, Naples, Italy.
    Saffari, Amir
    Affectv Limited, London, UK.
    Li, Bo
    Panasonic R&D Center, Singapore, Singapore.
    Han, Bohyung
    POSTECH, Pohang, Korea.
    Heng, CherKeng
    Panasonic R&D Center, Singapore, Singapore.
    Garcia, Christophe
    LIRIS, Lyon, France.
    Pangersic, Dominik
    University of Ljubljana, Ljubljana, Slovenia.
    Häger, Gustav
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad Shahbaz
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Oven, Franci
    University of Ljubljana, Ljubljana, Slovenia.
    Possegger, Horst
    Graz University of Technology, Graz, Austria.
    Bischof, Horst
    Graz University of Technology, Graz, Austria.
    Nam, Hyeonseob
    POSTECH, Pohang, Korea.
    Zhu, Jianke
    Zhejiang University, Hangzhou, China.
    Li, JiJia
    Shanghai Jiao Tong University, Shanghai, China.
    Choi, Jin Young
    ASRI Seoul National University, Gwanak, Korea.
    Choi, Jin-Woo
    Electronics and Telecommunications Research Institute, Daejeon, Korea.
    Henriques, Joao F.
    University of Coimbra, Coimbra, Portugal.
    van de Weijer, Joost
    Universitat Autonoma de Barcelona, Barcelona, Spain.
    Batista, Jorge
    University of Coimbra, Coimbra, Portugal.
    Lebeda, Karel
    University of Surrey, Surrey, UK.
    Ofjall, Kristoffer
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Yi, Kwang Moo
    EPFL CVLab, Lausanne, Switzerland.
    Qin, Lei
    ICT CAS, Beijing, China.
    Wen, Longyin
    Chinese Academy of Sciences, Beijing, China.
    Maresca, Mario Edoardo
    Parthenope University of Naples, Naples, Italy.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Felsberg, Michael
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Cheng, Ming-Ming
    University of Oxford, Oxford, UK.
    Torr, Philip
    University of Oxford, Oxford, UK.
    Huang, Qingming
    Harbin Institute of Technology, Harbin, China.
    Bowden, Richard
    University of Surrey, Surrey, UK.
    Hare, Sam
    Obvious Engineering Limited, London, UK.
    YueYing Lim, Samantha
    Panasonic R&D Center, Singapore, Singapore.
    Hong, Seunghoon
    POSTECH, Pohang, Korea.
    Liao, Shengcai
    Chinese Academy of Sciences, Beijing, China.
    Hadfield, Simon
    University of Surrey, Surrey, UK.
    Li, Stan Z.
    Chinese Academy of Sciences, Beijing, China.
    Duffner, Stefan
    LIRIS, Lyon, France.
    Golodetz, Stuart
    University of Oxford, Oxford, UK.
    Mauthner, Thomas
    Graz University of Technology, Graz, Austria.
    Vineet, Vibhav
    University of Oxford, Oxford, UK.
    Lin, Weiyao
    Shanghai Jiao Tong University, Shanghai, China.
    Li, Yang
    Zhejiang University, Hangzhou, China.
    Qi, Yuankai
    Harbin Institute of Technology, Harbin, China.
    Lei, Zhen
    Chinese Academy of Sciences, Beijing, China.
    Niu, ZhiHeng
    Panasonic R&D Center, Singapore, Singapore.
    The Visual Object Tracking VOT2014 Challenge Results. 2015. In: COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II, Springer, 2015, Vol. 8926, pp. 191-217. Conference paper (Refereed)
    Abstract [en]

    The Visual Object Tracking challenge 2014, VOT2014, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 38 trackers are presented. The number of tested trackers makes VOT2014 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2014 challenge that go beyond its VOT2013 predecessor are introduced: (i) a new VOT2014 dataset with full annotation of targets by rotated bounding boxes and per-frame attributes, (ii) extensions of the VOT2013 evaluation methodology, (iii) a new unit for tracking speed assessment less dependent on the hardware and (iv) the VOT2014 evaluation toolkit that significantly speeds up execution of experiments. The dataset, the evaluation kit as well as the results are publicly available at the challenge website (http://votchallenge.net).

  • 47.
    Muhammad Anwer, Rao
    et al.
    Aalto University, Finland.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    van de Weijer, Joost
    University of Autonoma Barcelona, Spain.
    Laaksonen, Jorma
    Aalto University, Finland.
    Combining Holistic and Part-based Deep Representations for Computational Painting Categorization. 2016. In: ICMR16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ASSOC COMPUTING MACHINERY, 2016, pp. 339-342. Conference paper (Refereed)
    Abstract [en]

    Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization. We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The regions of interest are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both the holistic image and the regions of interest are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and then concatenated into a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results, outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% on artist and style classification, respectively.

  • 48.
    van de Weijer, Joost
    et al.
    Comp Vis Centre Barcelona, Spain.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    An Overview of Color Name Applications in Computer Vision. 2015. In: COMPUTATIONAL COLOR IMAGING, CCIW 2015, Springer Verlag (Germany), 2015, Vol. 9016, pp. 16-22. Conference paper (Refereed)
    Abstract [en]

    In this article we provide an overview of color name applications in computer vision. Color names are linguistic labels which humans use to communicate color. Computational color naming learns a mapping from pixel values to color names. In recent years color names have been applied to a wide variety of computer vision applications, including image classification, object recognition, texture classification, visual tracking and action recognition. Here we provide an overview of these results, which show that in general color names outperform photometric invariants as a color representation.

  • 49.
    Yu, Lu
    et al.
    Northwestern Polytech Univ, Peoples R China; Univ Autonoma Barcelona, Spain.
    Zhang, Lichao
    Univ Autonoma Barcelona, Spain.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Cheng, Yongmei
    Northwestern Polytech Univ, Peoples R China.
    Alejandro Parraga, C.
    Univ Autonoma Barcelona, Spain.
    Beyond Eleven Color Names for Image Understanding. 2018. In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 29, no. 2, pp. 361-373. Journal article (Refereed)
    Abstract [en]

    Color description is one of the fundamental problems of image understanding. One of the popular ways to represent colors is by means of color names. Most existing work on color names focuses on only the eleven basic color terms of the English language. This could be limiting the discriminative power of these representations, and representations based on more color names are expected to perform better. However, there exists no clear strategy to choose additional color names. We collect a dataset of 28 additional color names. To ensure that the resulting color representation has high discriminative power, we propose a method to order the additional color names according to their complementary nature with the basic color names. This allows us to compute color name representations of arbitrary length with high discriminative power. In the experiments we show that these new color name descriptors outperform the existing color name descriptor on the tasks of visual tracking, person re-identification and image classification.

  • 50.
    Zhang, Lichao
    et al.
    Univ Autonoma Barcelona, Spain.
    Gonzalez-Garcia, Abel
    Univ Autonoma Barcelona, Spain.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Danelljan, Martin
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten.
    Khan, Fahad
    Linköpings universitet, Institutionen för systemteknik, Datorseende. Linköpings universitet, Tekniska fakulteten. Incept Inst Artificial Intelligence, U Arab Emirates.
    Synthetic Data Generation for End-to-End Thermal Infrared Tracking. 2019. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 28, no. 4, pp. 1837-1850. Journal article (Refereed)
    Abstract [en]

    The usage of both off-the-shelf and end-to-end trained deep networks has significantly improved the performance of visual tracking on RGB videos. However, the lack of large labeled datasets hampers the usage of convolutional neural networks for tracking in thermal infrared (TIR) images. Therefore, most state-of-the-art methods on tracking for TIR data are still based on handcrafted features. To address this problem, we propose to use image-to-image translation models. These models allow us to translate the abundantly available labeled RGB data to synthetic TIR data. We explore the usage of both paired and unpaired image translation models for this purpose. These methods provide us with a large labeled dataset of synthetic TIR sequences, on which we can train end-to-end optimal features for tracking. To the best of our knowledge, we are the first to train end-to-end features for TIR tracking. We perform extensive experiments on the VOT-TIR2017 dataset. We show that a network trained on a large dataset of synthetic TIR data obtains better performance than one trained on the available real TIR data. Combining both data sources leads to further improvement. In addition, when we combine the network with motion features, we outperform the state of the art with a relative gain of over 10%, clearly showing the efficiency of using synthetic data to train end-to-end TIR trackers.
