liu.se: Search for publications in DiVA
Results 1-50 of 327
  • 1.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. FOI, SE-58111 Linköping, Sweden.
    Optimizing Object, Atmosphere, and Sensor Parameters in Thermal Hyperspectral Imagery. 2017. In: IEEE Transactions on Geoscience and Remote Sensing, ISSN 0196-2892, E-ISSN 1558-0644, Vol. 55, no 2, p. 658-670. Article in journal (Refereed).
    Abstract [en]

    We address the problem of estimating atmosphere parameters (temperature and water vapor content) from data captured by an airborne thermal hyperspectral imager and propose a method based on linear and nonlinear optimization. The method is used for the estimation of the parameters (temperature and emissivity) of the observed object as well as sensor gain under certain restrictions. The method is analyzed with respect to sensitivity to noise and the number of spectral bands. Simulations with synthetic signatures are performed to validate the analysis, showing that the estimation can be performed with as few as 10-20 spectral bands at moderate noise levels. The proposed method is also extended to exploit additional knowledge, for example, measurements of atmospheric parameters and sensor noise. Additionally, we show how to extend the method in order to improve spectral calibration.
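
    The core inverse problem (recovering object temperature and emissivity from band radiances) can be illustrated with a small nonlinear least-squares sketch. This is not the paper's algorithm: atmosphere and sensor gain are ignored, and all values below are hypothetical.

        import numpy as np
        from scipy.optimize import least_squares

        # Illustrative inverse problem (not the paper's algorithm): fit object
        # temperature and emissivity to noisy band radiances via Planck's law,
        # ignoring atmosphere and sensor gain. All values are hypothetical.
        H, C, KB = 6.626e-34, 2.998e8, 1.381e-23   # Planck/Boltzmann constants (SI)

        def planck(wl_m, temp_k):
            """Blackbody spectral radiance at wavelength wl_m [m], temp_k [K]."""
            return (2 * H * C**2 / wl_m**5) / np.expm1(H * C / (wl_m * KB * temp_k))

        rng = np.random.default_rng(0)
        wl = np.linspace(8e-6, 12e-6, 15)          # 15 bands across 8-12 um
        true_temp, true_eps = 300.0, 0.95
        radiance = true_eps * planck(wl, true_temp)
        radiance += rng.normal(0.0, 0.01 * radiance.max(), wl.size)  # sensor noise

        def residuals(params):
            temp_k, eps = params
            return eps * planck(wl, temp_k) - radiance

        fit = least_squares(residuals, x0=[280.0, 0.9],
                            bounds=([200.0, 0.0], [400.0, 1.0]))
        print("estimated T = %.1f K, emissivity = %.3f" % tuple(fit.x))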

  • 2.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arsic, Dejan
    Munich University of Technology, Germany.
    Ganchev, Todor
    University of Patras, Greece.
    Linderhed, Anna
    FOI Swedish Defence Research Agency.
    Menezes, Paolo
    University of Coimbra, Portugal.
    Ntalampiras, Stavros
    University of Patras, Greece.
    Olma, Tadeusz
    MARAC S.A., Greece.
    Potamitis, Ilyas
    Technological Educational Institute of Crete, Greece.
    Ros, Julien
    Probayes SAS, France.
    Prometheus: Prediction and interpretation of human behaviour based on probabilistic structures and heterogeneous sensors. 2008. Conference paper (Refereed).
    Abstract [en]

    The on-going EU funded project Prometheus (FP7-214901) aims at establishing a general framework which links fundamental sensing tasks to automated cognition processes enabling interpretation and short-term prediction of individual and collective human behaviours in unrestricted environments as well as complex human interactions. To achieve the aforementioned goals, the Prometheus consortium works on the following core scientific and technological objectives:

    1. sensor modeling and information fusion from multiple, heterogeneous perceptual modalities;

    2. modeling, localization, and tracking of multiple people;

    3. modeling, recognition, and short-term prediction of continuous complex human behavior.

  • 3.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Evaluating Template Rescaling in Short-Term Single-Object Tracking. 2015. Conference paper (Refereed).
    Abstract [en]

    In recent years, short-term single-object tracking has emerged as a popular research topic, as it constitutes the core of more general tracking systems. Many such tracking methods are based on matching a part of the image with a template that is learnt online and represented by, for example, a correlation filter or a distribution field. In order for such a tracker to be able to find not only the position, but also the scale, of the tracked object in the next frame, some kind of scale estimation step is needed. This step is sometimes separate from the position estimation step, but is nevertheless jointly evaluated in de facto benchmarks. However, for practical as well as scientific reasons, the scale estimation step should be evaluated separately; for example, there might in certain situations be other methods more suitable for the task. In this paper, we describe an evaluation method for scale estimation in template-based short-term single-object tracking, and evaluate two state-of-the-art tracking methods where estimation of scale and position are separable.

  • 4.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Glana Sensors AB, Sweden.
    Renhorn, Ingmar
    Glana Sensors AB, Sweden.
    Chevalier, Tomas
    Scienvisic AB, Sweden.
    Rydell, Joakim
    FOI, Swedish Defence Research Agency, Sweden.
    Bergström, David
    FOI, Swedish Defence Research Agency, Sweden.
    Three-dimensional hyperspectral imaging technique. 2017. In: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XXIII / [ed] Miguel Velez-Reyes; David W. Messinger, SPIE - International Society for Optical Engineering, 2017, Vol. 10198, article id 1019805. Conference paper (Refereed).
    Abstract [en]

    Hyperspectral remote sensing based on unmanned airborne vehicles is a field increasing in importance. The combined functionality of simultaneous hyperspectral and geometric modeling is less developed. A configuration has been developed that enables the reconstruction of the hyperspectral three-dimensional (3D) environment. The hyperspectral camera is based on a linear variable filter and a high frame rate, high resolution camera enabling point-to-point matching and 3D reconstruction. This allows the information to be combined into a single and complete 3D hyperspectral model. In this paper, we describe the camera and illustrate capabilities and difficulties through real-world experiments.

  • 5.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Åström, Anders
    Swedish National Forensic Centre (NFC), Linköping, Sweden.
    Forchheimer, Robert
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Simultaneous sensing, readout, and classification on an intensity-ranking image sensor. 2018. In: International Journal of Circuit Theory and Applications, ISSN 0098-9886, E-ISSN 1097-007X, Vol. 46, no 9, p. 1606-1619. Article in journal (Refereed).
    Abstract [en]

    We combine the near-sensor image processing concept with address-event representation leading to an intensity-ranking image sensor (IRIS) and show the benefits of using this type of sensor for image classification. The functionality of IRIS is to output pixel coordinates (X and Y values) continuously as each pixel has collected a certain number of photons. Thus, the pixel outputs will be automatically intensity ranked. By keeping track of the timing of these events, it is possible to record the full dynamic range of the image. However, in many cases, this is not necessary; the intensity ranking in itself gives the needed information for the task at hand. This paper describes techniques for classification and proposes a particular variant (groves) that fits the IRIS architecture well as it can work on the intensity rankings only. Simulation results using the CIFAR-10 dataset compare the results of the proposed method with the more conventional ferns technique. It is concluded that the simultaneous sensing and classification obtainable with the IRIS sensor yields both fast (shorter than full exposure time) and processing-efficient classification.
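
    The address-event behaviour described above can be mimicked in a few lines: if each pixel fires once it has collected a fixed number of photons, its firing time is inversely proportional to intensity, so sorting firing times yields the intensity ranking. A toy simulation of this assumed behaviour, with all numbers hypothetical:

        import numpy as np

        # Toy simulation of the assumed IRIS behaviour: a pixel emits its
        # (x, y) address once it has collected `threshold` photons, so firing
        # time is inversely proportional to intensity and the event stream
        # arrives intensity-ranked.
        rng = np.random.default_rng(0)
        image = rng.uniform(0.05, 1.0, size=(4, 4))    # irradiance map

        threshold = 100.0                              # photons to trigger
        fire_time = threshold / image                  # time-to-threshold

        order = np.argsort(fire_time, axis=None)       # event order = intensity rank
        ys, xs = np.unravel_index(order, image.shape)
        for rank, (x, y, t) in enumerate(zip(xs, ys, fire_time[ys, xs])):
            print(f"event {rank}: pixel ({x}, {y}) at t = {t:.1f}")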

  • 6.
    Ahlman, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improved Temporal Resolution Using Parallel Imaging in Radial-Cartesian 3D Functional MRI. 2011. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    MRI (Magnetic Resonance Imaging) is a medical imaging method that uses magnetic fields in order to retrieve images of the human body. This thesis revolves around a novel acquisition method for 3D fMRI (functional Magnetic Resonance Imaging) called PRESTO-CAN, which uses a radial pattern to sample the (kx,kz)-plane of k-space (the frequency domain) and a Cartesian sample pattern in the ky-direction. The radial sample pattern allows for a denser sampling of the central parts of k-space, which contain the most basic frequency information about the structure of the recorded object. This allows higher temporal resolution to be achieved compared with other sampling methods, since fewer total samples are needed to retrieve enough information about how the object has changed over time. Since fMRI is mainly used for monitoring blood flow in the brain, increased temporal resolution means that fast changes in brain activity can be tracked more efficiently.

    The temporal resolution can be further improved by reducing the time needed for scanning, which in turn can be achieved by applying parallel imaging. One such parallel imaging method is SENSE (SENSitivity Encoding). The scan time is reduced by decreasing the sampling density, which causes aliasing in the recorded images. The SENSE method removes the aliasing by utilizing the extra information provided by the fact that multiple receiver coils with differing sensitivities are used during the acquisition. By measuring the sensitivities of the respective receiver coils and solving an equation system with the aliased images, it is possible to calculate how they would have looked without aliasing.

    In this master thesis, SENSE has been successfully implemented in PRESTO-CAN. By using normalized convolution to refine the sensitivity maps of the receiver coils, images of satisfactory quality could be reconstructed when reducing the k-space sample rate by a factor of 2, and images of relatively good quality also when the sample rate was reduced by a factor of 4. In this way, this thesis has contributed to the improvement of the temporal resolution of the PRESTO-CAN method.
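
    The per-pixel equation system mentioned above can be made concrete with a toy SENSE unfolding for reduction factor R = 2. The coil sensitivities and sizes below are made up, and a real reconstruction would add regularisation, noise decorrelation, and sampling patterns such as PRESTO-CAN's:

        import numpy as np

        # Toy SENSE unfolding for reduction factor R = 2 in 1D.
        n, ncoils, R = 8, 4, 2
        rng = np.random.default_rng(1)
        truth = rng.uniform(0, 1, n)                   # true 1D image profile
        sens = rng.uniform(0.2, 1.0, (ncoils, n))      # measured coil sensitivities

        # Undersampling by R folds pixel i together with pixel i + n//R.
        half = n // R
        aliased = np.stack([(sens[c] * truth)[:half] + (sens[c] * truth)[half:]
                            for c in range(ncoils)])

        recon = np.zeros(n)
        for i in range(half):                          # solve one small system per pixel
            A = sens[:, [i, i + half]]                 # (ncoils, R) sensitivities
            recon[[i, i + half]] = np.linalg.lstsq(A, aliased[:, i], rcond=None)[0]

        print("max reconstruction error:", np.abs(recon - truth).max())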

  • 7.
    Ahlman, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Magnusson, Maria
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Dahlqvist Leinhard, Olof
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Faculty of Health Sciences.
    Lundberg, Peter
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Department of Medical and Health Sciences, Radiology. Linköping University, Faculty of Health Sciences. Östergötlands Läns Landsting, Center for Surgery, Orthopaedics and Cancer Treatment, Department of Radiation Physics. Östergötlands Läns Landsting, Center for Diagnostics, Department of Radiology in Linköping.
    Increased temporal resolution in radial-Cartesian sampling of k-space by implementation of parallel imaging. 2011. Conference paper (Refereed).
  • 8.
    Andersson, Elin
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Thermal Impact of a Calibrated Stereo Camera Rig. 2016. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    Measurements from stereo reconstruction can be obtained with high accuracy using correctly calibrated cameras. A stereo camera rig mounted in an outdoor environment is exposed to temperature changes, which have an impact on the calibration of the cameras.

    The aim of the master thesis was to investigate the thermal impact on a calibrated stereo camera rig. This was done by placing a stereo rig in a temperature chamber and collecting data of a calibration board at different temperatures. Data was collected with two different cameras and lenses and used for calibration of the stereo camera rig in different scenarios. The obtained parameters were plotted and analyzed.

    The results of the master thesis show that thermal variation has an impact on the accuracy of the calibrated stereo camera rig. A calibration obtained at one temperature cannot be used at a different temperature without a degradation in accuracy. The plotted calibration parameters had a high noise level due to problems with the calibration methods, and no clear trend from temperature changes could be seen.

  • 9.
    Andersson, Maria
    et al.
    FOI Swedish Defence Research Agency.
    Rydell, Joakim
    FOI Swedish Defence Research Agency.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. FOI Swedish Defence Research Agency.
    Estimation of crowd behaviour using sensor networks and sensor fusion. 2009. Conference paper (Refereed).
    Abstract [en]

    Today, surveillance operators commonly monitor a large number of CCTV screens, trying to solve the complex cognitive tasks of analyzing crowd behavior and detecting threats and other abnormal behavior. Information overload is the rule rather than the exception. Moreover, CCTV footage lacks important indicators revealing certain threats, and can also in other respects be complemented by data from other sensors. This article presents an approach to automatically interpret sensor data and estimate behaviors of groups of people in order to provide the operator with relevant warnings. We use data from distributed heterogeneous sensors (visual cameras and a thermal infrared camera) and process the sensor data using detection algorithms. The extracted features are fed into a hidden Markov model in order to model normal behavior and detect deviations. We also discuss the use of radars for weapon detection.
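
    As a rough illustration of the last step (an HMM modelling normal behaviour, with deviations flagged by low likelihood), here is a sketch using the hmmlearn package; the package choice, features, and threshold are our assumptions, not the paper's:

        import numpy as np
        from hmmlearn import hmm

        # Sketch of HMM-based deviation detection (tooling, features and the
        # threshold are assumptions; the paper does not specify them).
        rng = np.random.default_rng(2)
        normal = rng.normal(0.0, 1.0, size=(500, 3))   # features of normal behaviour

        model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
        model.fit(normal)                              # learn "normal" dynamics

        def is_deviant(window, threshold=-6.0):
            """Flag a feature window whose mean log-likelihood is too low."""
            return model.score(window) / len(window) < threshold

        print(is_deviant(rng.normal(0.0, 1.0, size=(20, 3))))   # normal-like: False
        print(is_deviant(rng.normal(5.0, 1.0, size=(20, 3))))   # anomalous:   True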

  • 10.
    Andersson, Viktor
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Semantic Segmentation: Using Convolutional Neural Networks and Sparse Dictionaries. 2017. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    The two main bottlenecks when using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network, based on sparse dictionaries. A sparse dictionary optimized on domain-specific data can be seen as a set of intelligent feature-extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers of the neural network: how do they affect the training time and final performance?

    The dataset used here is the Cityscapes dataset, a library of 25000 labeled road scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually; one of the architectures is much deeper than the other. The results are presented for both networks and show that filter initialization is an important aspect which should be taken into consideration when training deep networks for semantic segmentation.
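
    A sketch of the initialization idea: learn a dictionary on image patches and reshape its atoms into convolution kernels. The thesis uses K-SVD; scikit-learn's MiniBatchDictionaryLearning is used below as an accessible stand-in, and all sizes and data are hypothetical.

        import numpy as np
        from sklearn.decomposition import MiniBatchDictionaryLearning
        from sklearn.feature_extraction.image import extract_patches_2d

        # Dictionary-based filter initialisation. The thesis uses K-SVD;
        # MiniBatchDictionaryLearning stands in, as K-SVD is not in sklearn.
        rng = np.random.default_rng(3)
        image = rng.uniform(0, 1, size=(64, 64))        # stand-in for road scenes

        k = 5                                           # conv kernel size
        patches = extract_patches_2d(image, (k, k), max_patches=2000, random_state=0)
        patches = patches.reshape(len(patches), -1)
        patches -= patches.mean(axis=1, keepdims=True)  # zero-mean patches

        n_filters = 16
        dico = MiniBatchDictionaryLearning(n_components=n_filters, alpha=1.0,
                                           random_state=0).fit(patches)

        # Each learnt atom becomes one single-channel conv kernel.
        weights = dico.components_.reshape(n_filters, 1, k, k).astype(np.float32)
        print("initial conv weights:", weights.shape)   # (16, 1, 5, 5)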

  • 11.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Two-Stream Part-based Deep Representation for Human Attribute Recognition. 2018. In: 2018 International Conference on Biometrics (ICB), IEEE, 2018, p. 90-97. Conference paper (Refereed).
    Abstract [en]

    Recognizing human attributes in unconstrained environments is a challenging computer vision problem. State-of-the-art approaches to human attribute recognition are based on convolutional neural networks (CNNs). The de facto practice when training these CNNs on a large labeled image dataset is to take the RGB pixel values of an image as input to the network. In this work, we propose a two-stream part-based deep representation for human attribute classification. Besides the standard RGB stream, we train a deep network using mapped coded images with explicit texture information, which complements the standard RGB deep model. To integrate knowledge of human body parts, we employ deformable part-based models together with our two-stream deep model. Experiments are performed on the challenging Human Attributes (HAT-27) dataset, consisting of 27 different human attributes. Our results clearly show (a) that the two-stream deep network provides a consistent gain in performance over the standard RGB model and (b) that the attribute classification results are further improved with our two-stream part-based deep representations, leading to state-of-the-art results.

  • 12.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Molinier, Matthieu
    VTT Tech Res Ctr Finland Ltd, Finland.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification. 2018. In: ISPRS Journal of Photogrammetry and Remote Sensing (Print), ISSN 0924-2716, E-ISSN 1872-8235, Vol. 138, p. 74-85. Article in journal (Refereed).
    Abstract [en]

    Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification.
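
    A minimal sketch of building a texture-coded input stream with local binary patterns, using scikit-image; the paper's exact mapping from LBP codes to network inputs may differ:

        import numpy as np
        from skimage.feature import local_binary_pattern

        # Build an LBP-coded input for a texture stream; the exact coding used
        # for TEX-Nets may differ, this only illustrates the idea.
        rng = np.random.default_rng(4)
        gray = rng.integers(0, 256, size=(128, 128)).astype(np.uint8)

        P, R = 8, 1                                 # 8 neighbours at radius 1
        codes = local_binary_pattern(gray, P, R, method="uniform")

        # Normalise codes to [0, 1] and replicate to three channels so the
        # texture stream can reuse a standard RGB CNN architecture.
        tex_input = np.repeat((codes / codes.max())[..., None], 3, axis=-1)
        print("texture-stream input:", tex_input.shape)   # (128, 128, 3)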

  • 13.
    Ardeshiri, Tohid
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Larsson, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Gustafsson, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Schön, Thomas B.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Bicycle Tracking Using Ellipse Extraction. 2011. In: Proceedings of the 14th International Conference on Information Fusion, 2011, IEEE, 2011, p. 1-8. Conference paper (Refereed).
    Abstract [en]

    A new approach to tracking bicycles from imagery sensor data is proposed. It is based on detecting ellipsoids in the images and treating these pair-wise using a dynamic bicycle model. One important application area is automotive collision avoidance systems, where no dedicated systems for bicyclists yet exist and where very few theoretical studies have been published.

    Possible conflicts can be predicted from the position and velocity state in the model, but also from the steering wheel articulation and roll angle that indicate yaw changes before the velocity vector changes. An algorithm is proposed which consists of an ellipsoid detection and estimation algorithm and a particle filter.

    A simulation study of three critical single-target scenarios is presented, and the algorithm is shown to produce excellent state estimates. An experiment using a stationary camera and the particle filter for state estimation was performed and showed encouraging results.
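
    A generic ellipse-extraction sketch with OpenCV (the abstract does not specify the paper's detector); the fitted ellipses would serve as measurements for the particle filter:

        import cv2
        import numpy as np

        # Generic ellipse extraction: find contours and fit ellipses.
        frame = np.zeros((200, 200), np.uint8)
        cv2.ellipse(frame, (100, 100), (60, 30), 20, 0, 360, 255, -1)  # synthetic "wheel"

        contours, _ = cv2.findContours(frame, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        ellipses = [cv2.fitEllipse(c) for c in contours if len(c) >= 5]

        for (cx, cy), (major, minor), angle in ellipses:
            print(f"ellipse at ({cx:.0f}, {cy:.0f}), "
                  f"axes ({major:.0f}, {minor:.0f}), angle {angle:.0f} deg")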

  • 14.
    Baravdish, George
    et al.
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Svensson, Olof
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Åström, Freddie
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    On Backward p(x)-Parabolic Equations for Image Enhancement. 2015. In: Numerical Functional Analysis and Optimization, ISSN 0163-0563, E-ISSN 1532-2467, Vol. 36, no 2, p. 147-168. Article in journal (Refereed).
    Abstract [en]

    In this study, we investigate the backward p(x)-parabolic equation as a new methodology to enhance images. We propose a novel iterative regularization procedure for the backward p(x)-parabolic equation based on the nonlinear Landweber method for inverse problems. The proposed scheme can also be extended to the family of iterative regularization methods involving the nonlinear Landweber method. We also investigate the connection between the variable exponent p(x) in the proposed energy functional and the diffusivity function in the corresponding Euler-Lagrange equation. It is well known that the forward problem converges to a constant solution, destroying the image. The purpose of the backward approach is twofold. First, by solving the backward problem through a sequence of forward problems, we obtain a smooth, denoised image. Second, by choosing the initial data properly, we try to reduce the blurriness of the image. Preliminary numerical results for denoising show improvement over standard methods.
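
    In standard notation (a reconstruction from common definitions of the variable-exponent p(x)-Laplacian; the paper's exact formulation may differ), the energy functional and the associated flows read

        E(u) = \int_\Omega \frac{1}{p(x)} \, |\nabla u|^{p(x)} \, dx,
        \qquad
        \frac{\partial u}{\partial t} = \pm \operatorname{div}\!\left( |\nabla u|^{p(x)-2} \, \nabla u \right),

    where "+" gives the forward (smoothing) flow and "-" the backward (enhancing) flow. For an operator equation F(u) = g, the nonlinear Landweber iteration mentioned above takes the form

        u_{k+1} = u_k - \omega \, F'(u_k)^{*} \big( F(u_k) - g \big),

    with step size \omega and adjoint F'(u_k)^{*} of the linearized operator.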

  • 15.
    Barnada, Marc
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Conrad, Christian
    Goethe University of Frankfurt, Germany.
    Bradler, Henry
    Goethe University of Frankfurt, Germany.
    Ochs, Matthias
    Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Estimation of Automotive Pitch, Yaw, and Roll using Enhanced Phase Correlation on Multiple Far-field Windows. 2015. In: 2015 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2015, p. 481-486. Conference paper (Refereed).
    Abstract [en]

    The online estimation of yaw, pitch, and roll of a moving vehicle is an important ingredient for systems which estimate egomotion and the 3D structure of the environment from video captured in a moving vehicle. We present an approach to estimate these angular changes from monocular visual data, based on the fact that the motion of far distant points does not depend on translation, but only on the current rotation of the camera. The presented approach does not require features (corners, edges, ...) to be extracted. It also allows estimating, in parallel, the illumination changes from frame to frame, and thus largely stabilizes the estimation of image correspondences and motion vectors, which are most often the central entities needed for computing scene structure, distances, etc. The method is significantly less complex and much faster than a full egomotion computation from features, such as PTAM [6], but it can be used to provide motion priors and reduce search spaces for more complex methods which perform a complete analysis of egomotion and the dynamic 3D structure of the scene in which a vehicle moves.
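
    Basic phase correlation, the building block named in the title, recovers the translation between two image windows from the phase of the cross-power spectrum; for far-field windows, small shifts map to yaw/pitch/roll increments. A minimal sketch (the paper's enhancements and window selection are not reproduced):

        import numpy as np

        # Minimal phase correlation: recover the integer shift between two
        # windows from the cross-power spectrum peak.
        rng = np.random.default_rng(5)
        win_a = rng.uniform(0, 1, size=(64, 64))
        win_b = np.roll(win_a, shift=(3, -2), axis=(0, 1))   # known shift

        fa, fb = np.fft.fft2(win_a), np.fft.fft2(win_b)
        cross = np.conj(fa) * fb
        cross /= np.abs(cross) + 1e-12                       # keep phase only
        corr = np.fft.ifft2(cross).real

        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        dy, dx = [d - s if d > s // 2 else d for d, s in zip((dy, dx), corr.shape)]
        print("estimated shift:", dy, dx)                    # -> 3 -2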

  • 16.
    Bengtsson, Morgan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Indoor 3D Mapping using Kinect. 2014. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    In recent years, several depth cameras have emerged on the consumer market, creating many interesting possibilities for both professional and recreational usage. One example of such a camera is the Microsoft Kinect sensor, originally used with the Microsoft Xbox 360 game console. In this master thesis, a system is presented that utilizes this device in order to create as accurate a 3D reconstruction of an indoor environment as possible. The major novelty of the presented system is the data structure based on signed distance fields and voxel octrees used to represent the observed environment.
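
    To make the signed-distance representation concrete, here is a minimal truncated signed distance field (TSDF) fusion sketch on a dense voxel grid; the thesis' actual system uses voxel octrees and estimated camera poses, and every constant below is made up:

        import numpy as np

        # Minimal TSDF fusion; camera at the origin looking along +z, pinhole.
        GRID, VOXEL, TRUNC = 64, 0.05, 0.15        # voxels, metres, metres
        FX = FY = 60.0
        CX = CY = 32.0
        tsdf = np.ones((GRID, GRID, GRID), np.float32)
        weight = np.zeros_like(tsdf)

        def integrate(depth):
            """Fuse one 64x64 depth image into the global TSDF volume."""
            ii, jj, kk = np.indices(tsdf.shape)
            pts = (np.stack([ii, jj, kk], axis=-1) - GRID / 2) * VOXEL
            pts[..., 2] += (GRID / 2 + 1) * VOXEL  # keep all voxels in front
            u = np.clip((FX * pts[..., 0] / pts[..., 2] + CX).astype(int), 0, 63)
            v = np.clip((FY * pts[..., 1] / pts[..., 2] + CY).astype(int), 0, 63)
            sdf = np.clip(depth[v, u] - pts[..., 2], -TRUNC, TRUNC) / TRUNC
            valid = depth[v, u] > 0                # ignore missing depth
            new_w = weight + valid
            tsdf[...] = np.where(valid, (tsdf * weight + sdf) / np.maximum(new_w, 1), tsdf)
            weight[...] = new_w

        integrate(np.full((64, 64), 1.5, np.float32))   # a flat wall 1.5 m away
        print("near-surface voxels:", int((np.abs(tsdf) < 0.1).sum()))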

  • 17.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Classification of leakage detections acquired by airborne thermography of district heating networks. 2013. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    In Sweden and many other northern countries, it is common for heat to be distributed to homes and industries through district heating networks. Such networks consist of pipes buried underground, carrying hot water or steam with temperatures in the range of 90-150 °C. Due to bad insulation or cracks, heat or water leakages might appear.

    A system for large-scale monitoring of district heating networks through remote thermography has been developed and is in use at the company Termisk Systemteknik AB. Infrared images are captured from an aircraft and analysed, finding and indicating the areas for which the ground temperature is higher than normal. During the analysis, however, many warm areas other than true water or energy leakages are marked as detections. Objects or phenomena that can cause false alarms are those that, for some reason, are warmer than their surroundings, for example chimneys, cars, and heat leakages from buildings.

    During the last couple of years, the system has been used in a number of cities. Therefore, there exists a fair number of examples of different types of detections. The purpose of the present master's thesis is to evaluate the reduction in false alarms of the existing analysis that can be achieved with the use of a learning system, i.e. a system which can learn how to recognize different types of detections.

    A labelled data set for training and testing was acquired by contact with customers. Furthermore, a number of features describing the intensity difference within the detection, its shape and propagation as well as proximity information were found, implemented and evaluated. Finally, four different classifiers and other methods for classification were evaluated.

    The method that obtained the best results consists of two steps. In the initial step, all detections which lie on top of a building are removed from the data set of labelled detections. The second step consists of classification using a Random forest classifier. Using this two-step method, the number of false alarms is reduced by 43% while the percentage of water and energy detections correctly classified is 99%.
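
    The two-step method maps naturally onto a few lines of scikit-learn; the features and labels below are synthetic stand-ins (so the printed accuracy is near chance), and the real system uses the intensity, shape, propagation, and proximity features described above:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        # Two-step scheme: drop detections on buildings, then classify the
        # rest with a Random forest. Synthetic data only.
        rng = np.random.default_rng(6)
        n = 400
        features = rng.uniform(0, 1, size=(n, 5))
        on_building = rng.uniform(0, 1, n) < 0.3   # from building segmentation
        is_leak = rng.uniform(0, 1, n) < 0.5       # ground-truth labels

        keep = ~on_building                        # step 1: remove roof detections
        X, y = features[keep], is_leak[keep]

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X[:200], y[:200])                  # step 2: train the classifier
        print("held-out accuracy:", clf.score(X[200:], y[200:]))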

  • 18.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Detection and Tracking in Thermal Infrared Imagery. 2016. Licentiate thesis, comprehensive summary (Other academic).
    Abstract [en]

    Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

    This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been, and currently are, subject to extensive research. Indications of their popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems.

    A common belief is that detection and tracking in thermal infrared imagery is identical to detection and tracking in grayscale visual imagery. This thesis argues that the preceding allegation is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis.

    Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g. template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated. That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term, single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.

    List of papers
    1. A Thermal Object Tracking Benchmark. 2015. Conference paper (Refereed). In: 12th IEEE International Conference on Advanced Video- and Signal-based Surveillance (AVSS 2015), Karlsruhe, Germany, August 25-28, 2015, IEEE. DOI: 10.1109/AVSS.2015.7301772.
    2. The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results. 2015. Conference paper (Refereed). In: Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW 2015), Santiago, Chile, December 7-13, 2015, IEEE, p. 639-651. DOI: 10.1109/ICCVW.2015.86.
    Abstract [en]: The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, and (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.
    3. Channel Coded Distribution Field Tracking for Thermal Infrared Imagery. 2016. Conference paper (Refereed). In: Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2016), IEEE, 2016, p. 1248-1256. DOI: 10.1109/CVPRW.2016.158.
    4. Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera. 2015. Conference paper (Refereed). In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015, Springer, p. 492-503. DOI: 10.1007/978-3-319-19665-7_42.
    5. Enhanced analysis of thermographic images for monitoring of district heat pipe networks. 2016. Article in journal (Refereed). In: Pattern Recognition Letters, Vol. 83, no 2, p. 215-223. DOI: 10.1016/j.patrec.2016.07.002.
  • 19.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Classifying district heating network leakages in aerial thermal imagery. 2014. Conference paper (Other academic).
    Abstract [en]

    In this paper, we address the problem of automatically detecting leakages in underground pipes of district heating networks from images captured by an airborne thermal camera. The basic idea is to classify each relevant image region as a leakage if its temperature exceeds a threshold. This simple approach yields a significant number of false positives. We propose to address this issue with machine learning techniques and provide extensive experimental analysis on real-world data. The results show that this post-processing step significantly improves the usefulness of the system.

  • 20.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A thermal infrared dataset for evaluation of short-term tracking methods. 2015. Conference paper (Other academic).
    Abstract [en]

    During recent years, thermal cameras have decreased in both size and cost while improving in image quality. The area of use for such cameras has expanded with many exciting applications, many of which require tracking of objects. While being subject to extensive research in the visual domain, tracking in thermal imagery has historically been of interest mainly for military purposes. The available thermal infrared datasets for evaluating methods addressing these problems are few, and those that exist are not challenging enough for today's tracking algorithms. Therefore, we hereby propose a thermal infrared dataset for evaluation of short-term tracking methods. The dataset consists of 20 sequences which have been collected from multiple sources, and the data format used is in accordance with the Visual Object Tracking (VOT) Challenge.

  • 21.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A Thermal Object Tracking Benchmark. 2015. Conference paper (Refereed).
    Abstract [en]

    Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differs between the visual and thermal benchmarks, confirming the need for the new benchmark.

  • 22.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Faculty of Science & Engineering.
    Channel Coded Distribution Field Tracking for Thermal Infrared Imagery. 2016. In: Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2016), IEEE, 2016, p. 1248-1256. Conference paper (Refereed).
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution, these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit, and a comparison of the relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.
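
    A plain (uncoded) distribution field can be computed in a few lines, which may help make the representation concrete; channel coding and the tracker's template update are beyond this sketch, and all sizes are hypothetical:

        import numpy as np
        from scipy.ndimage import uniform_filter

        # Plain distribution field for a grayscale patch, after Sevilla-Lara
        # and Learned-Miller; not the ABCD tracker's channel-coded variant.
        rng = np.random.default_rng(7)
        patch = rng.integers(0, 256, size=(32, 32))     # stand-in TIR patch

        n_bins = 16
        bins = patch // (256 // n_bins)                 # quantise intensities

        # "Explode" the patch into one layer per bin, then smooth each layer
        # spatially so nearby pixels share probability mass.
        field = np.stack([(bins == b).astype(float) for b in range(n_bins)])
        field = uniform_filter(field, size=(1, 5, 5))   # spatial smoothing only

        # Matching a candidate patch = comparing fields, e.g. by L1 distance.
        print("field shape:", field.shape)              # (16, 32, 32)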

  • 23.
    Berg, Amanda
    et al.
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Systemteknik AB, Diskettgatan 11 B, SE-58335 Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Diskettgatan 11 B, SE-58335 Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Enhanced analysis of thermographic images for monitoring of district heat pipe networks. 2016. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 83, no 2, p. 215-223. Article in journal (Refereed).
    Abstract [en]

    We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps: (a) using a building segmentation scheme in order to remove detections on buildings, and (b) using a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicating areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.

  • 24.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Generating Visible Spectrum Images from Thermal Infrared. 2018. Conference paper (Refereed).
    Abstract [en]

    Transformation of thermal infrared (TIR) images into visual, i.e. perceptually realistic color (RGB), images is a challenging problem. TIR cameras have the ability to see in scenarios where vision is severely impaired, for example in total darkness or fog, and they are commonly used, e.g., for surveillance and automotive applications. However, interpretation of TIR images is difficult, especially for untrained operators. Enhancing the TIR image display by transforming it into a plausible, visual, perceptually realistic RGB image presumably facilitates interpretation. Existing grayscale-to-RGB, so-called colorization, methods cannot be applied to TIR images directly, since those methods only estimate the chrominance and not the luminance. In the absence of conventional colorization methods, we propose two fully automatic TIR-to-visual color image transformation methods, a two-step and an integrated approach, based on Convolutional Neural Networks. The methods require neither pre- nor postprocessing, do not require any user input, and are robust to image pair misalignments. We show that the methods do indeed produce perceptually realistic results on publicly available data, which is assessed both qualitatively and quantitatively.
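
    To make the problem setup concrete, a toy fully convolutional TIR-to-RGB network in PyTorch is sketched below; the paper's two-step and integrated architectures are substantially more elaborate, and this sketch is neither of them:

        import torch
        import torch.nn as nn

        # Toy fully convolutional TIR -> RGB mapping (illustration only).
        class TirToRgb(nn.Module):
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),  # RGB in [0, 1]
                )

            def forward(self, tir):
                return self.net(tir)

        model = TirToRgb()
        tir_batch = torch.rand(4, 1, 128, 128)     # fake one-channel TIR frames
        rgb_batch = torch.rand(4, 3, 128, 128)     # fake aligned RGB targets

        loss = nn.functional.l1_loss(model(tir_batch), rgb_batch)
        loss.backward()                            # gradients for one step
        print("L1 loss:", float(loss))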

  • 25.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Object Tracking in Thermal Infrared Imagery based on Channel Coded Distribution Fields. 2017. Conference paper (Other academic).
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution, these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit, and a comparison of the relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

  • 26.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge2016Conference paper (Other academic)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking (VOT-TIR2015) Challenge was organized in conjunction with ICCV2015. It was the first benchmark on short-term, single-target tracking in thermal infrared (TIR) sequences. The challenge aimed at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. It was based on the VOT2013 Challenge, but introduced the following novelties: (i) the utilization of the LTIR (Linköping TIR) dataset, (ii) adaptation of the VOT2013 attributes to thermal data, (iii) a similar evaluation to that of VOT2015. This paper provides an overview of the VOT-TIR2015 Challenge as well as the results of the 24 participating trackers.

  • 27.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera2015In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen; Kim S. Pedersen, Springer, 2015, p. 492-503Conference paper (Refereed)
    Abstract [en]

    We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel dataset.

  • 28.
    Berglin, Lukas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Design, Evaluation and Implementation of a Pipeline for Semi-Automatic Lung Nodule Segmentation2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Lung cancer is the most common type of cancer in the world and typically manifests as lung nodules. Nodules are small tumors that consist of lung tissue. They are usually spherical in shape and their cores can be either solid or subsolid. Nodules are common in the lungs, but not all of them are malignant. To determine whether a nodule is malignant or benign, attributes like nodule size and volume growth are commonly used. The procedure to obtain these attributes is time consuming and therefore calls for tools to simplify the process.

    The purpose of this thesis work was to investigate the feasibility of a semi-automatic lung nodule segmentation pipeline including volume estimation. This was done by implementing, tuning and evaluating image processing algorithms with different characteristics to create pipeline candidates. These candidates were compared using a similarity index between their segmentation results and ground truth markings to determine the most promising one.

    The best performing pipeline consisted of a fixed region of interest together with a level set segmentation algorithm. Its segmentation accuracy was not consistent for all nodules evaluated, but the pipeline showed great potential when dynamically adapting its parameters for each nodule. The use of dynamic parameters was only briefly explored, and further research would be necessary to determine its feasibility.
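
    The exact similarity index is not specified in the abstract; if a Dice-style overlap score is used, it can be computed from binary masks as in this hypothetical sketch:

    import numpy as np

    def dice(seg, gt):
        """Dice similarity coefficient between two binary segmentation masks."""
        seg, gt = seg.astype(bool), gt.astype(bool)
        denom = seg.sum() + gt.sum()
        return 2.0 * np.logical_and(seg, gt).sum() / denom if denom > 0 else 1.0

    # dice(pipeline_mask, ground_truth_mask) -> 1.0 for perfect overlap, 0.0 for none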

  • 29.
    Bianco, Giuseppe
    et al.
    Lund University, Sweden.
    Ilieva, Mihaela
    Lund University, Sweden; Bulgarian Academic Science, Bulgaria.
    Veibäck, Clas
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Gadomska, Alicja
    Lund University, Sweden.
    Hendeby, Gustaf
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Gustafsson, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Åkesson, Susanne
    Lund University, Sweden.
    Emlen funnel experiments revisited: methods update for studying compass orientation in songbirds2016In: Ecology and Evolution, ISSN 2045-7758, E-ISSN 2045-7758, Vol. 6, no 19, p. 6930-6942Article in journal (Refereed)
    Abstract [en]

    1. Migratory songbirds carry an inherited capacity to migrate several thousand kilometers each year, crossing continental landmasses and barriers between distant breeding sites and wintering areas. How individual songbirds manage to find their way with extreme precision is still largely unknown. The functional characteristics of biological compasses used by songbird migrants have mainly been investigated by recording the birds' directed migratory activity in circular cages, so-called Emlen funnels. This method is 50 years old and has not received major updates over the past decades. The aim of this work was to compare the results from newly developed digital methods with the established manual methods to evaluate songbird migratory activity and orientation in circular cages.

    2. We performed orientation experiments with the European robin (Erithacus rubecula) using modified Emlen funnels equipped with thermal paper and simultaneously recorded the songbird movements from above. We evaluated and compared the results obtained with five different methods. Two methods have been commonly used in songbird orientation experiments; the other three methods were developed for this study and were based either on evaluation of the thermal paper using automated image analysis, or on the analysis of videos recorded during the experiment.

    3. The methods used to evaluate scratches produced by the claws of birds on the thermal papers showed some differences compared with the video analyses. These differences were caused mainly by differences in scatter, as any movement of the bird along the sloping walls of the funnel was recorded on the thermal paper, whereas video evaluations allowed us to detect single takeoff attempts by the birds and to consider only this behavior in the orientation analyses. Using computer vision, we were also able to identify and separately evaluate different behaviors that were impossible to record with the thermal paper.

    4. The traditional Emlen funnel is still the most used method to investigate compass orientation in songbirds under controlled conditions. However, new numerical image analysis techniques provide a much higher level of detail of songbirds' migratory behavior and will open an increasing number of possibilities to evaluate and quantify specific behaviors as new algorithms are developed.
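
    The orientation analyses mentioned above conventionally rest on circular statistics; purely as background, the mean direction and mean vector length of a set of bearings can be computed as follows (standard circular statistics, not the paper's specific pipeline):

    import numpy as np

    def mean_direction(bearings_deg):
        """Mean direction (degrees) and mean vector length r of circular bearings."""
        a = np.deg2rad(np.asarray(bearings_deg))
        C, S = np.cos(a).mean(), np.sin(a).mean()
        r = np.hypot(C, S)                 # 0 (uniform scatter) .. 1 (perfectly aligned)
        return np.rad2deg(np.arctan2(S, C)) % 360.0, r

    mu, r = mean_direction([350, 10, 5, 355, 20])   # -> mean direction near 0 degrees, high r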

  • 30.
    Biedermann, Daniel
    et al.
    Goethe University, Germany.
    Ochs, Matthias
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    Evaluating visual ADAS components on the COnGRATS dataset2016In: 2016 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), IEEE , 2016, p. 986-991Conference paper (Refereed)
    Abstract [en]

    We present a framework that supports the development and evaluation of vision algorithms in the context of driver assistance applications and traffic surveillance. This framework allows the creation of highly realistic image sequences featuring traffic scenarios. The sequences are created with a realistic state-of-the-art vehicle physics model; different kinds of environments are featured, thus providing a wide range of testing scenarios. Due to the physically-based rendering technique and variable camera models employed for the image rendering process, we can simulate different sensor setups and provide appropriate and fully accurate ground truth data.

  • 31.
    Björkeson, Felix
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Autonomous Morphometrics using Depth Cameras for Object Classification and Identification2013Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Identification of individuals has been solved with many different solutions around the world, either using biometric data or external means of verification such as id cards or RFID tags. The advantage of using biometric measurements is that they are directly tied to the individual and are usually unalterable. Acquiring dependable measurements is however challenging when the individuals are uncooperative. A dependable system should be able to deal with this and produce reliable identifications.

    The system proposed in this thesis can autonomously classify uncooperative specimens from depth data. The data is acquired from a depth camera mounted in an uncontrolled environment, where it was allowed to continuously record for two weeks. This requires stable data extraction and normalization algorithms to produce good representations of the specimens. Robust descriptors can therefore be extracted from each sample of a specimen, and together with different classification algorithms the system can be trained or validated. Even with as many as 138 different classes the system achieves high recognition rates. Inspired by the research field of face recognition, the best classification algorithm, the method of fisherfaces, was able to accurately recognize 99.6% of the validation samples. It was followed by two variations of the method of eigenfaces, which achieved recognition rates of 98.8% and 97.9%, respectively. These results affirm that the capabilities of the system are adequate for a commercial implementation.
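
    As background, the fisherface method projects samples with PCA followed by Fisher's linear discriminant analysis; a minimal sketch with scikit-learn (illustrative dummy data, not the thesis implementation):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.pipeline import make_pipeline

    # X: one flattened, normalized depth descriptor per row; y: specimen identities (dummy data here)
    X = np.random.rand(200, 1024)
    y = np.random.randint(0, 10, 200)

    fisherfaces = make_pipeline(
        PCA(n_components=50),           # PCA first, so the within-class scatter is non-singular
        LinearDiscriminantAnalysis(),   # Fisher projection; also acts as the classifier
    )
    fisherfaces.fit(X, y)
    predicted_identity = fisherfaces.predict(X[:5])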

  • 32.
    Björnfot, Magnus
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Extension of DIRA (Dual-Energy Iterative Algorithm) to 3D Helical CT2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    There is a need for quantitative CT data in radiation therapy. Currently there are only a few algorithms that address this issue, for instance the commercial DirectDensity algorithm. In the scientific literature, an example of such an algorithm is DIRA. DIRA is an iterative model-based reconstruction method for dual-energy CT whose goal is to determine the material composition of the patient from accurate linear attenuation coefficients (LACs). It had been implemented in a two-dimensional geometry, i.e., it could process axial scans only. There was a need to extend DIRA so that it could process projection data generated in helical scanning geometries. The newly developed algorithm (DIRA-3D) implemented (i) polyenergetic semi-parallel projection generation, (ii) mono-energetic parallel projection generation and (iii) the PI-method for image reconstruction. Computational experiments showed that the accuracies of the resulting LACs and mass fractions were comparable to the ones of the original DIRA. The results converged after 10 iterations.

  • 33.
    Bondemark, Richard
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving SLAM on a TOF Camera by Exploiting Planar Surfaces2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Simultaneous localization and mapping (SLAM) is the problem of mapping your surroundings while simultaneously localizing yourself in the map. It is an important and active area of research for robotics. In this master thesis, two approaches are attempted to reduce the drift which appears over time in SLAM algorithms. The first approach tries three different motion models for the camera. Two of the models exploit the a priori knowledge that the camera is mounted on a trolley, and these two are shown to improve the results. The second approach attempts to reduce the drift by reducing noise in the point cloud data used for mapping. This is done by finding planar surfaces in the point clouds; median filtering is used as an alternative noise reduction method for comparison. The plane estimation approach is also shown to reduce the drift, while median filtering makes it worse.
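
    Planar surfaces in noisy point clouds are commonly extracted with RANSAC followed by a refit on the inliers; the following numpy sketch illustrates the general idea under that assumption (tolerances and iteration counts are placeholders, not the thesis's settings):

    import numpy as np

    def fit_plane_ransac(pts, n_iter=200, tol=0.01, seed=0):
        """RANSAC fit of a plane n.x + d = 0 to an (N, 3) point cloud; returns (n, d, inlier mask)."""
        rng = np.random.default_rng(seed)
        best_n, best_d, best_in = None, None, np.zeros(len(pts), bool)
        for _ in range(n_iter):
            p = pts[rng.choice(len(pts), 3, replace=False)]
            n = np.cross(p[1] - p[0], p[2] - p[0])
            if np.linalg.norm(n) < 1e-9:
                continue                               # degenerate (collinear) sample
            n = n / np.linalg.norm(n)
            d = -n @ p[0]
            inliers = np.abs(pts @ n + d) < tol        # point-to-plane distances
            if inliers.sum() > best_in.sum():
                best_n, best_d, best_in = n, d, inliers
        return best_n, best_d, best_in

    pts = np.random.rand(1000, 3) * [1.0, 1.0, 0.005]  # noisy, nearly planar cloud
    n, d, inliers = fit_plane_ransac(pts)              # inliers can then be projected onto the plane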

  • 34.
    Bradler, Henry
    et al.
    Goethe University of Frankfurt, Germany.
    Anne Wiegand, Birthe
    Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    The Statistics of Driving Sequences - and what we can learn from them2015In: 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW), IEEE , 2015, p. 106-114Conference paper (Refereed)
    Abstract [en]

    The motion of a driving car is highly constrained and we claim that powerful predictors can be built that learn the typical egomotion statistics, and support the typical tasks of feature matching, tracking, and egomotion estimation. We analyze the statistics of the ground truth data given in the KITTI odometry benchmark sequences and confirm that a coordinated turn motion model, overlaid by moderate vibrations, is a very realistic model. We develop a predictor that is able to significantly reduce the uncertainty about the relative motion when a new image frame comes in. Such predictors can be used to steer the matching process from frame n to frame n + 1. We show that they can also be employed to detect outliers in the temporal sequence of egomotion parameters.
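
    For reference, the coordinated turn model that the authors confirm predicts planar motion from a constant speed v and yaw rate omega; a one-step kinematic predictor looks as follows (standard textbook form, not the learned predictor from the paper):

    import numpy as np

    def coordinated_turn_step(x, y, heading, v, omega, dt):
        """One prediction step with constant speed v and yaw rate omega (coordinated turn)."""
        if abs(omega) < 1e-9:                          # straight-line limit
            return x + v * dt * np.cos(heading), y + v * dt * np.sin(heading), heading
        x_new = x + (v / omega) * (np.sin(heading + omega * dt) - np.sin(heading))
        y_new = y - (v / omega) * (np.cos(heading + omega * dt) - np.cos(heading))
        return x_new, y_new, heading + omega * dt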

  • 35.
    Bradler, Henry
    et al.
    Goethe University, Germany.
    Ochs, Matthias
    Goethe University, Germany.
    Fanani, Nolang
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    Joint Epipolar Tracking (JET): Simultaneous optimization of epipolar geometry and feature correspondences2017In: 2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017), IEEE , 2017, p. 445-453Conference paper (Refereed)
    Abstract [en]

    Traditionally, pose estimation is considered as a two-step problem. First, feature correspondences are determined by direct comparison of image patches, or by associating feature descriptors. In a second step, the relative pose and the coordinates of corresponding points are estimated, most often by minimizing the reprojection error (RPE). RPE optimization is based on a loss function that is merely aware of the feature pixel positions but not of the underlying image intensities. In this paper, we propose a sparse direct method which introduces a loss function that allows us to simultaneously optimize the unscaled relative pose, as well as the set of feature correspondences, directly considering the image intensity values. Furthermore, we show how to integrate statistical prior information on the motion into the optimization process. This constructive inclusion of a Bayesian bias term is particularly efficient in application cases with a strongly predictable (short-term) dynamic, e.g. in a driving scenario. In our experiments, we demonstrate that the JET algorithm we propose outperforms classical reprojection error optimization on two synthetic datasets and on the KITTI dataset. The JET algorithm runs in real-time on a single CPU thread.

  • 36.
    Brandtberg, Tomas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Saab Dynamics AB, Linköping, Sweden.
    Virtual hexagonal and multi-scale operator for fuzzy rank order texture classification using one-dimensional generalised Fourier analysis2016In: 2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), IEEE COMPUTER SOC , 2016, p. 2018-2024Conference paper (Refereed)
    Abstract [en]

    This paper presents a study on a family of local hexagonal and multi-scale operators useful for texture analysis. The hexagonal grid shows an attractive rotation symmetry with uniform neighbour distances. The operator depicts a closed connected curve (1D periodic). It is resized within a scale interval during the conversion from the original square grid to the virtual hexagonal grid. Complementary image features, together with their tangential first-order hexagonal derivatives, are calculated. The magnitude/phase information from the Fourier or Fractional Fourier Transform (FFT, FrFT) is accumulated in thirty different Cartesian (polar for visualisation) and multi-scale domains. Simultaneous phase-correlation of a subset of the data gives an estimate of scaling/rotation relative to the references. Similarity metrics are used for template matching. The sample, unseen by the system, is classified into the group with the maximum fuzzy rank order. An instantiation of a 12-point hexagonal operator (radius=2) is first successfully evaluated on a set of thirteen Brodatz images (no scaling/rotation). Then it is evaluated on the more challenging KTH-TIPS2b texture dataset (scaling/rotation, varying pose/illumination). A confusion matrix and cumulative fuzzy rank order summaries show, for example, that the correct class is top-ranked 44 - 50% and top-three ranked 68 - 76% of all sample images. A similar evaluation, using a box-like 12-point mask of square grids, gives overall lower accuracies. Finally, the FrFT parameter is an additional tuning parameter influencing the accuracies significantly.

  • 37.
    Brorsson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Compressive Sensing: Single Pixel SWIR Imaging of Natural Scenes2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Photos captured in the shortwave infrared (SWIR) spectrum are interesting in military applications because they are independent of the time of day at which the picture is captured: the sun, moon, stars and night glow illuminate the earth with shortwave infrared radiation constantly. A major problem with today's SWIR cameras is that they are very expensive to produce and hence not broadly available, either within the military or to civilians. A relatively new technology called compressive sensing (CS) enables a new type of camera with only a single pixel sensor (a single pixel camera, SPC). This new type of camera needs only a fraction of measurements relative to the number of pixels to be reconstructed and reduces the cost of a shortwave infrared camera by a factor of 20. The camera uses a digital micromirror device (DMD) to select which mirrors (pixels) of the scene to measure, thus creating an underdetermined linear equation system that can be solved using the techniques described in CS to reconstruct the image. Given the new technology, it is in the interest of the Swedish Defence Research Agency (FOI) to evaluate the potential of a single pixel camera. With an SPC architecture developed by FOI, the goal of this thesis was to develop methods for sampling, reconstructing images and evaluating their quality. This thesis shows that structured random matrices and fast transforms have to be used to enable high resolution images and to speed up the reconstruction process significantly. The evaluation of the images could be done with standard measurements associated with camera evaluation and showed that the camera can reproduce high resolution images with relatively high image quality in daylight.
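
    As a toy illustration of the CS reconstruction step: given m measurements y = Ax of an n-pixel scene (m << n), a sparse image x can be recovered by l1-regularized least squares, e.g. via ISTA. The dense Gaussian matrix below stands in for the structured, fast-transform-based DMD patterns the thesis argues for; everything here is an illustrative assumption.

    import numpy as np

    def ista(A, y, lam=0.05, n_iter=200):
        """Iterative shrinkage-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
        L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            z = x - A.T @ (A @ x - y) / L              # gradient step on the data term
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
        return x

    rng = np.random.default_rng(0)
    n, m, k = 256, 64, 8                               # pixels, measurements, nonzeros
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)       # dense stand-in for structured DMD patterns
    x_hat = ista(A, A @ x_true)                        # approximate recovery from m << n measurements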

  • 38.
    Budde, Kiran Kumar
    Linköping University, Department of Electrical Engineering, Computer Vision.
    A Matlab Toolbox for fMRI Data Analysis: Detection, Estimation and Brain Connectivity2012Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Functional Magnetic Resonance Imaging (fMRI) is one of the best techniques for neuroimaging and has revolutionized the way we understand brain function. It measures changes in the blood oxygen level-dependent (BOLD) signal, which is related to neuronal activity. The complexity of the data, the presence of different types of noise and the massive amount of data make fMRI data analysis challenging, and it demands efficient signal processing and statistical analysis methods. The results of the analysis are used by physicians, neurologists and researchers for a better understanding of brain function.

    The purpose of this study was to design a toolbox for fMRI data analysis. It includes methods to detect brain activity maps, estimate the hemodynamic response (HDR) and assess the connectivity of brain structures. The toolbox provides methods for detection of activated brain regions using a Bayesian estimator; results are compared with conventional methods such as the t-test, ordinary least squares (OLS) and weighted least squares (WLS). Brain activation and the HDR are estimated with a linear adaptive model and a nonlinear method based on a radial basis function (RBF) neural network. A nonlinear autoregressive with exogenous inputs (NARX) neural network is developed to model the dynamics of the fMRI data. The toolbox also provides methods for brain connectivity analysis, such as functional connectivity and effective connectivity. These methods are examined on simulated and real fMRI datasets.
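
    As background on the OLS detection route: activation detection amounts to fitting a general linear model per voxel and testing the task regressor. A minimal sketch on synthetic data (standard GLM mechanics, not the toolbox code):

    import numpy as np

    def glm_tstat(Y, X):
        """Per-voxel t-statistics for the first regressor. Y: (time, voxels), X: (time, regressors)."""
        beta = np.linalg.lstsq(X, Y, rcond=None)[0]    # OLS fit
        resid = Y - X @ beta
        sigma2 = (resid ** 2).sum(axis=0) / (X.shape[0] - X.shape[1])
        c = np.zeros(X.shape[1]); c[0] = 1.0           # contrast selecting the task regressor
        var_c = c @ np.linalg.inv(X.T @ X) @ c
        return (c @ beta) / np.sqrt(sigma2 * var_c)

    rng = np.random.default_rng(0)
    X = np.column_stack([rng.standard_normal(120), np.ones(120)])  # task regressor + intercept
    Y = 0.5 * X[:, :1] + rng.standard_normal((120, 500))           # synthetic BOLD data
    t = glm_tstat(Y, X)                                            # one t-value per voxel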

  • 39.
    Carlsson, Mattias
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Neural Networks for Semantic Segmentation in the Food Packaging Industry2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Industrial applications of computer vision often utilize traditional image processing techniques whereas state-of-the-art methods in most image processing challenges are almost exclusively based on convolutional neural networks (CNNs). Thus there is a large potential for improving the performance of many machine vision applications by incorporating CNNs.

    One such application is the classification of juice boxes with straws, where the baseline solution uses classical image processing techniques on depth images to reject or accept juice boxes. This thesis aims to investigate how CNNs perform on the task of semantic segmentation (pixel-wise classification) of said images and whether the result can be used to increase classification performance.

    A drawback of CNNs is that they usually require large amounts of labelled data for training to be able to generalize and learn anything useful. As labelled data is hard to come by, two ways to get cheap data are investigated, one being synthetic data generation and the other being automatic labelling using the baseline solution.

    The implemented network performs well on semantic segmentation, even when trained on synthetic data only, though the performance increases with the ratio of real (automatically labelled) to synthetic images. The classification task is very sensitive to small errors in semantic segmentation and the results are therefore not as good as the baseline solution. It is suspected that the drop in performance between validation and test data is due to a domain shift between the data sets, e.g. variations in data collection and straw and box type, and fine-tuning to the target domain could definitely increase performance.

    When trained on synthetic data the domain shift is even larger and the performance on classification is next to useless. It is likely that the results could be improved by using more advanced data generation, e.g. a generative adversarial network (GAN), or more rigorous modelling of the data.

  • 40.
    Ceco, Ema
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Image Analysis in the Field of Oil Contamination Monitoring2011Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Monitoring wear particles in lubricating oils allows specialists to evaluate the health and functionality of a mechanical system. The main analysis techniques available today are manual particle analysis and automatic optical analysis. Manual particle analysis is effective and reliable since the analyst continuously sees what is being counted. The drawback is that the technique is quite time demanding and dependent on the skills of the analyst. Automatic optical particle counting constitutes a closed system that does not allow the objects counted to be observed in real time; this has resulted in a number of sources of error for the instrument. In this thesis a new method for counting particles based on light microscopy with image analysis is proposed. It has proven to be a fast and effective method that eliminates the sources of error of the previously described methods. The new method correlates very well with manual analysis, which is used as a reference method throughout this study. Size estimation of particles and detection of metallic particles has also been shown to be possible with the current image analysis setup. With more advanced software and analysis instrumentation, the image analysis method could be further developed into a decision-based machine allowing for declarations about which wear mode is occurring in a mechanical system.
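
    A minimal sketch of the counting idea, under the assumption that particles appear as dark connected regions: threshold the micrograph and label connected components (the threshold and minimum area are placeholders, not the thesis's calibration):

    import numpy as np
    from scipy import ndimage

    def count_particles(img, thresh=0.3, min_area=5):
        """Count dark particles in a grayscale micrograph with intensities in [0, 1]."""
        mask = img < thresh                               # particles darker than the background
        labels, n = ndimage.label(mask)                   # connected-component labelling
        areas = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
        keep = areas >= min_area                          # suppress single-pixel noise
        return int(keep.sum()), areas[keep]               # count and per-particle pixel areas

    img = np.random.rand(256, 256)                        # stand-in for a micrograph
    n_particles, areas = count_particles(img)             # areas could then be calibrated to micrometres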

  • 41.
    Chandaria, Jigna
    et al.
    BBC Research, UK.
    Thomas, Graham
    BBC Research, UK.
    Bartczak, Bogumil
    University of Kiel, Germany.
    Koeser, Kevin
    University of Kiel, Germany.
    Koch, Reinhard
    University of Kiel, Germany.
    Becker, Mario
    Fraunhofer IGD, Germany.
    Bleser, Gabriele
    Fraunhofer IGD, Germany.
    Stricker, Didier
    Fraunhofer IGD, Germany.
    Wohlleber, Cedric
    Fraunhofer IGD, Germany.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Gustafsson, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Hol, Jeroen
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Schön, Thomas
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Skoglund, Johan
    Linköping University, The Institute of Technology.
    Slycke, Per
    Xsens, Netherlands.
    Smeitz, Sebastiaan
    Xsens, Netherlands.
    Real-Time Camera Tracking in the MATRIS Project2006In: Proceedings of the 2006 International Broadcasting Convention, 2006Conference paper (Refereed)
    Abstract [en]

    In order to insert a virtual object into a TV image, the graphics system needs to know precisely how the camera is moving, so that the virtual object can be rendered in the correct place in every frame. Nowadays this can be achieved relatively easily in postproduction, or in a studio equipped with a special tracking system. However, for live shooting on location, or in a studio that is not specially equipped, installing such a system can be difficult or uneconomic. To overcome these limitations, the MATRIS project is developing a real-time system for measuring the movement of a camera. The system uses image analysis to track naturally occurring features in the scene, and data from an inertial sensor. No additional sensors, special markers, or camera mounts are required. This paper gives an overview of the system and presents some results.  

  • 42.
    Chellappa, Rama
    et al.
    Department of Electrical and Computer Engineering, University of Maryland, USA.
    Heyden, Anders
    Lund University, Sweden.
    Laurendeau, Denis
    Université Laval, Canada.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Borga, Magnus
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, Faculty of Arts and Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Special issue on ICPR 2014 awarded papers2016Collection (editor) (Refereed)
    Abstract [en]

    We, the Guest Editors of this special issue of Pattern Recognition Letters, are pleased to share these contributions with you. The papers included here are based on work from the 22nd International Conference on Pattern Recognition (ICPR) in Stockholm, Sweden, held August 24–28, 2014. The papers selected for this special issue were those winning one of the IAPR awards, as well as one paper by a former student of the winner of the KS Fu Prize, Prof. Jitendra Malik. Taken together, this body of work represents some of the finest research being conducted by the IAPR community worldwide; it builds on a rich legacy of accomplishment by the entire community, and it offers a view to the future, to where we are going as a scientific community.

    For each of the award-winning papers, the authors were asked to revise and extend their contributions to full journal length and to provide true added value vis-à-vis the original conference submission. In some cases, the authors elected to modify the titles slightly, and in some cases the list of authors has also been modified. The resulting manuscripts were sent out for full review by a different set of referees than those who reviewed the conference versions. The process, including required revisions, was in accordance with the standing editorial policy of Pattern Recognition Letters, resulting in the final versions accepted and appearing here. These are thoroughly vetted, high-caliber scientific contributions.

    It has been our honor to serve as Guest Editors for this special issue. We would like to thank the Editors of Pattern Recognition Letters for allowing us this opportunity. We are especially grateful to Dr. Gabriella Sanniti di Baja for her enthusiasm, support, and her willingness to keep prodding us along to bring the special issue through to completion. We would also like to thank all of those who reviewed the papers, both originally for the conference and subsequently for the journal, and those who served on the ICPR awards and KS Fu Prize committees.

    Finally, we express our heartfelt gratitude to all of the authors for taking the time to prepare these versions for our collective enlightenment, sharing their knowledge, innovation, and discoveries with the rest of us.

  • 43.
    Conrad, Christian
    et al.
    Goethe University, Frankfurt, Germany.
    Mertz, Matthias
    Goethe University, Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Contour-relaxed Superpixels2013In: / [ed] Heyden, A., Kahl, F., Olsson, C., Oskarsson, M., Tai, X.-C., Springer Berlin/Heidelberg, 2013, p. 280-293Conference paper (Refereed)
    Abstract [en]

    We propose and evaluate a versatile scheme for image pre-segmentation that generates a partition of the image into a selectable number of patches (’superpixels’), under the constraint of obtaining maximum homogeneity of the ’texture’ inside of each patch, and maximum accordance of the contours with both the image content as well as a Gibbs-Markov random field model. In contrast to current state-of-the-art approaches to superpixel segmentation, ’homogeneity’ does not limit itself to smooth region-internal signals and high feature value similarity between neighboring pixels, but is applicable also to highly textured scenes. The energy functional that is to be maximized for this purpose has only a very small number of design parameters, depending on the particular statistical model used for the images.

    The capability of the resulting partitions to deform according to the image content can be controlled by a single parameter. We show by means of an extensive comparative experimental evaluation that the compactness-controlled contour-relaxed superpixels method outperforms state-of-the-art superpixel algorithms with respect to boundary recall and undersegmentation error, while being faster or on a par with respect to runtime.

  • 44.
    Conrad, Christian
    et al.
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    LEARNING RANK REDUCED MAPPINGS USING CANONICAL CORRELATION ANALYSIS2016In: 2016 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), IEEE , 2016Conference paper (Refereed)
    Abstract [en]

    Correspondence relations between different views of the same scene can be learnt in an unsupervised manner. We address autonomous learning of arbitrary fixed spatial (point-to-point) mappings. Since any such transformation can be represented by a permutation matrix, the signal model is a linear one, whereas the proposed analysis method, mainly based on Canonical Correlation Analysis (CCA), relies on a generalized eigensystem problem, i.e., a nonlinear operation. The learnt transformation is represented implicitly in terms of pairs of learned basis vectors and neither uses nor requires an analytic/parametric expression for the latent mapping. We show how the rank of the signal that is shared among views may be determined from canonical correlations and how the overlapping (=shared) dimensions among the views may be inferred.
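
    For reference, CCA itself reduces to a generalized symmetric eigenproblem on the joint covariances; a minimal sketch (textbook CCA with a small regularizer, not the paper's full learning scheme):

    import numpy as np
    from scipy.linalg import eigh

    def cca(X, Y, k=2, reg=1e-6):
        """Top-k canonical correlations and X-side basis for row-sample matrices X, Y."""
        X = X - X.mean(0); Y = Y - Y.mean(0)
        n = len(X)
        Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
        Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
        Cxy = X.T @ Y / n
        # Solve Cxy Cyy^-1 Cyx a = rho^2 Cxx a as a generalized symmetric-definite eigenproblem
        M = Cxy @ np.linalg.solve(Cyy, Cxy.T)
        vals, A = eigh(M, Cxx)
        order = np.argsort(vals)[::-1][:k]
        return np.sqrt(np.clip(vals[order], 0.0, 1.0)), A[:, order]

    X = np.random.randn(500, 10)
    Y = X @ np.random.randn(10, 8) + 0.1 * np.random.randn(500, 8)   # two 'views' sharing structure
    rho, A = cca(X, Y, k=3)        # near-1 correlations reveal the rank of the shared signal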

  • 45.
    Conrad, Christian
    et al.
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    Learning Relative Photometric Differences of Pairs of Cameras2015In: 2015 12TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS), IEEE , 2015Conference paper (Refereed)
    Abstract [en]

    We present an approach to learn relative photometric differences between pairs of cameras which have partially overlapping fields of view. This is an important problem, especially in appearance-based methods for correspondence estimation or object identification in multi-camera systems, where grey values observed by different cameras are processed. We model intensity differences among pairs of cameras by means of a low-order polynomial (Gray Value Transfer Function, GVTF) which represents the characteristic curve of the mapping of grey values s_i produced by camera C_i to the corresponding grey values s_j acquired with camera C_j. While the estimation of the GVTF parameters is straightforward once a set of truly corresponding pairs of grey values is available, the non-trivial task in the GVTF estimation process solved in this paper is the extraction of corresponding grey value pairs in the presence of geometric and photometric errors. We also present a temporal GVTF update scheme to adapt to gradual global illumination changes, e.g., due to the change of daylight.
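
    Once corresponding grey value pairs have been collected, fitting the GVTF itself is a low-order polynomial regression; a hypothetical sketch with a simple outlier-trimmed refit (the pair-extraction stage, which the paper identifies as the hard part, is not reproduced here):

    import numpy as np

    def fit_gvtf(s_i, s_j, degree=2, trim=0.1):
        """Fit s_j ~ p(s_i) as a low-order polynomial, refitting after dropping the worst pairs."""
        p = np.polyfit(s_i, s_j, degree)
        resid = np.abs(np.polyval(p, s_i) - s_j)
        keep = resid <= np.quantile(resid, 1.0 - trim)   # discard likely outlier pairs
        return np.polyfit(s_i[keep], s_j[keep], degree)

    s_i = np.random.rand(500) * 255                      # grey values seen by camera C_i
    s_j = 0.8 * s_i + 10 + 2.0 * np.random.randn(500)    # simulated response of camera C_j
    p = fit_gvtf(s_i, s_j)                               # map new values with np.polyval(p, s)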

  • 46.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Learning Convolution Operators for Visual Tracking2018Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Visual tracking is one of the fundamental problems in computer vision. Its numerous applications include robotics, autonomous driving, augmented reality and 3D reconstruction. In essence, visual tracking can be described as the problem of estimating the trajectory of a target in a sequence of images. The target can be any image region or object of interest. While humans excel at this task, requiring little effort to perform accurate and robust visual tracking, it has proven difficult to automate. It has therefore remained one of the most active research topics in computer vision.

    In its most general form, no prior knowledge about the object of interest or environment is given, except for the initial target location. This general form of tracking is known as generic visual tracking. The unconstrained nature of this problem makes it particularly difficult, yet applicable to a wider range of scenarios. As no prior knowledge is given, the tracker must learn an appearance model of the target on-the-fly. Cast as a machine learning problem, it imposes several major challenges which are addressed in this thesis.

    The main purpose of this thesis is the study and advancement of the so-called Discriminative Correlation Filter (DCF) framework, as it has been shown to be particularly suitable for the tracking application. By utilizing properties of the Fourier transform, a correlation filter is discriminatively learned by efficiently minimizing a least-squares objective. The resulting filter is then applied to a new image in order to estimate the target location.

    This thesis contributes to the advancement of the DCF methodology in several aspects. The main contribution regards the learning of the appearance model: First, the problem of updating the appearance model with new training samples is covered. Efficient update rules and numerical solvers are investigated for this task. Second, the periodic assumption induced by the circular convolution in DCF is countered by proposing a spatial regularization component. Third, an adaptive model of the training set is proposed to alleviate the impact of corrupted or mislabeled training samples. Fourth, a continuous-space formulation of the DCF is introduced, enabling the fusion of multiresolution features and sub-pixel accurate predictions. Finally, the problems of computational complexity and overfitting are addressed by investigating dimensionality reduction techniques.

    As a second contribution, different feature representations for tracking are investigated. A particular focus is put on the analysis of color features, which had been largely overlooked in prior tracking research. This thesis also studies the use of deep features in DCF-based tracking. While many vision problems have greatly benefited from the advent of deep learning, it has proven difficult to harvest the power of such representations for tracking. In this thesis it is shown that both shallow and deep layers contribute positively. Furthermore, the problem of fusing their complementary properties is investigated.

    The final major contribution of this thesis regards the prediction of the target scale. In many applications, it is essential to track the scale, or size, of the target since it is strongly related to the relative distance. A thorough analysis of how to integrate scale estimation into the DCF framework is performed. A one-dimensional scale filter is proposed, enabling efficient and accurate scale estimation.
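
    At its core, the DCF framework described above solves a ridge regression in the Fourier domain; the following single-channel, MOSSE-style sketch shows the closed-form filter and its application (a minimal baseline only; the spatially regularized, continuous and factorized formulations contributed by this thesis, and the papers listed below, build on top of it):

    import numpy as np

    def train_dcf(patch, target, lam=1e-2):
        """Closed-form correlation filter (conjugate spectrum) from one training patch."""
        F = np.fft.fft2(patch)
        G = np.fft.fft2(target)                          # desired response, e.g. a centered Gaussian
        return (G * np.conj(F)) / (F * np.conj(F) + lam) # elementwise ridge regression solution

    def detect(H_conj, patch):
        """Correlation response; its peak gives the estimated target location."""
        return np.real(np.fft.ifft2(np.fft.fft2(patch) * H_conj))

    h = w = 64
    yy, xx = np.mgrid[0:h, 0:w]
    g = np.exp(-((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (2.0 * 2.0 ** 2))  # Gaussian label
    patch = np.random.rand(h, w)                         # training patch centered on the target
    H = train_dcf(patch, g)
    dy, dx = np.unravel_index(detect(H, patch).argmax(), (h, w))   # peak near (32, 32)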

    List of papers
    1. Adaptive Color Attributes for Real-Time Visual Tracking
    2014 (English)In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014, IEEE Computer Society, 2014, p. 1090-1097Conference paper, Published paper (Refereed)
    Abstract [en]

    Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient, and possess a certain amount of photometric invariance while maintaining high discriminative power.

    This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2014
    Series
    IEEE Conference on Computer Vision and Pattern Recognition. Proceedings, ISSN 1063-6919
    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:liu:diva-105857 (URN)10.1109/CVPR.2014.143 (DOI)2-s2.0-84911362613 (Scopus ID)978-147995117-8 (ISBN)
    Conference
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, USA, June 24-27, 2014
    Note

    Publication status: Accepted

    2. Coloring Channel Representations for Visual Tracking
    2015 (English)In: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen, Kim S. Pedersen, Springer, 2015, Vol. 9127, p. 117-129Conference paper, Published paper (Refereed)
    Abstract [en]

    Visual object tracking is a classical, but still open research problem in computer vision, with many real-world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically inspired channel representations have been shown to provide promising results in many applications ranging from autonomous driving to visual tracking.

    This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking.

    Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color features together with an optimal fusion strategy significantly outperforms the standard luminance-based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.

    Place, publisher, year, edition, pages
    Springer, 2015
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9127
    Keywords
    Visual tracking, channel coding, color names
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-121003 (URN)10.1007/978-3-319-19665-7_10 (DOI)978-3-319-19664-0 (ISBN)978-3-319-19665-7 (ISBN)
    Conference
    Scandinavian Conference on Image Analysis
    3. Discriminative Scale Space Tracking
    2017 (English)In: IEEE Transaction on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 39, no 8, p. 1561-1575Article in journal (Refereed) Published
    Abstract [en]

    Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when encountered with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.

    Place, publisher, year, edition, pages
    IEEE COMPUTER SOC, 2017
    Keywords
    Visual tracking; scale estimation; correlation filters
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-139382 (URN)10.1109/TPAMI.2016.2609928 (DOI)000404606300006 ()27654137 (PubMedID)
    Note

    Funding Agencies|Swedish Foundation for Strategic Research; Swedish Research Council; Strategic Vehicle Research and Innovation (FFI); Wallenberg Autonomous Systems Program; National Supercomputer Centre; Nvidia

    4. Learning Spatially Regularized Correlation Filters for Visual Tracking
    2015 (English)In: Proceedings of the International Conference in Computer Vision (ICCV), 2015, IEEE Computer Society, 2015, p. 4310-4318Conference paper, Published paper (Refereed)
    Abstract [en]

    Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model.

    We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2015
    Series
    IEEE International Conference on Computer Vision. Proceedings, ISSN 1550-5499
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-121609 (URN)10.1109/ICCV.2015.490 (DOI)000380414100482 ()978-1-4673-8390-5 (ISBN)
    Conference
    International Conference in Computer Vision (ICCV), Santiago, Chile, December 13-16, 2015
    5. Convolutional Features for Correlation Filter Based Visual Tracking
    2015 (English)In: Proceedings of the IEEE International Conference on Computer Vision, IEEE conference proceedings, 2015, p. 621-629Conference paper, Published paper (Refereed)
    Abstract [en]

    Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features for the visual tracking problem. We propose to use activations from the convolutional layers of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need for task-specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, and in contrast to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.

    Place, publisher, year, edition, pages
    IEEE conference proceedings, 2015
    Series
    IEEE International Conference on Computer Vision. Proceedings, ISSN 1550-5499
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-128869 (URN)10.1109/ICCVW.2015.84 (DOI)000380434700075 ()978-1-4673-9711-7 (ISBN)
    Conference
    15th IEEE International Conference on Computer Vision Workshops, ICCVW 2015
    6. Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
    2016 (English)In: 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CPVR), IEEE , 2016, p. 1430-1438Conference paper, Published paper (Refereed)
    Abstract [en]

    Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.

    Place, publisher, year, edition, pages
    IEEE, 2016
    Series
    IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-137882 (URN)10.1109/CVPR.2016.159 (DOI)000400012301051 ()978-1-4673-8851-1 (ISBN)
    Conference
    29th IEEE Conference on Computer Vision and Pattern Recognition
    Note

    Funding Agencies|SSF (CUAS); VR (EMC2); VR (ELLIIT); Wallenberg Autonomous Systems Program; NSC; Nvidia

    7. Deep motion and appearance cues for visual tracking
    2018 (English)In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344Article in journal (Refereed) Published
    Abstract [en]

    Generic visual tracking is a challenging computer vision problem, with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that motion cue provides discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.

    Place, publisher, year, edition, pages
    Elsevier, 2018
    Keywords
    Visual tracking, Deep learning, Optical flow, Discriminative correlation filters
    National Category
    Computer and Information Sciences
    Identifiers
    urn:nbn:se:liu:diva-148015 (URN); 10.1016/j.patrec.2018.03.009 (DOI); 2-s2.0-85044328745 (Scopus ID)
    Available from: 2018-05-24 Created: 2018-05-24 Last updated: 2018-05-31. Bibliographically approved
    8. Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking
    2016 (English). In: Computer Vision - ECCV 2016, Pt V, Springer International Publishing AG, 2016, Vol. 9909, p. 472-488. Conference paper, Published paper (Refereed).
    Abstract [en]

    Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.
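    The implicit interpolation model referred to above maps each discrete feature channel to a common continuous domain. The operator below is a reconstruction following the published formulation, where x^d is a feature channel with N_d samples, b_d an interpolation kernel, and [0, T) the continuous support:

        % Channel d, sampled at resolution N_d, interpolated to t in [0, T).
        J_d\{x^d\}(t) \;=\; \sum_{n=0}^{N_d - 1} x^d[n]\;
          b_d\!\left(t - \frac{T}{N_d}\, n\right)

    Because every channel lives on the same continuous domain after interpolation, feature maps of different resolutions can be combined in a single convolution filter, which is what enables the multi-resolution deep feature integration described above.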

    Place, publisher, year, edition, pages
    Springer International Publishing AG, 2016
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-133550 (URN); 10.1007/978-3-319-46454-1_29 (DOI); 000389385400029 (); 978-3-319-46453-4 (ISBN)
    Conference
    14th European Conference on Computer Vision (ECCV)
    Available from: 2016-12-30 Created: 2016-12-29 Last updated: 2018-10-17
    9. ECO: Efficient Convolution Operators for Tracking
    2017 (English). In: 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), IEEE, 2017, p. 6931-6939. Conference paper, Published paper (Refereed).
    Abstract [en]

    In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state of the art in tracking. However, in the pursuit of ever-increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top-ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.
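    Contribution (i) can be sketched as a learned linear projection of the D feature channels onto C << D filters. The NumPy illustration below operates in the spatial domain for readability; the projection matrix P and all names are assumptions, and the actual tracker works in the Fourier domain:

        import numpy as np
        from scipy.signal import fftconvolve

        def factorized_response(x, P, f):
            """x: H x W x D features, P: D x C projection, f: H x W x C filters.
            Learning C filters plus P, instead of D full-resolution filters,
            cuts the trainable parameters roughly by a factor of D / C."""
            z = np.tensordot(x, P, axes=([2], [0]))  # project to H x W x C
            score = np.zeros(x.shape[:2])
            for c in range(P.shape[1]):
                score += fftconvolve(z[..., c], f[..., c], mode="same")
            return score  # detection score map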

    Place, publisher, year, edition, pages
    IEEE, 2017
    Series
    IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-144284 (URN); 10.1109/CVPR.2017.733 (DOI); 000418371407004 (); 978-1-5386-0457-1 (ISBN)
    Conference
    30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
    Note

    Funding Agencies: SSF (SymbiCloud); VR (EMC2) [2016-05543]; SNIC; WASP; Visual Sweden; Nvidia

    Available from: 2018-01-12 Created: 2018-01-12 Last updated: 2018-10-17
  • 47.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Visual Tracking, 2013. Independent thesis, Advanced level (degree of Master (Two Years)), 300 HE credits. Student thesis.
    Abstract [en]

    Visual tracking is a classical computer vision problem with many important applications in areas such as robotics, surveillance and driver assistance. The task is to follow a target in an image sequence. The target can be any object of interest, for example a human, a car or a football. Humans perform accurate visual tracking with little effort, while it remains a difficult computer vision problem. It poses major challenges, such as appearance changes, occlusions and background clutter. Visual tracking is thus an open research topic, but significant progress has been made in the last few years.

    The first part of this thesis explores generic tracking, where nothing is known about the target except for its initial location in the sequence. A specific family of generic trackers that exploit the FFT for faster tracking-by-detection is studied. Among these, the CSK tracker has recently been shown to obtain competitive performance at extraordinarily low computational cost. Three contributions are made to this type of tracker. Firstly, a new method for learning the target appearance is proposed and shown to outperform the original method. Secondly, different color descriptors are investigated for the tracking purpose. Evaluations show that the best descriptor greatly improves the tracking performance. Thirdly, an adaptive dimensionality reduction technique is proposed, which adaptively chooses the most important feature combinations to use. This technique significantly reduces the computational cost of the tracking task. Extensive evaluations show that the proposed tracker outperforms state-of-the-art methods in the literature, while operating at a frame rate several times higher.
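    The FFT-based tracking-by-detection idea studied in this part has a compact closed form. The single-channel MOSSE/CSK-style sketch below is a generic textbook version, not the thesis's exact learning method:

        import numpy as np

        def train_filter(patch, sigma=2.0, lam=1e-2):
            """Closed-form correlation filter in the Fourier domain:
            H = (G * conj(F)) / (F * conj(F) + lam), where the desired
            response G is a centered Gaussian, so every cyclic shift of the
            grayscale patch implicitly acts as a training sample."""
            h, w = patch.shape
            ys, xs = np.mgrid[0:h, 0:w]
            g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
            F, G = np.fft.fft2(patch), np.fft.fft2(g)
            return (G * np.conj(F)) / (F * np.conj(F) + lam)

        def detect(H, patch):
            """Peak of the filter response gives the estimated translation."""
            resp = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
            return np.unravel_index(np.argmax(resp), resp.shape)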

    In the second part of this thesis, the proposed generic tracking method is applied to human tracking in surveillance applications. A causal framework is constructed that automatically detects and tracks humans in the scene. The system fuses information from generic tracking and state-of-the-art object detection in a Bayesian filtering framework. In addition, the system incorporates the identification and tracking of specific human parts to achieve better robustness and performance. Tracking results are demonstrated on a real-world benchmark sequence.
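    The Bayesian fusion of tracker and detector outputs can be illustrated with sequential Kalman-style updates of a position estimate, where the more confident (lower-variance) measurement pulls the estimate harder. This is a generic sketch under simplified scalar covariances, not the framework's exact model:

        import numpy as np

        def fuse_measurements(prior_mean, prior_var, z_track, var_track, z_det, var_det):
            """Update a 2-D position estimate with a generic-tracker measurement
            and an object-detector measurement, one after the other."""
            mean, var = np.asarray(prior_mean, dtype=float), float(prior_var)
            for z, r in ((z_track, var_track), (z_det, var_det)):
                k = var / (var + r)                    # Kalman gain
                mean = mean + k * (np.asarray(z) - mean)
                var = (1.0 - k) * var
            return mean, var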

  • 48.
    Danelljan, Martin
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bhat, Goutam
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Gladh, Susanna
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Deep motion and appearance cues for visual tracking, 2018. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344. Article in journal (Refereed).
    Abstract [en]

    Generic visual tracking is a challenging computer vision problem with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in the case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that the motion cue provides discriminative and complementary information that can improve tracking performance. In contrast to visual tracking, deep motion features have been successfully applied to action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.

  • 49.
    Danelljan, Martin
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bhat, Goutam
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    ECO: Efficient Convolution Operators for Tracking, 2017. In: 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), IEEE, 2017, p. 6931-6939. Conference paper (Refereed).
    Abstract [en]

    In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state of the art in tracking. However, in the pursuit of ever-increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top-ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.

  • 50.
    Danelljan, Martin
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Accurate Scale Estimation for Robust Visual Tracking, 2014. In: Proceedings of the British Machine Vision Conference 2014 / [ed] Michel Valstar, Andrew French and Tony Pridmore, BMVA Press, 2014. Conference paper (Refereed).
    Abstract [en]

    Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel approach for robust scale estimation in a tracking-by-detection framework. The proposed approach works by learning discriminative correlation filters based on a scale pyramid representation. We learn separate filters for translation and scale estimation, and show that this improves the performance compared to an exhaustive scale search. Our scale estimation approach is generic as it can be incorporated into any tracking method with no inherent scale estimation.

    Experiments are performed on 28 benchmark sequences with significant scale variations. Our results show that the proposed approach significantly improves the performance by 18.8% in median distance precision compared to our baseline. Finally, we provide both quantitative and qualitative comparisons of our approach with state-of-the-art trackers in the literature. The proposed method is shown to outperform the best existing tracker by 16.6% in median distance precision, while operating in real time.
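    The scale pyramid representation can be sketched as sampling patches at geometrically spaced scale factors around the current target size and resizing them to a common template, so that a filter trained over this stack estimates scale independently of translation. In the sketch below the 33 scales and 1.02 step are plausible defaults assumed for illustration, not quoted from the paper:

        import numpy as np
        from scipy.ndimage import zoom

        def scale_pyramid(image, center, base_hw, n_scales=33, step=1.02):
            """Extract patches of size step**k * base_hw around `center` in a
            grayscale image and resize each back to base_hw; a 1-D correlation
            filter over the resulting stack yields the scale estimate."""
            cy, cx = center
            factors = step ** (np.arange(n_scales) - n_scales // 2)
            stack = []
            for s in factors:
                h, w = max(int(base_hw[0] * s), 2), max(int(base_hw[1] * s), 2)
                y0, x0 = max(cy - h // 2, 0), max(cx - w // 2, 0)
                patch = image[y0:y0 + h, x0:x0 + w]
                stack.append(zoom(patch, (base_hw[0] / patch.shape[0],
                                          base_hw[1] / patch.shape[1]), order=1))
            return factors, np.stack(stack)  # scale factors and (S, H, W) stack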
