liu.seSearch for publications in DiVA
  • 1.
    Abdella, Juhar Ahmed
    et al.
    UAEU, U Arab Emirates.
    Zaki, N. M.
    UAEU, U Arab Emirates.
    Shuaib, Khaled
    UAEU, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Airline ticket price and demand prediction: A survey. 2021. In: Journal of King Saud University - Computer and Information Sciences, ISSN 1319-1578, Vol. 33, no 4, p. 375-391. Article in journal (Refereed)
    Abstract [en]

    Nowadays, airline ticket prices can vary dynamically and significantly for the same flight, even for nearby seats within the same cabin. Customers seek the lowest price, while airlines try to keep their overall revenue as high as possible and maximize their profit. Airlines use various computational techniques to increase their revenue, such as demand prediction and price discrimination. On the customer side, researchers have proposed two kinds of models to save customers money: models that predict the optimal time to buy a ticket and models that predict the minimum ticket price. In this paper, we present a review of customer-side and airline-side prediction models. Our review analysis shows that models on both sides rely on a limited set of features such as historical ticket price data, ticket purchase date and departure date. Features extracted from external factors such as social media data and search engine queries are not considered. Therefore, we introduce and discuss the concept of using social media data for ticket/demand prediction. (c) 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

  • 2.
    Acsintoae, Andra
    et al.
    Univ Bucharest, Romania.
    Florescu, Andrei
    Univ Bucharest, Romania.
    Georgescu, Mariana-Iuliana
    Univ Bucharest, Romania; MBZ Univ Artificial Intelligence, U Arab Emirates; SecurifAI, Romania.
    Mare, Tudor
    SecurifAI, Romania.
    Sumedrea, Paul
    Univ Bucharest, Romania.
    Ionescu, Radu Tudor
    Univ Bucharest, Romania; SecurifAI, Romania.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZ Univ Artificial Intelligence, U Arab Emirates.
    Shah, Mubarak
    Univ Cent Florida, FL 32816 USA.
    UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection. 2022. In: 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), IEEE COMPUTER SOC, 2022, p. 20111-20121. Conference paper (Refereed)
    Abstract [en]

    Detecting abnormal events in video is commonly framed as a one-class classification task, where training videos contain only normal events, while test videos encompass both normal and abnormal events. In this scenario, anomaly detection is an open-set problem. However, some studies assimilate anomaly detection to action recognition. This is a closed-set scenario that fails to test the capability of systems at detecting new anomaly types. To this end, we propose UBnormal, a new supervised open-set benchmark composed of multiple virtual scenes for video anomaly detection. Unlike existing data sets, we introduce abnormal events annotated at the pixel level at training time, for the first time enabling the use of fully-supervised learning methods for abnormal event detection. To preserve the typical open-set formulation, we make sure to include disjoint sets of anomaly types in our training and test collections of videos. To our knowledge, UBnormal is the first video anomaly detection benchmark to allow a fair head-to-head comparison between one-class open-set models and supervised closed-set models, as shown in our experiments. Moreover, we provide empirical evidence showing that UBnormal can enhance the performance of a state-of-the-art anomaly detection framework on two prominent data sets, Avenue and ShanghaiTech. Our benchmark is freely available at https://github.com/lilygeorgescu/UBnormal.

  • 3.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arsic, Dejan
    Munich University of Technology, Germany.
    Ganchev, Todor
    University of Patras, Greece.
    Linderhed, Anna
    FOI Swedish Defence Research Agency.
    Menezes, Paolo
    University of Coimbra, Portugal.
    Ntalampiras, Stavros
    University of Patras, Greece.
    Olma, Tadeusz
    MARAC S.A., Greece.
    Potamitis, Ilyas
    Technological Educational Institute of Crete, Greece.
    Ros, Julien
    Probayes SAS, France.
    Prometheus: Prediction and interpretation of human behaviour based on probabilistic structures and heterogeneous sensors. 2008. Conference paper (Refereed)
    Abstract [en]

    The on-going EU funded project Prometheus (FP7-214901) aims at establishing a general framework which links fundamental sensing tasks to automated cognition processes enabling interpretation and short-term prediction of individual and collective human behaviours in unrestricted environments as well as complex human interactions. To achieve the aforementioned goals, the Prometheus consortium works on the following core scientific and technological objectives:

    1. sensor modeling and information fusion from multiple, heterogeneous perceptual modalities;

    2. modeling, localization, and tracking of multiple people;

    3. modeling, recognition, and short-term prediction of continuous complex human behavior.

  • 4.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Evaluating Template Rescaling in Short-Term Single-Object Tracking. 2015. Conference paper (Refereed)
    Abstract [en]

    In recent years, short-term single-object tracking has emerged as a popular research topic, as it constitutes the core of more general tracking systems. Many such tracking methods are based on matching a part of the image with a template that is learnt online and represented by, for example, a correlation filter or a distribution field. In order for such a tracker to be able to not only find the position, but also the scale, of the tracked object in the next frame, some kind of scale estimation step is needed. This step is sometimes separate from the position estimation step, but is nevertheless jointly evaluated in de facto benchmarks. However, for practical as well as scientific reasons, the scale estimation step should be evaluated separately – for example, there might in certain situations be other methods more suitable for the task. In this paper, we describe an evaluation method for scale estimation in template-based short-term single-object tracking, and evaluate two state-of-the-art tracking methods where estimation of scale and position are separable.

  • 5.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Glana Sensors AB, Sweden.
    Renhorn, Ingmar
    Glana Sensors AB, Sweden.
    Chevalier, Tomas
    Scienvisic AB, Sweden.
    Rydell, Joakim
    FOI, Swedish Defence Research Agency, Sweden.
    Bergström, David
    FOI, Swedish Defence Research Agency, Sweden.
    Three-dimensional hyperspectral imaging technique. 2017. In: ALGORITHMS AND TECHNOLOGIES FOR MULTISPECTRAL, HYPERSPECTRAL, AND ULTRASPECTRAL IMAGERY XXIII / [ed] Miguel Velez-Reyes; David W. Messinger, SPIE - International Society for Optical Engineering, 2017, Vol. 10198, article id 1019805. Conference paper (Refereed)
    Abstract [en]

    Hyperspectral remote sensing based on unmanned airborne vehicles is a field increasing in importance. The combined functionality of simultaneous hyperspectral and geometric modeling is less developed. A configuration has been developed that enables the reconstruction of the hyperspectral three-dimensional (3D) environment. The hyperspectral camera is based on a linear variable filter and a high frame rate, high resolution camera enabling point-to-point matching and 3D reconstruction. This allows the information to be combined into a single and complete 3D hyperspectral model. In this paper, we describe the camera and illustrate capabilities and difficulties through real-world experiments.

  • 6.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Åstrom, Anders
    Swedish Natl Forens Ctr NFC, Linkoping, Sweden.
    Forchheimer, Robert
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Simultaneous sensing, readout, and classification on an intensity-ranking image sensor. 2018. In: International journal of circuit theory and applications, ISSN 0098-9886, E-ISSN 1097-007X, Vol. 46, no 9, p. 1606-1619. Article in journal (Refereed)
    Abstract [en]

    We combine the near-sensor image processing concept with address-event representation leading to an intensity-ranking image sensor (IRIS) and show the benefits of using this type of sensor for image classification. The functionality of IRIS is to output pixel coordinates (X and Y values) continuously as each pixel has collected a certain number of photons. Thus, the pixel outputs will be automatically intensity ranked. By keeping track of the timing of these events, it is possible to record the full dynamic range of the image. However, in many cases, this is not necessary: the intensity ranking in itself gives the needed information for the task at hand. This paper describes techniques for classification and proposes a particular variant (groves) that fits the IRIS architecture well as it can work on the intensity rankings only. Simulation results using the CIFAR-10 dataset compare the results of the proposed method with the more conventional ferns technique. It is concluded that the simultaneous sensing and classification obtainable with the IRIS sensor yields both fast (shorter than full exposure time) and processing-efficient classification.
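
    The readout order described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the sensor's actual circuitry: the function name `intensity_ranking`, the per-pixel `flux` values, and the `threshold` photon count are all hypothetical, and each pixel is assumed to collect photons at a constant rate.

```python
def intensity_ranking(image, threshold=1000.0):
    """Return pixel coordinates (x, y) in the order they would fire.

    Assumption: each pixel collects photons at a constant rate `flux`
    (the value stored in `image`), so it reaches `threshold` photons
    after threshold / flux time units. Brighter pixels fire earlier.
    """
    events = []
    for y, row in enumerate(image):
        for x, flux in enumerate(row):
            if flux > 0:  # a completely dark pixel never fires
                events.append((threshold / flux, x, y))
    events.sort()  # earliest firing time first
    return [(x, y) for _, x, y in events]


# Hypothetical 2x2 image of photon rates (rows are y, columns are x).
ranking = intensity_ranking([[10, 40],
                             [20, 5]])
```

    The returned list contains only coordinates, which mirrors the point made in the abstract that the ranking alone, without event timing, can be enough for classification.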

  • 7.
    Ahlman, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improved Temporal Resolution Using Parallel Imaging in Radial-Cartesian 3D functional MRI. 2011. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    MRI (Magnetic Resonance Imaging) is a medical imaging method that uses magnetic fields in order to retrieve images of the human body. This thesis revolves around a novel acquisition method of 3D fMRI (functional Magnetic Resonance Imaging) called PRESTO-CAN that uses a radial pattern in order to sample the (kx,kz)-plane of k-space (the frequency domain), and a Cartesian sample pattern in the ky-direction. The radial sample pattern allows for a denser sampling of the central parts of k-space, which contain the most basic frequency information about the structure of the recorded object. This allows for higher temporal resolution to be achieved compared with other sampling methods, since fewer samples in total are needed in order to retrieve enough information about how the object has changed over time. Since fMRI is mainly used for monitoring blood flow in the brain, increased temporal resolution means that fast changes in brain activity can be tracked more efficiently.

    The temporal resolution can be further improved by reducing the time needed for scanning, which in turn can be achieved by applying parallel imaging. One such parallel imaging method is SENSE (SENSitivity Encoding). The scan time is reduced by decreasing the sampling density, which causes aliasing in the recorded images. The aliasing is removed by the SENSE method by utilizing the extra information provided by the fact that multiple receiver coils with differing sensitivities are used during the acquisition. By measuring the sensitivities of the respective receiver coils and solving an equation system with the aliased images, it is possible to calculate what they would have looked like without aliasing.

    In this master thesis, SENSE has been successfully implemented in PRESTO-CAN. By using normalized convolution in order to refine the sensitivity maps of the receiver coils, images of satisfactory quality could be reconstructed when reducing the k-space sample rate by a factor of 2, and images of relatively good quality also when the sample rate was reduced by a factor of 4. In this way, this thesis has been able to contribute to the improvement of the temporal resolution of the PRESTO-CAN method.
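
    The equation system mentioned in this abstract can be sketched for the simplest textbook case. The sketch assumes Cartesian undersampling by a factor of 2 and exactly two receiver coils, so that each aliased pixel is a sensitivity-weighted sum of two true pixels; it is not the thesis's radial-Cartesian PRESTO-CAN implementation, and the function name and all numeric values are hypothetical.

```python
def unfold_pixel_pair(aliased, sens):
    """Recover two true pixel values from two aliased coil measurements.

    aliased: (a1, a2), the aliased pixel value seen by coil 1 and coil 2.
    sens:    ((s11, s12), (s21, s22)), where s[c][p] is the sensitivity
             of coil c at true pixel position p.
    Model (assumed): a_c = s_c1 * rho1 + s_c2 * rho2 for each coil c.
    The 2x2 system is solved with Cramer's rule.
    """
    (s11, s12), (s21, s22) = sens
    a1, a2 = aliased
    det = s11 * s22 - s12 * s21
    if abs(det) < 1e-12:
        # The coils see both positions identically: aliasing cannot be undone.
        raise ValueError("coil sensitivities are not distinct enough")
    rho1 = (a1 * s22 - s12 * a2) / det
    rho2 = (s11 * a2 - s21 * a1) / det
    return rho1, rho2


# Hypothetical complex pixel values and coil sensitivities.
true_rho = (2 + 1j, -1 + 0.5j)
sens = ((1.0, 0.3), (0.2, 0.9))
aliased = tuple(s[0] * true_rho[0] + s[1] * true_rho[1] for s in sens)
recovered = unfold_pixel_pair(aliased, sens)
```

    With more coils than overlapping pixels, the same system becomes overdetermined and is usually solved in a least-squares sense. The accuracy of the result depends directly on the measured sensitivity maps, which is why the abstract reports refining them.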

  • 8.
    Ahlman, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Magnusson, Maria
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Dahlqvist Leinhard, Olof
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Faculty of Health Sciences.
    Lundberg, Peter
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Department of Medical and Health Sciences, Radiology. Linköping University, Faculty of Health Sciences. Östergötlands Läns Landsting, Center for Surgery, Orthopaedics and Cancer Treatment, Department of Radiation Physics. Östergötlands Läns Landsting, Center for Diagnostics, Department of Radiology in Linköping.
    Increased temporal resolution in radial-Cartesian sampling of k-space by implementation of parallel imaging. 2011. Conference paper (Refereed)
  • 9.
    Ahlqvist, Axel
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Examining Difficulties in Weed Detection. 2022. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Automatic detection of weeds could be used for more efficient weed control in agriculture. In this master thesis, weed detectors have been trained and examined on data collected by RISE to investigate whether an accurate weed detector could be trained on the collected data. When only using annotations of the weed class Creeping thistle for training and evaluation, a detector achieved a mAP of 0.33. When using four classes of weed, a detector was trained with a mAP of 0.07. The performance was worse than in a previous study also dealing with weed detection. Hypotheses for why the performance was lacking were examined. Experiments indicated that the problem could not fully be explained by the model being underfitted, nor by the objects' backgrounds being too similar to the foreground, nor by the quality of the annotations being too low. The performance was better when training the model with as much data as possible than when only selected segments of the data were used.

  • 10.
    Almin, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Detection of Non-Ferrous Materials with Computer Vision. 2020. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In one of the facilities at the Stena Recycling plant in Halmstad, Sweden, about 300 tonnes of metallic waste is processed each day with the aim of sorting out all non-ferrous material. At the end of this process, non-ferrous materials are manually sorted out from the ferrous materials. This thesis investigates a computer vision based approach to identify and localize the non-ferrous materials and eventually automate the sorting.

    Images were captured of ferrous and non-ferrous materials. The images are processed and segmented to be used as annotation data for a deep convolutional neural segmentation network. Network models have been trained on different kinds and amounts of data. The resulting models are evaluated and tested in accordance with different evaluation metrics. Methods of creating advanced training data by merging imaging information were tested. Experiments with using classifier prediction confidence to identify objects of unknown classes were performed.

    This thesis shows that it is possible to discern ferrous from non-ferrous material with a purely vision based system. The thesis also shows that it is possible to automatically create annotated training data. It becomes evident that it is possible to create better training data, tailored for the task at hand, by merging image data. A segmentation network trained on more than two classes yields lower prediction confidence for objects unknown to the classifier.

    Substituting manual sorting with a purely vision based system seems like a viable approach. Before a substitution is considered, the automatic system needs to be evaluated in comparison to the manual sorting.

  • 11.
    Andersson, Elin
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Thermal Impact of a Calibrated Stereo Camera Rig. 2016. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Measurements from stereo reconstruction can be obtained with high accuracy when the cameras are correctly calibrated. A stereo camera rig mounted in an outdoor environment is exposed to temperature changes, which have an impact on the calibration of the cameras.

    The aim of the master thesis was to investigate the thermal impact on a calibrated stereo camera rig. This was performed by placing a stereo rig in a temperature chamber and collecting data of a calibration board at different temperatures. Data was collected with two different cameras and lenses and used for calibration of the stereo camera rig for different scenarios. The obtained parameters were plotted and analyzed.

    The results of the master thesis show that thermal variation has an impact on the accuracy of the calibrated stereo camera rig. A calibration obtained at one temperature cannot be used at a different temperature without a degradation of the accuracy. The plotted parameters from the calibration had a high noise level due to problems with the calibration methods, and no visible trend from temperature changes could be seen.

  • 12.
    Andersson, Maria
    et al.
    FOI Swedish Defence Research Agency.
    Rydell, Joakim
    FOI Swedish Defence Research Agency.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. FOI Swedish Defence Research Agency.
    Estimation of crowd behaviour using sensor networks and sensor fusion. 2009. Conference paper (Refereed)
    Abstract [en]

    Commonly, surveillance operators are today monitoring a large number of CCTV screens, trying to solve the complex cognitive tasks of analyzing crowd behavior and detecting threats and other abnormal behavior. Information overload is a rule rather than an exception. Moreover, CCTV footage lacks important indicators revealing certain threats, and can also in other respects be complemented by data from other sensors. This article presents an approach to automatically interpret sensor data and estimate behaviors of groups of people in order to provide the operator with relevant warnings. We use data from distributed heterogeneous sensors (visual cameras and a thermal infrared camera), and process the sensor data using detection algorithms. The extracted features are fed into a hidden Markov model in order to model normal behavior and detect deviations. We also discuss the use of radars for weapon detection.

  • 13.
    Andersson, Viktor
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Semantic Segmentation: Using Convolutional Neural Networks and Sparse dictionaries. 2017. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The two main bottlenecks using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network. This thesis introduces the usage of sparse dictionaries. A sparse dictionary optimized on domain specific data can be seen as a set of intelligent feature extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers in the neural network. How do they affect the training time and final performance?

    The dataset used here is the Cityscapes-dataset which is a library of 25000 labeled road scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually. One of the architectures is much deeper than the other. The results have been presented for both networks. The results show that filter initialization is an important aspect which should be taken into consideration while training the deep networks for semantic segmentation.

  • 14.
    Antonsson, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Jesper
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Measuring Respiratory Frequency Using Optronics and Computer Vision. 2021. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This thesis investigates the development and use of software to measure respiratory frequency in cows using optronics and computer vision. It mainly examines two different strategies of image and signal processing and their performance for different input qualities. The effect of heat stress on dairy cows and the high transmission risk of pneumonia for calves make the investigation done in this thesis highly relevant, since both have the same symptom: increased respiratory frequency. The data set used in this thesis consisted of recordings of dairy cows in different environments and from varying angles. Recordings where the authors could determine a true breathing frequency by monitoring body movements were accepted into the data set and used to test and develop the algorithms. One method developed in this thesis estimated the breathing rate in the frequency domain by Fast Fourier Transform and was named "N-point Fast Fourier Transform." The other method was called "Breathing Movement Zero-Crossing Counting." It estimated a signal in the time domain, whose fundamental frequency was determined by a zero-crossing algorithm as the breathing frequency. The results showed that both developed algorithms successfully estimated a breathing frequency with a reasonable error margin for most of the data set. The zero-crossing algorithm showed the most consistent result, with an error margin lower than 0.92 breaths per minute (BPM) for twelve of thirteen recordings. However, it is limited to recordings where the camera is placed above the cow. The N-point FFT algorithm estimated the breathing frequency with error margins between 0.44 and 5.20 BPM for the same recordings as the zero-crossing algorithm. This method is not limited to a specific camera angle but requires the cow to be relatively stationary to get accurate results. Therefore, it could be evaluated with the remaining three recordings of the data set. The error margins for these recordings were measured between 1.92 and 10.88 BPM. Both methods had execution times acceptable for real-time implementation. The data set was, however, too incomplete to determine performance with recordings from different optronic devices.
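
    The two estimation strategies named in this abstract can be sketched on a one-dimensional movement signal. The sketch makes several assumptions that are not from the thesis: the signal has already been extracted from the video, the sampling rate `fs` is known, the frequency search is restricted to a hypothetical 6 to 90 BPM band, and a direct zero-padded DFT is evaluated in place of an FFT to keep the example dependency-free. All function names are illustrative.

```python
import cmath
import math


def _remove_mean(signal):
    mean = sum(signal) / len(signal)
    return [s - mean for s in signal]


def zero_crossing_bpm(signal, fs):
    """Breaths per minute from the number of upward zero crossings."""
    centered = _remove_mean(signal)
    rising = sum(1 for a, b in zip(centered, centered[1:]) if a < 0 <= b)
    duration_min = len(signal) / fs / 60.0
    return rising / duration_min


def dft_peak_bpm(signal, fs, n_points=1024, band_bpm=(6.0, 90.0)):
    """Breaths per minute from the strongest bin of an N-point DFT.

    The signal is implicitly zero-padded to n_points samples, which
    refines the frequency grid to fs / n_points Hz per bin.
    """
    if len(signal) > n_points:
        raise ValueError("n_points must be at least len(signal)")
    centered = _remove_mean(signal)
    k_lo = max(1, math.ceil(band_bpm[0] / 60.0 * n_points / fs))
    k_hi = min(n_points // 2, math.floor(band_bpm[1] / 60.0 * n_points / fs))
    best_k, best_mag = k_lo, -1.0
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n_points
        mag = abs(sum(x * cmath.exp(w * n) for n, x in enumerate(centered)))
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * fs / n_points * 60.0


# Synthetic test signal: 0.5 Hz (30 BPM) sampled at 10 Hz for 60 s.
fs = 10.0
signal = [math.sin(2 * math.pi * 0.5 * n / fs - 0.3) for n in range(600)]
zc_estimate = zero_crossing_bpm(signal, fs)
dft_estimate = dft_peak_bpm(signal, fs)
```

    On a clean sinusoid both estimates land close to 30 BPM. On real recordings, noise adds spurious zero crossings and subject movement spreads the spectral peak, which is consistent with the limitations the abstract reports for each method.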

  • 15.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Two-Stream Part-based Deep Representation for Human Attribute Recognition. 2018. In: 2018 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), IEEE, 2018, p. 90-97. Conference paper (Refereed)
    Abstract [en]

    Recognizing human attributes in unconstrained environments is a challenging computer vision problem. State-of-the-art approaches to human attribute recognition are based on convolutional neural networks (CNNs). The de facto practice when training these CNNs on a large labeled image dataset is to take RGB pixel values of an image as input to the network. In this work, we propose a two-stream part-based deep representation for human attribute classification. Besides the standard RGB stream, we train a deep network by using mapped coded images with explicit texture information, that complements the standard RGB deep model. To integrate human body parts knowledge, we employ the deformable part-based models together with our two-stream deep model. Experiments are performed on the challenging Human Attributes (HAT-27) Dataset consisting of 27 different human attributes. Our results clearly show that (a) the two-stream deep network provides consistent gain in performance over the standard RGB model and (b) that the attribute classification results are further improved with our two-stream part-based deep representations, leading to state-of-the-art results.

  • 16.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland; Incept Inst Artificial Intelligence, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Incept Inst Artificial Intelligence, U Arab Emirates.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Zaki, Nazar
    United Arab Emirates Univ, U Arab Emirates.
    Multi-stream Convolutional Networks for Indoor Scene Recognition. 2019. In: COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I, SPRINGER INTERNATIONAL PUBLISHING AG, 2019, Vol. 11678, p. 196-208. Conference paper (Refereed)
    Abstract [en]

    Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice employed by state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (ImageNet or Places). Here, we investigate CNN architectures by augmenting RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit the recent advancements in the field of depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network by achieving an absolute gain of 9.3%, 4.7%, 7.3% on the MIT-67, OCIS and SUN-397 datasets respectively.

  • 17.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Molinier, Matthieu
    VTT Tech Res Ctr Finland Ltd, Finland.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification. 2018. In: ISPRS journal of photogrammetry and remote sensing (Print), ISSN 0924-2716, E-ISSN 1872-8235, Vol. 138, p. 74-85. Article in journal (Refereed)
    Abstract [en]

    Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification. 
(C) 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

  • 18.
    Ardeshiri, Tohid
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Larsson, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Gustafsson, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Schön, Thomas B.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Bicycle Tracking Using Ellipse Extraction. 2011. In: Proceedings of the 14th International Conference on Information Fusion, 2011, IEEE, 2011, p. 1-8. Conference paper (Refereed)
    Abstract [en]

    A new approach to tracking bicycles from imagery sensor data is proposed. It is based on detecting ellipsoids in the images and treating these pairwise using a dynamic bicycle model. One important application area is in automotive collision avoidance systems, where no dedicated systems for bicyclists yet exist and where very few theoretical studies have been published.

    Possible conflicts can be predicted from the position and velocity state in the model, but also from the steering wheel articulation and roll angle that indicate yaw changes before the velocity vector changes. An algorithm is proposed which consists of an ellipsoid detection and estimation algorithm and a particle filter.

    A simulation study of three critical single target scenarios is presented, and the algorithm is shown to produce excellent state estimates. An experiment using a stationary camera and the particle filter for state estimation is performed and has shown encouraging results.

  • 19.
    Baravdish, George
    et al.
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Svensson, Olof
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Åström, Freddie
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    On Backward p(x)-Parabolic Equations for Image Enhancement. 2015. In: Numerical Functional Analysis and Optimization, ISSN 0163-0563, E-ISSN 1532-2467, Vol. 36, no 2, p. 147-168. Article in journal (Refereed)
    Abstract [en]

    In this study, we investigate the backward p(x)-parabolic equation as a new methodology to enhance images. We propose a novel iterative regularization procedure for the backward p(x)-parabolic equation based on the nonlinear Landweber method for inverse problems. The proposed scheme can also be extended to the family of iterative regularization methods involving the nonlinear Landweber method. We also investigate the connection between the variable exponent p(x) in the proposed energy functional and the diffusivity function in the corresponding Euler-Lagrange equation. It is well known that the forward problem converges to a constant solution, destroying the image. The purpose of the backward-problem approach is twofold. First, by solving the backward problem through a sequence of forward problems, we obtain a smooth image which is denoised. Second, by choosing the initial data properly, we try to reduce the blurriness of the image. Preliminary numerical results for denoising appear to show an improvement over standard methods.

  • 20.
    Barbalau, Antonio
    et al.
    Univ Bucharest, Romania.
    Ionescu, Radu Tudor
    Univ Bucharest, Romania; SecurifAI, Romania; MBZ Univ Artificial Intelligence, U Arab Emirates.
    Georgescu, Mariana-Iuliana
    Univ Bucharest, Romania; SecurifAI, Romania.
    Dueholm, Jacob
    Aalborg Univ, Denmark; Milestone Syst, Denmark.
    Ramachandra, Bharathkumar
    Geopipe Inc, NY 10019 USA.
    Nasrollahi, Kamal
    Aalborg Univ, Denmark; Milestone Syst, Denmark.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZ Univ Artificial Intelligence, U Arab Emirates.
    Moeslund, Thomas B.
    Aalborg Univ, Denmark.
    Shah, Mubarak
    Univ Cent Florida, FL 32816 USA.
    SSMTL++: Revisiting self-supervised multi-task learning for video anomaly detection. 2023. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 229, article id 103656. Article in journal (Refereed)
    Abstract [en]

    A self-supervised multi-task learning (SSMTL) framework for video anomaly detection was recently introduced in the literature. Due to its highly accurate results, the method attracted the attention of many researchers. In this work, we revisit the self-supervised multi-task learning framework, proposing several updates to the original method. First, we study various detection methods, e.g. based on detecting high-motion regions using optical flow or background subtraction, since we believe the currently used pre-trained YOLOv3 is suboptimal, e.g. objects in motion or objects from unknown classes are never detected. Second, we modernize the 3D convolutional backbone by introducing multi-head self-attention modules, inspired by the recent success of vision transformers. As such, we alternatively introduce both 2D and 3D convolutional vision transformer (CvT) blocks. Third, in our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps through knowledge distillation, solving jigsaw puzzles, estimating body pose through knowledge distillation, predicting masked regions (inpainting), and adversarial learning with pseudo-anomalies. We conduct experiments to assess the performance impact of the introduced changes. Upon finding more promising configurations of the framework, dubbed SSMTL++v1 and SSMTL++v2, we extend our preliminary experiments to more data sets, demonstrating that our performance gains are consistent across all data sets. In most cases, our results on Avenue, ShanghaiTech and UBnormal raise the state-of-the-art performance bar to a new level.

  • 21.
    Barberi, Emmanuele
    et al.
    Università degli Studi di Messina, Italy.
    Cucinotta, Filippo
    Università degli Studi di Messina, Italy.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Raffaele, Marcello
    Università degli Studi di Messina, Italy.
    Salmeri, Fabio
    Università degli Studi di Messina, Italy.
    A differential entropy-based method for reverse engineering quality assessment. 2023. Conference paper (Other academic)
    Abstract [en]

    The present work proposes the use of the differential entropy of point clouds as a method for reverse engineering quality assessment. This quality assessment can be used to measure the deviation of objects made with additive manufacturing or CNC techniques. The quality of the execution is intended as a measure of the deviation of the geometry of the obtained object compared to the original CAD. This paper proposes the use of the quality index of the CorAl method to assess the quality of an object compared to its original CAD. This index, based on the differential entropy, takes a value that is closer to 0 the closer the obtained object is to the original geometry. The advantage of this method is that it provides a global synthetic index. It is, however, possible to obtain entropy maps of the individual points to verify which areas have the greatest deviation. The method is robust for comparing point clouds at different densities. Objects obtained by additive manufacturing with different print qualities were used. The quality index evaluated for each object, as defined in the CorAl method, turns out to be gradually closer to 0 as the quality of the piece's construction increases.
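
    As a rough illustration of the quantity such an index builds on, the sketch below computes the differential entropy of a 3D Gaussian fitted to a set of points, h = 0.5 ln((2*pi*e)^3 det(Sigma)). The Gaussian model, the function names and the synthetic test clouds are assumptions made for this example; the abstract does not say how the CorAl index combines such entropies, so that step is not reproduced.

```python
import math
import random

def covariance_3d(points):
    """Sample covariance (3x3 nested lists) of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gaussian_entropy(points):
    """Differential entropy (in nats) of a 3D Gaussian fitted to the points."""
    det = det3(covariance_3d(points))
    return 0.5 * math.log((2 * math.pi * math.e) ** 3 * det)

def noisy_plane(sigma, n=500, seed=0):
    """Hypothetical test data: a unit square patch with Gaussian noise in z."""
    rng = random.Random(seed)
    return [(rng.uniform(0, 1), rng.uniform(0, 1), rng.gauss(0, sigma))
            for _ in range(n)]

# A surface sampled with little noise has lower entropy than a noisy one.
tight = gaussian_entropy(noisy_plane(0.001))
loose = gaussian_entropy(noisy_plane(0.05))
print(tight < loose)
```

    In this toy setting, a point set that hugs the underlying surface more closely yields a lower entropy, which is the intuition behind using entropy to score geometric deviation.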

  • 22.
    Barnada, Marc
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Conrad, Christian
    Goethe University of Frankfurt, Germany.
    Bradler, Henry
    Goethe University of Frankfurt, Germany.
    Ochs, Matthias
    Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Estimation of Automotive Pitch, Yaw, and Roll using Enhanced Phase Correlation on Multiple Far-field Windows. 2015. In: 2015 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2015, p. 481-486. Conference paper (Refereed)
    Abstract [en]

    The online estimation of yaw, pitch, and roll of a moving vehicle is an important ingredient for systems which estimate egomotion and the 3D structure of the environment from video captured in a moving vehicle. We present an approach to estimate these angular changes from monocular visual data, based on the fact that the motion of far-distant points does not depend on translation, but only on the current rotation of the camera. The presented approach does not require features (corners, edges, ...) to be extracted. It also allows the illumination changes from frame to frame to be estimated in parallel, and thus largely stabilizes the estimation of image correspondences and motion vectors, which are most often central entities needed for computing scene structure, distances, etc. The method is significantly less complex and much faster than a full egomotion computation from features, such as PTAM [6], but it can be used for providing motion priors and reducing search spaces for more complex methods which perform a complete analysis of egomotion and dynamic 3D structure of the scene in which a vehicle moves.
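
    For readers unfamiliar with the underlying tool, the following is a minimal sketch of plain phase correlation between two small image windows, not the enhanced variant of the paper. It uses a naive DFT in pure Python, so it is only practical for tiny windows; the function names and the synthetic windows are assumptions for this example. How the paper refines the correlation and converts window shifts into pitch, yaw and roll is not described in the abstract and is left out.

```python
import cmath
import random

def dft2(img, inverse=False):
    """Naive 2D DFT of a square list-of-lists image (tiny windows only)."""
    n = len(img)
    sign = 1 if inverse else -1
    w = [[cmath.exp(sign * 2j * cmath.pi * k * x / n) for x in range(n)]
         for k in range(n)]
    # Transform along rows first, then along columns.
    rows = [[sum(img[y][x] * w[k][x] for x in range(n)) for k in range(n)]
            for y in range(n)]
    out = [[sum(rows[y][k] * w[l][y] for y in range(n)) for k in range(n)]
           for l in range(n)]
    if inverse:
        out = [[v / (n * n) for v in row] for row in out]
    return out

def phase_correlation(img1, img2, eps=1e-12):
    """Return (dy, dx) such that img2 is img1 circularly shifted by (dy, dx)."""
    n = len(img1)
    f1, f2 = dft2(img1), dft2(img2)
    # Normalised cross-power spectrum: keep only the phase difference.
    cross = [[f2[v][u] * f1[v][u].conjugate() for u in range(n)]
             for v in range(n)]
    norm = [[c / (abs(c) + eps) for c in row] for row in cross]
    corr = dft2(norm, inverse=True)
    _, dy, dx = max((corr[y][x].real, y, x)
                    for y in range(n) for x in range(n))
    # Map peak indices to signed shifts.
    if dy > n // 2:
        dy -= n
    if dx > n // 2:
        dx -= n
    return dy, dx

# Hypothetical 8x8 window and a copy shifted 2 rows down and 1 column left.
rng = random.Random(1)
n = 8
window_a = [[rng.random() for _ in range(n)] for _ in range(n)]
window_b = [[window_a[(y - 2) % n][(x + 1) % n] for x in range(n)]
            for y in range(n)]
shift = phase_correlation(window_a, window_b)
print(shift)
```

    Because only the phase of the cross-power spectrum is kept, the correlation peak is insensitive to a global gain change between the two windows, which fits the abstract's remark about handling illumination changes, although the paper's own mechanism may differ.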

  • 23.
    Bengtsson, Morgan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Indoor 3D Mapping using Kinect. 2014. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In recent years several depth cameras have emerged on the consumer market, creating many interesting possibilities for both professional and recreational usage. One example of such a camera is the Microsoft Kinect sensor originally used with the Microsoft Xbox 360 game console. In this master's thesis a system is presented that utilizes this device in order to create a 3D reconstruction of an indoor environment that is as accurate as possible. The major novelty of the presented system is the data structure based on signed distance fields and voxel octrees used to represent the observed environment.

  • 24.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Classification of leakage detections acquired by airborne thermography of district heating networks. 2013. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In Sweden and many other northern countries, it is common for heat to be distributed to homes and industries through district heating networks. Such networks consist of pipes buried underground carrying hot water or steam with temperatures in the range of 90-150 °C. Due to bad insulation or cracks, heat or water leakages might appear.

    A system for large-scale monitoring of district heating networks through remote thermography has been developed and is in use at the company Termisk Systemteknik AB. Infrared images are captured from an aircraft and analysed, finding and indicating the areas for which the ground temperature is higher than normal. During the analysis, however, many warm areas other than true water or energy leakages are marked as detections. Objects or phenomena that can cause false alarms are those that, for some reason, are warmer than their surroundings, for example, chimneys, cars and heat leakages from buildings.

    During the last couple of years, the system has been used in a number of cities. Therefore, a fair number of examples of different types of detections exist. The purpose of the present master’s thesis is to evaluate the reduction of false alarms of the existing analysis that can be achieved with the use of a learning system, i.e. a system which can learn how to recognize different types of detections.

    A labelled data set for training and testing was acquired by contact with customers. Furthermore, a number of features describing the intensity difference within the detection, its shape and propagation as well as proximity information were found, implemented and evaluated. Finally, four different classifiers and other methods for classification were evaluated.

    The method that obtained the best results consists of two steps. In the initial step, all detections which lie on top of a building are removed from the data set of labelled detections. The second step consists of classification using a Random forest classifier. Using this two-step method, the number of false alarms is reduced by 43% while the percentage of water and energy detections correctly classified is 99%.

  • 25.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Detection and Tracking in Thermal Infrared Imagery. 2016. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

    This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been and currently are subject to extensive research. Indications of their popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems.

    A common belief is that detection and tracking in thermal infrared imagery is identical to detection and tracking in grayscale visual imagery. This thesis argues that the preceding allegation is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis.

    Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g. template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated. That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term, single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.

    List of papers
    1. A Thermal Object Tracking Benchmark
    2015 (English) Conference paper, Published paper (Refereed)
    Abstract [en]

    Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differs between the visual and thermal benchmarks, confirming the need for the new benchmark.

    Place, publisher, year, edition, pages
    IEEE, 2015
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-121001 (URN); 10.1109/AVSS.2015.7301772 (DOI); 000380619700052 (); 978-1-4673-7632-7 (ISBN)
    Conference
    12th IEEE International Conference on Advanced Video- and Signal-based Surveillance, Karlsruhe, Germany, August 25-28 2015
    Available from: 2015-09-02 Created: 2015-09-02 Last updated: 2019-10-23 Bibliographically approved
    2. The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results
    2015 (English) In: Proceedings of the IEEE International Conference on Computer Vision, Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 639-651. Conference paper, Published paper (Refereed)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply prelearned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2015
    Series
    IEEE International Conference on Computer Vision. Proceedings, ISSN 1550-5499
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-126917 (URN); 10.1109/ICCVW.2015.86 (DOI); 000380434700077 (); 978-146738390-5 (ISBN)
    Conference
    IEEE International Conference on Computer Vision Workshop (ICCVW), 7-13 Dec. 2015, Santiago, Chile
    Available from: 2016-04-07 Created: 2016-04-07 Last updated: 2023-04-03 Bibliographically approved
    3. Channel Coded Distribution Field Tracking for Thermal Infrared Imagery
    2016 (English) In: PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), IEEE, 2016, p. 1248-1256. Conference paper, Published paper (Refereed)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

    Place, publisher, year, edition, pages
    IEEE, 2016
    Series
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, ISSN 2160-7508
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-134402 (URN); 10.1109/CVPRW.2016.158 (DOI); 000391572100151 (); 978-1-5090-1438-5 (ISBN); 978-1-5090-1437-8 (ISBN)
    Conference
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2016 IEEE Conference on
    Funder
    Swedish Research Council, D0570301; EU, FP7, Seventh Framework Programme, 312784; EU, FP7, Seventh Framework Programme, 607567
    Available from: 2017-02-09 Created: 2017-02-09 Last updated: 2020-07-16
    4. Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera
    2015 (English) In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen; Kim S. Pedersen, Springer, 2015, p. 492-503. Conference paper, Published paper (Refereed)
    Abstract [en]

    We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel data set.

    Place, publisher, year, edition, pages
    Springer, 2015
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9127
    Keywords
    Thermal imaging; Computer vision; Train safety; Railway detection; Anomaly detection; Obstacle detection
    National Category
    Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-119507 (URN); 10.1007/978-3-319-19665-7_42 (DOI); 978-3-319-19664-0 (ISBN); 978-3-319-19665-7 (ISBN)
    Conference
    19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015
    Available from: 2015-06-22 Created: 2015-06-18 Last updated: 2019-10-23 Bibliographically approved
    5. Enhanced analysis of thermographic images for monitoring of district heat pipe networks
    2016 (English) In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 83, no 2, p. 215-223. Article in journal (Refereed) Published
    Abstract [en]

    We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps: (a) using a building segmentation scheme in order to remove detections on buildings, and (b) using a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicating areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.

    Place, publisher, year, edition, pages
    Elsevier, 2016
    Keywords
    Remote thermography; Classification; Pattern recognition; District heating; Thermal infrared
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-133004 (URN); 10.1016/j.patrec.2016.07.002 (DOI); 000386874800013 ()
    Note

    Funding Agencies|Swedish Research Council (Vetenskapsradet) through project Learning systems for remote thermography [621-2013-5703]; Swedish Research Council [2014-6227]

    Available from: 2016-12-08 Created: 2016-12-07 Last updated: 2019-10-23
  • 26.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Learning to Analyze what is Beyond the Visible Spectrum. 2019. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

    This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations of processing required due to the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All these are research areas that have been and currently are subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications.

    The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution also related to benchmarking is a novel, recursive method for semi-automatic annotation of multi-modal video sequences. Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames, and difficult parts in need of additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations.

    The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation to latent space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution is the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heat leakages for the purpose of false alarm reduction.

    Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function the method becomes robust to image pair misalignments. The method exploits the lower acuity for color differences than for luminance possessed by the human visual system, separating the loss into a luminance and a chrominance part.
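
    The idea of separating a loss into a luminance and a chrominance part can be illustrated with the sketch below. It is an assumption-laden toy, not the loss used in the thesis: the BT.601 luma/chroma conversion, the mean absolute error, the weights `w_lum` and `w_chroma`, and all function names are chosen here purely for illustration.

```python
def rgb_to_ycbcr(rgb):
    """BT.601 luma and unscaled chroma differences for an (r, g, b) pixel in [0, 1]."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.564 * (b - y), 0.713 * (r - y)

def split_loss(pred, target, w_lum=1.0, w_chroma=0.25):
    """Mean absolute error with separately weighted luminance and chrominance terms.

    pred and target are equal-length lists of (r, g, b) pixels. The weights are
    hypothetical; a smaller chrominance weight reflects the lower acuity of
    human vision for colour differences than for luminance differences.
    """
    lum = chroma = 0.0
    for p, t in zip(pred, target):
        py, pcb, pcr = rgb_to_ycbcr(p)
        ty, tcb, tcr = rgb_to_ycbcr(t)
        lum += abs(py - ty)
        chroma += abs(pcb - tcb) + abs(pcr - tcr)
    n = len(pred)
    return w_lum * lum / n + w_chroma * chroma / n

# A pure brightness error contributes only to the luminance term.
gray_error = split_loss([(0.5, 0.5, 0.5)], [(0.6, 0.6, 0.6)])
print(gray_error)
```

    Keeping the two terms separate makes it possible to penalise colour errors less strictly than brightness errors, which is one plausible way a loss of this kind can tolerate imperfect colour alignment between image pairs.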

    List of papers
    1. A Thermal Object Tracking Benchmark
    2015 (English) Conference paper, Published paper (Refereed)
    Abstract [en]

    Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differ between the visual and thermal benchmarks, confirming the need for the new benchmark.

    Place, publisher, year, edition, pages
    IEEE, 2015
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-121001 (URN); 10.1109/AVSS.2015.7301772 (DOI); 000380619700052 (); 978-1-4673-7632-7 (ISBN)
    Conference
    12th IEEE International Conference on Advanced Video- and Signal-based Surveillance, Karlsruhe, Germany, August 25-28 2015
    Available from: 2015-09-02 Created: 2015-09-02 Last updated: 2019-10-23 Bibliographically approved
    2. Semi-automatic Annotation of Objects in Visual-Thermal Video
    2019 (English) In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Institute of Electrical and Electronics Engineers (IEEE), 2019. Conference paper, Published paper (Refereed)
    Abstract [en]

    Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities human annotation is hard to justify. In such cases, semi-automatic annotation provides an acceptable option.

    In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizes a state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.

    The method is evaluated on a subset of the RGBT-234 visual-thermal dataset, reducing the workload for a human annotator by approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2019
    Series
    IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), ISSN 2473-9936, E-ISSN 2473-9944
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-161076 (URN); 10.1109/ICCVW.2019.00277 (DOI); 000554591602039 (); 978-1-7281-5023-9 (ISBN); 978-1-7281-5024-6 (ISBN)
    Conference
    IEEE International Conference on Computer Vision Workshop (ICCVW)
    Funder
    Swedish Research Council, 2013-5703; Swedish Foundation for Strategic Research Wallenberg AI, Autonomous Systems and Software Program (WASP); Vinnova, VS1810-Q
    Note

    Funding agencies: Swedish Research Council [2013-5703]; project ELLIIT (the Strategic Area for ICT research - Swedish Government); Wallenberg AI, Autonomous Systems and Software Program (WASP); Visual Sweden project ndimensional Modelling [VS1810-Q]

    Available from: 2019-10-21 Created: 2019-10-21 Last updated: 2021-12-03
    3. Channel Coded Distribution Field Tracking for Thermal Infrared Imagery
    2016 (English) In: PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), IEEE, 2016, p. 1248-1256. Conference paper, Published paper (Refereed)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

    Place, publisher, year, edition, pages
    IEEE, 2016
    Series
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, ISSN 2160-7508
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-134402 (URN); 10.1109/CVPRW.2016.158 (DOI); 000391572100151 (); 978-1-5090-1438-5 (ISBN); 978-1-5090-1437-8 (ISBN)
    Conference
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2016 IEEE Conference on
    Funder
    Swedish Research Council, D0570301; EU, FP7, Seventh Framework Programme, 312784; EU, FP7, Seventh Framework Programme, 607567
    Available from: 2017-02-09 Created: 2017-02-09 Last updated: 2020-07-16
    4. Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera
    2015 (English) In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen; Kim S. Pedersen, Springer, 2015, p. 492-503. Conference paper, Published paper (Refereed)
    Abstract [en]

    We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at a first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel data set.

    Place, publisher, year, edition, pages
    Springer, 2015
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9127
    Keywords
    Thermal imaging; Computer vision; Train safety; Railway detection; Anomaly detection; Obstacle detection
    National Category
    Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-119507 (URN); 10.1007/978-3-319-19665-7_42 (DOI); 978-3-319-19664-0 (ISBN); 978-3-319-19665-7 (ISBN)
    Conference
    19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015
    Available from: 2015-06-22 Created: 2015-06-18 Last updated: 2019-10-23 Bibliographically approved
    5. Enhanced analysis of thermographic images for monitoring of district heat pipe networks
    2016 (English) In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 83, no 2, p. 215-223. Article in journal (Refereed) Published
    Abstract [en]

    We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps; by (a) using a building segmentation scheme in order to remove detections on buildings, and (b) to use a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.

    Place, publisher, year, edition, pages
    Elsevier, 2016
    Keywords
    Remote thermography; Classification; Pattern recognition; District heating; Thermal infrared
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-133004 (URN); 10.1016/j.patrec.2016.07.002 (DOI); 000386874800013 ()
    Note

    Funding agencies: Swedish Research Council (Vetenskapsradet) through project Learning systems for remote thermography [621-2013-5703]; Swedish Research Council [2014-6227]

    Available from: 2016-12-08 Created: 2016-12-07 Last updated: 2019-10-23
    6. Generating Visible Spectrum Images from Thermal Infrared
    2018 (English) In: Proceedings 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops CVPRW 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1224-1233. Conference paper, Published paper (Refereed)
    Abstract [en]

    Transformation of thermal infrared (TIR) images into visual, i.e. perceptually realistic color (RGB) images, is a challenging problem. TIR cameras have the ability to see in scenarios where vision is severely impaired, for example in total darkness or fog, and they are commonly used, e.g., for surveillance and automotive applications. However, interpretation of TIR images is difficult, especially for untrained operators. Enhancing the TIR image display by transforming it into a plausible, visual, perceptually realistic RGB image presumably facilitates interpretation. Existing grayscale to RGB, so called, colorization methods cannot be applied to TIR images directly since those methods only estimate the chrominance and not the luminance. In the absence of conventional colorization methods, we propose two fully automatic TIR to visual color image transformation methods, a two-step and an integrated approach, based on Convolutional Neural Networks. The methods require neither pre- nor postprocessing, do not require any user input, and are robust to image pair misalignments. We show that the methods do indeed produce perceptually realistic results on publicly available data, which is assessed both qualitatively and quantitatively.

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2018
    Series
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops, E-ISSN 2160-7516
    National Category
    Computer Vision and Robotics (Autonomous Systems) Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-149429 (URN); 10.1109/CVPRW.2018.00159 (DOI); 000457636800152 (); 9781538661000 (ISBN); 9781538661017 (ISBN)
    Conference
    The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 8-22 June 2018, Salt Lake City, UT, USA
    Funder
    Swedish Research Council, 2013-5703; Swedish Research Council, 2014-6227
    Note

    Print on Demand(PoD) ISSN: 2160-7508.

    Available from: 2018-06-29 Created: 2018-06-29 Last updated: 2020-02-03 Bibliographically approved
    Download full text (pdf)
    Learning to Analyze what is Beyond the Visible Spectrum
  • 27.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Classification and temporal analysis of district heating leakages in thermal images. 2014. In: Proceedings of The 14th International Symposium on District Heating and Cooling, 2014. Conference paper (Other academic)
    Abstract [en]

    District heating pipes are known to degenerate with time and in some cities the pipes have been used for several decades. Due to bad insulation or cracks, energy or media leakages might appear. This paper presents a complete system for large-scale monitoring of district heating networks, including methods for detection, classification and temporal characterization of (potential) leakages. The system analyses thermal infrared images acquired by an aircraft-mounted camera, detecting the areas for which the pixel intensity is higher than normal. Unfortunately, the system also finds many false detections, i.e., warm areas that are not caused by media or energy leakages. Thus, in order to reduce the number of false detections we describe a machine learning method to classify the detections. The results, based on data from three district heating networks show that we can remove more than half of the false detections. Moreover, we also propose a method to characterize leakages over time, that is, repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss.

  • 28.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Classification of leakage detections acquired by airborne thermography of district heating networks. 2014. In: 2014 8th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS), IEEE, 2014, p. 1-4. Conference paper (Refereed)
    Abstract [en]

    We address the problem of reducing the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps. First, we use a building segmentation scheme in order to remove detections on buildings. Second, we extract features from the detections and use a Random forest classifier on the remaining detections. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system.

  • 29.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB Linköping, Sweden.
    Classifying district heating network leakages in aerial thermal imagery. 2014. Conference paper (Other academic)
    Abstract [en]

    In this paper we address the problem of automatically detecting leakages in underground pipes of district heating networks from images captured by an airborne thermal camera. The basic idea is to classify each relevant image region as a leakage if its temperature exceeds a threshold. This simple approach yields a significant number of false positives. We propose to address this issue by machine learning techniques and provide extensive experimental analysis on real-world data. The results show that this postprocessing step significantly improves the usefulness of the system.

  • 30.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A thermal infrared dataset for evaluation of short-term tracking methods. 2015. Conference paper (Other academic)
    Abstract [en]

    During recent years, thermal cameras have decreased in both size and cost while improving image quality. The area of use for such cameras has expanded with many exciting applications, many of which require tracking of objects. While being subject to extensive research in the visual domain, tracking in thermal imagery has historically been of interest mainly for military purposes. The available thermal infrared datasets for evaluating methods addressing these problems are few, and the ones that exist are not challenging enough for today’s tracking algorithms. Therefore, we hereby propose a thermal infrared dataset for evaluation of short-term tracking methods. The dataset consists of 20 sequences which have been collected from multiple sources and the data format used is in accordance with the Visual Object Tracking (VOT) Challenge.

  • 31.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A Thermal Object Tracking Benchmark. 2015. Conference paper (Refereed)
    Abstract [en]

    Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differ between the visual and thermal benchmarks, confirming the need for the new benchmark.

  • 32.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Channel Coded Distribution Field Tracking for Thermal Infrared Imagery. 2016. In: PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), IEEE, 2016, p. 1248-1256. Conference paper (Refereed)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

  • 33.
    Berg, Amanda
    et al.
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Syst Tekn AB, Diskettgatan 11 B, SE-58335 Linkoping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Syst Tekn AB, Diskettgatan 11 B, SE-58335 Linkoping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Enhanced analysis of thermographic images for monitoring of district heat pipe networks. 2016. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 83, no 2, p. 215-223. Article in journal (Refereed)
    Abstract [en]

    We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps; by (a) using a building segmentation scheme in order to remove detections on buildings, and (b) to use a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.

  • 34.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Generating Visible Spectrum Images from Thermal Infrared. 2018. In: Proceedings 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops CVPRW 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1224-1233. Conference paper (Refereed)
    Abstract [en]

    Transformation of thermal infrared (TIR) images into visual, i.e. perceptually realistic color (RGB) images, is a challenging problem. TIR cameras have the ability to see in scenarios where vision is severely impaired, for example in total darkness or fog, and they are commonly used, e.g., for surveillance and automotive applications. However, interpretation of TIR images is difficult, especially for untrained operators. Enhancing the TIR image display by transforming it into a plausible, visual, perceptually realistic RGB image presumably facilitates interpretation. Existing grayscale to RGB, so called, colorization methods cannot be applied to TIR images directly since those methods only estimate the chrominance and not the luminance. In the absence of conventional colorization methods, we propose two fully automatic TIR to visual color image transformation methods, a two-step and an integrated approach, based on Convolutional Neural Networks. The methods require neither pre- nor postprocessing, do not require any user input, and are robust to image pair misalignments. We show that the methods do indeed produce perceptually realistic results on publicly available data, which is assessed both qualitatively and quantitatively.

  • 35.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Object Tracking in Thermal Infrared Imagery based on Channel Coded Distribution Fields. 2017. Conference paper (Other academic)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

  • 36.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Unsupervised Adversarial Learning of Anomaly Detection in the Wild. 2020. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI) / [ed] Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang, Amsterdam: IOS Press, 2020, Vol. 325, p. 1002-1008. Conference paper (Refereed)
    Abstract [en]

    Unsupervised learning of anomaly detection in high-dimensional data, such as images, is a challenging problem recently subject to intense research. Through careful modelling of the data distribution of normal samples, it is possible to detect deviant samples, so-called anomalies. Generative Adversarial Networks (GANs) can model the highly complex, high-dimensional data distribution of normal image samples, and have been shown to be a suitable approach to the problem. Previously published GAN-based anomaly detection methods often assume that anomaly-free data is available for training. However, this assumption is not valid in most real-life scenarios, a.k.a. in the wild. In this work, we evaluate the effects of anomaly contaminations in the training data on state-of-the-art GAN-based anomaly detection methods. As expected, detection performance deteriorates. To address this performance drop, we propose to add an additional encoder network already at training time and show that joint generator-encoder training stratifies the latent space, mitigating the problem with contaminated data. We show experimentally that the norm of a query image in this stratified latent space becomes a highly significant cue to discriminate anomalies from normal data. The proposed method achieves state-of-the-art performance on CIFAR-10 as well as on a large, previously untested dataset with cell images.

  • 37.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Visual Spectrum Image Generation from Thermal Infrared. 2019. Conference paper (Other academic)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

  • 38.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge. 2016. Conference paper (Other academic)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking (VOT-TIR2015) Challenge was organized in conjunction with ICCV2015. It was the first benchmark on short-term, single-target tracking in thermal infrared (TIR) sequences. The challenge aimed at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. It was based on the VOT2013 Challenge, but introduced the following novelties: (i) the utilization of the LTIR (Linköping TIR) dataset, (ii) adaption of the VOT2013 attributes to thermal data, (iii) a similar evaluation to that of VOT2015. This paper provides an overview of the VOT-TIR2015 Challenge as well as the results of the 24 participating trackers.

  • 39.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Johnander, Joakim
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenuity AB, Göteborg, Sweden.
    Durand de Gevigney, Flavie
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Grenoble INP, France.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Semi-automatic Annotation of Objects in Visual-Thermal Video. 2019. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Institute of Electrical and Electronics Engineers (IEEE), 2019. Conference paper (Refereed)
    Abstract [en]

    Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities human annotation is hard to justify. In such cases, semi-automatic annotation provides an acceptable option.

    In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizes a state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.

    The method is evaluated on a subset of the RGBT-234 visual-thermal dataset, reducing the workload for a human annotator by approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.

  • 40.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera. 2015. In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen; Kim S. Pedersen, Springer, 2015, p. 492-503. Conference paper (Refereed)
    Abstract [en]

    We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel data set.

  • 41.
    Berglin, Lukas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Design, Evaluation and Implementation of a Pipeline for Semi-Automatic Lung Nodule Segmentation. 2016. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Lung cancer is the most common type of cancer in the world and always manifests as lung nodules. Nodules are small tumors that consist of lung tissue. They are usually spherical in shape and their cores can be either solid or subsolid. Nodules are common in lungs, but not all of them are malignant. To determine if a nodule is malignant or benign, attributes like nodule size and volume growth are commonly used. The procedure to obtain these attributes is time consuming, and therefore calls for tools to simplify the process.

    The purpose of this thesis work was to investigate the feasibility of a semi-automatic lung nodule segmentation pipeline including volume estimation. This was done by implementing, tuning and evaluating image processing algorithms with different characteristics to create pipeline candidates. These candidates were compared using a similarity index between their segmentation results and ground truth markings to determine the most promising one.

    The best performing pipeline consisted of a fixed region of interest together with a level set segmentation algorithm. Its segmentation accuracy was not consistent for all nodules evaluated, but the pipeline showed great potential when dynamically adapting its parameters for each nodule. The use of dynamic parameters was only briefly explored, and further research would be necessary to determine its feasibility.

  • 42.
    Bešenić, Krešimir
    et al.
    Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Pandžić, Igor
    Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia.
    Unsupervised Facial Biometric Data Filtering for Age and Gender Estimation. 2019. In: Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2019), SciTePress, 2019, Vol. 5, p. 209-217. Conference paper (Refereed)
    Abstract [en]

    Availability of large training datasets was essential for the recent advancement and success of deep learning methods. Due to the difficulties related to biometric data collection, datasets with age and gender annotations are scarce and usually limited in terms of size and sample diversity. Web-scraping approaches for automatic data collection can produce large amounts of weakly labeled noisy data. The unsupervised facial biometric data filtering method presented in this paper greatly reduces label noise levels in web-scraped facial biometric data. Experiments on two large state-of-the-art web-scraped facial datasets demonstrate the effectiveness of the proposed method, with respect to training and validation scores, training convergence, and generalization capabilities of trained age and gender estimators.

  • 43.
    Bešenić, Krešimir
    et al.
    Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Pandžić, Igor S.
    Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia.
    Picking out the bad apples: unsupervised biometric data filtering for refined age estimation. 2023. In: The Visual Computer, ISSN 0178-2789, E-ISSN 1432-2315, Vol. 39, p. 219-237. Article in journal (Refereed)
    Abstract [en]

    Introduction of large training datasets was essential for the recent advancement and success of deep learning methods. Due to the difficulties related to biometric data collection, facial image datasets with biometric trait labels are scarce and usually limited in terms of size and sample diversity. Web-scraping approaches for automatic data collection can produce large amounts of weakly labeled and noisy data. This work is focused on picking out the bad apples from web-scraped facial datasets by automatically removing erroneous samples that impair their usability. The unsupervised facial biometric data filtering method presented in this work greatly reduces label noise levels in web-scraped facial biometric data. Experiments on two large state-of-the-art web-scraped datasets demonstrate the effectiveness of the proposed method with respect to real and apparent age estimation based on five different age estimation methods. Furthermore, we apply the proposed method, together with a newly devised strategy for merging multiple datasets, to data collected from three major web-based data sources (i.e., IMDb, Wikipedia, Google) and derive the new Biometrically Filtered Famous Figure Dataset or B3FD. The proposed dataset, which is made publicly available, enables considerable performance gains for all tested age estimation methods and age estimation tasks. This work highlights the importance of training data quality compared to data quantity and selection of the estimation method.

  • 44.
    Bhat, Goutam
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Accurate Tracking by Overlap Maximization. 2019. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Visual object tracking is one of the fundamental problems in computer vision, with a wide range of practical applications in e.g. robotics, surveillance etc. Given a video sequence and the target bounding box in the first frame, a tracker is required to find the target in all subsequent frames. It is a challenging problem due to the limited training data available. An object tracker is generally evaluated using two criteria, namely robustness and accuracy. Robustness refers to the ability of a tracker to track for long durations, without losing the target. Accuracy, on the other hand, denotes how accurately a tracker can estimate the target bounding box.

    Recent years have seen significant improvement in tracking robustness. However, the problem of accurate tracking has seen less attention. Most current state-of-the-art trackers resort to a naive multi-scale search strategy which has fundamental limitations. Thus, in this thesis, we aim to develop a general target estimation component which can be used to determine an accurate bounding box for tracking. We will investigate how bounding box estimators used in object detection can be modified to be used for object tracking. The key difference between detection and tracking is that in object detection, the classes to which the objects belong are known. However, in tracking, no prior information is available about the tracked object, other than a single image provided in the first frame. We will thus investigate different architectures to utilize the first frame information to provide target specific bounding box predictions. We will also investigate how the bounding box predictors can be integrated into a state-of-the-art tracking method to obtain robust as well as accurate tracking.

  • 45.
    Bhat, Goutam
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Incept Inst Artificial Intelligence, U Arab Emirates.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Combining Local and Global Models for Robust Re-detection. 2018. In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 25-30. Conference paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples, used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.

  • 46.
    Bhat, Goutam
    et al.
    Swiss Fed Inst Technol, Switzerland.
    Danelljan, Martin
    Swiss Fed Inst Technol, Switzerland.
    Timofte, Radu
    Swiss Fed Inst Technol, Switzerland; Julius Maximilian Univ Wurzburg, Germany.
    Cao, Yizhen
    Commun Univ China, Peoples R China.
    Cao, Yuntian
    Commun Univ China, Peoples R China.
    Chen, Meiya
    Xiaomi, Peoples R China.
    Chen, Xihao
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Cheng, Shen
    Megvii Technol, Peoples R China.
    Dudhane, Akshay
    Mohamed Bin Zayed Univ AI MBZUAI, U Arab Emirates.
    Fan, Haoqiang
    Megvii Technol, Peoples R China.
    Gang, Ruipeng
    UHDTV Res & Applicat Lab, Peoples R China.
    Gao, Jian
    SRC B, Peoples R China.
    Gu, Yan
    UESTC, Peoples R China.
    Huang, Jie
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Huang, Liufeng
    South China Univ Technol, Peoples R China.
    Jo, Youngsu
    Sogang Univ, South Korea.
    Kang, Sukju
    Sogang Univ, South Korea.
    Khan, Salman
    MBZUAI, U Arab Emirates; Australian Natl Univ ANU, Australia.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZUAI, U Arab Emirates.
    Kondo, Yuki
    Toyota Technol Inst TTI, Japan.
    Li, Chenghua
    Chinese Acad Sci, Peoples R China.
    Li, Fangya
    Commun Univ China, Peoples R China.
    Li, Jinjing
    Commun Univ China, Peoples R China.
    Li, Youwei
    Megvii Technol, Peoples R China.
    Li, Zechao
    Nanjing Univ Sci & Technol, Peoples R China.
    Liu, Chenming
    UHDTV Research and Application Laboratory.
    Liu, Shuaicheng
    Megvii Technol, Peoples R China; Univ Elect Sci & Technol China UESTC, Peoples R China.
    Liu, Zikun
    SRC B, Peoples R China.
    Liu, Zhuoming
    South China Univ Technol, Peoples R China.
    Luo, Ziwei
    Megvii Technol, Peoples R China.
    Luo, Zhengxiong
    CASIA, Peoples R China.
    Mehta, Nancy
    Indian Inst Technol Ropar IIT Ropar, India.
    Murala, Subrahmanyam
    Indian Inst Technol Ropar IIT Ropar, India.
    Nam, Yoonchan
    Sogang Univ, South Korea.
    Nakatani, Chihiro
    Toyota Technol Inst TTI, Japan.
    Ostyakov, Pavel
    Huawei, Peoples R China.
    Pan, Jinshan
    Nanjing Univ Sci & Technol, Peoples R China.
    Song, Ge
    USTC, Peoples R China.
    Sun, Jian
    Megvii Technol, Peoples R China.
    Sun, Long
    Nanjing Univ Sci & Technol, Peoples R China.
    Tang, Jinhui
    Nanjing Univ Sci & Technol, Peoples R China.
    Ukita, Norimichi
    Toyota Technol Inst TTI, Japan.
    Wen, Zhihong
    Megvii Technol, Peoples R China.
    Wu, Qi
    Megvii Technol, Peoples R China.
    Wu, Xiaohe
    Harbin Inst Technol, Peoples R China.
    Xiao, Zeyu
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Xiong, Zhiwei
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Xu, Rongjian
    Harbin Inst Technol, Peoples R China.
    Xu, Ruikang
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Yan, Youliang
    Huawei, Peoples R China.
    Yang, Jialin
    WHU, Peoples R China.
    Yang, Wentao
    South China Univ Technol, Peoples R China.
    Yang, Zhongbao
    Nanjing Univ Sci & Technol, Peoples R China.
    Yasue, Fuma
    Toyota Technol Inst TTI, Japan.
    Yao, Mingde
    USTC, Peoples R China; Univ Sci & Technol China, Peoples R China.
    Yu, Lei
    Megvii Technol, Peoples R China.
    Zhang, Cong
    Xiaomi, Peoples R China.
    Zamir, Syed Waqas
    Incept Inst Artificial Intelligence IIAI, U Arab Emirates.
    Zhang, Jianxing
    SRC B, Peoples R China.
    Zhang, Shuohao
    Harbin Inst Technol, Peoples R China.
    Zhang, Zhilu
    Harbin Inst Technol, Peoples R China.
    Zheng, Qian
    Commun Univ China, Peoples R China.
    Zhou, Gaofeng
    Xiaomi, Peoples R China.
    Zhussip, Magauiya
    Huawei, Peoples R China.
    Zou, Xueyi
    Huawei, Peoples R China.
    Zuo, Wangmeng
    Harbin Inst Technol, Peoples R China.
    NTIRE 2022 Burst Super-Resolution Challenge. 2022. In: 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2022), IEEE, 2022, p. 1040-1060. Conference paper (Refereed)
    Abstract [en]

    Burst super-resolution has received increased attention in recent years due to its applications in mobile photography. By merging information from multiple shifted images of a scene, burst super-resolution aims to recover details which otherwise cannot be obtained using a simple input image. This paper reviews the NTIRE 2022 challenge on burst super-resolution. In the challenge, the participants were tasked with generating a clean RGB image with 4x higher resolution, given a RAW noisy burst as input. That is, the methods need to perform joint denoising, demosaicking, and super-resolution. The challenge consisted of 2 tracks. Track 1 employed synthetic data, where pixel-accurate high-resolution ground truths are available. Track 2 on the other hand used real-world bursts captured from a handheld camera, along with approximately aligned reference images captured using a DSLR. 14 teams participated in the final testing phase. The top performing methods establish a new state-of-the-art on the burst super-resolution task.

  • 47.
    Bhat, Goutam
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Johnander, Joakim
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Unveiling the power of deep tracking. 2018. In: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part II / [ed] Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu and Yair Weiss, Cham: Springer Publishing Company, 2018, p. 493-509. Conference paper (Refereed)
    Abstract [en]

    In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of >17% in EAO.

  • 48.
    Bhunia, Ankan Kumar
    et al.
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Khan, Salman
    Mohamed bin Zayed Univ AI, U Arab Emirates; Australian Natl Univ, Australia.
    Cholakkal, Hisham
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Anwer, Rao Muhammad
    Mohamed bin Zayed Univ AI, U Arab Emirates; Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed Univ AI, U Arab Emirates.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    DoodleFormer: Creative Sketch Drawing with Transformers. 2022. In: COMPUTER VISION - ECCV 2022, PT XVII, SPRINGER INTERNATIONAL PUBLISHING AG, 2022, Vol. 13677, p. 338-355. Conference paper (Refereed)
    Abstract [en]

    Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of coarse sketch composition followed by the incorporation of fine-details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in Frechet inception distance (FID) over state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for related applications of text to creative sketch generation, sketch completion and house layout generation. Code is available at: https://github.com/ankanbhunia/doodleformer.

  • 49.
    Bhunia, Ankan Kumar
    et al.
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Khan, Salman
    Mohamed bin Zayed Univ AI, U Arab Emirates; Australian Natl Univ, Australia.
    Cholakkal, Hisham
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Anwer, Rao Muhammad
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed Univ AI, U Arab Emirates.
    Shah, Mubarak
    Univ Cent Florida, FL 32816 USA.
    Handwriting Transformers. 2021. In: 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), IEEE, 2021, p. 1066-1074. Conference paper (Refereed)
    Abstract [en]

    We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style features of each query character. To the best of our knowledge, we are the first to introduce a transformer-based network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and outperforms the state-of-the-art demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images. Code is available at: https://github.com/ankanbhunia/HandwritingTransformers

  • 50.
    Bhunia, Ankan Kumar
    et al.
    Mohamed bin Zayed University of AI, UAE.
    Khan, Salman
    Mohamed bin Zayed University of AI, UAE; Australian National University, Australia.
    Cholakkal, Hisham
    Mohamed bin Zayed University of AI, UAE.
    Anwer, Rao Muhammad
    Mohamed bin Zayed University of AI, UAE.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed University of AI, UAE.
    Shah, Mubarak
    University of Central Florida, USA.
    Handwriting Transformers. 2021. Other (Other academic)
    Abstract [en]