liu.se: Search for publications in DiVA
101 - 150 of 367
  • 101.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Heyden, Anders, Lund University, Lund, Sweden.
Krüger, Norbert, University of Southern Denmark, Odense, Denmark.
Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. 2017. Conference proceedings (editor) (Refereed)
    Abstract [en]

    The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017.

    The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.

  • 102.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Heyden, Anders, Lund University, Lund, Sweden.
Krüger, Norbert, University of Southern Denmark, Odense, Denmark.
Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II. 2017. Conference proceedings (editor) (Refereed)
    Abstract [en]

    The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017. The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.

  • 103.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Kristan, Matej
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, England.
    Pflugfelder, Roman
    Austrian Institute Technology, Austria.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Berg, Amanda
Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Systemteknik AB, Linköping, Sweden.
    Eldesokey, Abdelrahman
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute Technology, Austria.
    Petrosino, Alfredo
    Parthenope University of Naples, Italy.
    Garcia-Martin, Alvaro
    University of Autonoma Madrid, Spain.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Varfolomieiev, Anton
    Kyiv Polytech Institute, Ukraine.
    Erdem, Aykut
    Hacettepe University, Turkey.
    Han, Bohyung
    POSTECH, South Korea.
    Chang, Chang-Ming
    University of Albany, GA USA.
    Du, Dawei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Erdem, Erkut
    Hacettepe University, Turkey.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Porikli, Fatih
    ARC Centre Excellence Robot Vis, Australia; CSIRO, Australia.
    Zhao, Fei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bunyak, Filiz
    University of Missouri, MO 65211 USA.
    Battistone, Francesco
    Parthenope University of Naples, Italy.
    Zhu, Gao
    University of Missouri, Columbia, USA.
    Seetharaman, Guna
    US Navy, DC 20375 USA.
    Li, Hongdong
    ARC Centre Excellence Robot Vis, Australia.
    Qi, Honggang
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bischof, Horst
    Graz University of Technology, Austria.
    Possegger, Horst
    Graz University of Technology, Austria.
    Nam, Hyeonseob
    NAVER Corp, South Korea.
    Valmadre, Jack
    University of Oxford, England.
    Zhu, Jianke
    Zhejiang University, Peoples R China.
    Feng, Jiayi
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    University of Autonoma Madrid, Spain.
    Palaniappan, Kannappan
    University of Missouri, MO 65211 USA.
    Lebeda, Karel
    University of Surrey, England.
    Gao, Ke
    University of Missouri, MO 65211 USA.
    Mikolajczyk, Krystian
    Imperial Coll London, England.
    Wen, Longyin
    University of Albany, GA USA.
    Bertinetto, Luca
    University of Oxford, England.
    Poostchi, Mahdieh
    University of Missouri, MO 65211 USA.
    Maresca, Mario
    Parthenope University of Naples, Italy.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Tang, Ming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Baek, Mooyeol
    POSTECH, South Korea.
    Fan, Nana
    Harbin Institute Technology, Peoples R China.
    Al-Shakarji, Noor
    University of Missouri, MO 65211 USA.
    Miksik, Ondrej
    University of Oxford, England.
    Akin, Osman
    Hacettepe University, Turkey.
    Torr, Philip H. S.
    University of Oxford, England.
    Huang, Qingming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Martin-Nieto, Rafael
    University of Autonoma Madrid, Spain.
    Pelapur, Rengarajan
    University of Missouri, MO 65211 USA.
    Bowden, Richard
    University of Surrey, England.
    Laganiere, Robert
    University of Ottawa, Canada.
    Krah, Sebastian B.
    Fraunhofer IOSB, Germany.
    Li, Shengkun
    University of Albany, GA USA.
    Yao, Shizeng
    University of Missouri, MO 65211 USA.
    Hadfield, Simon
    University of Surrey, England.
    Lyu, Siwei
    University of Albany, GA USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Golodetz, Stuart
    University of Oxford, England.
    Hu, Tao
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Mauthner, Thomas
    Graz University of Technology, Austria.
    Santopietro, Vincenzo
    Parthenope University of Naples, Italy.
    Li, Wenbo
    Lehigh University, PA 18015 USA.
    Huebner, Wolfgang
    Fraunhofer IOSB, Germany.
    Li, Xin
    Harbin Institute Technology, Peoples R China.
    Li, Yang
    Zhejiang University, Peoples R China.
    Xu, Zhan
    Zhejiang University, Peoples R China.
    He, Zhenyu
    Harbin Institute Technology, Peoples R China.
The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results. 2016. In: Computer Vision – ECCV 2016 Workshops, ECCV 2016 / [ed] Hua G., Jégou H., Springer International Publishing AG, 2016, p. 824-849. Conference paper (Refereed)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge; the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, the VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation introduced in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensates for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.

  • 104.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Larsson, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Wiklund, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Wadströmer, Niclas
    FOI.
    Ahlberg, Jörgen
    Termisk Systemteknik AB.
Online Learning of Correspondences between Images. 2013. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 35, no 1, p. 118-129. Article in journal (Refereed)
    Abstract [en]

    We propose a novel method for iterative learning of point correspondences between image sequences. Points moving on surfaces in 3D space are projected into two images. Given a point in either view, the considered problem is to determine the corresponding location in the other view. The geometry and distortions of the projections are unknown as is the shape of the surface. Given several pairs of point-sets but no access to the 3D scene, correspondence mappings can be found by excessive global optimization or by the fundamental matrix if a perspective projective model is assumed. However, an iterative solution on sequences of point-set pairs with general imaging geometry is preferable. We derive such a method that optimizes the mapping based on Neyman's chi-square divergence between the densities representing the uncertainties of the estimated and the actual locations. The densities are represented as channel vectors computed with a basis function approach. The mapping between these vectors is updated with each new pair of images such that fast convergence and high accuracy are achieved. The resulting algorithm runs in real-time and is superior to state-of-the-art methods in terms of convergence and accuracy in a number of experiments.
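
    The channel (basis-function) vectors mentioned in the abstract can be illustrated with a minimal sketch. The cos² kernel and the encoding below follow the common convention in the channel-representation literature; the parameter names and value range are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    def channel_encode(x, n_channels=8, lo=0.0, hi=1.0):
        """Encode scalar values into cos^2 channel vectors (a soft histogram).

        Each channel is a cos^2 bump centered on a regular grid; a value
        activates only the few channels whose centers lie within 1.5 channel
        spacings, so the vector represents a small density around x.
        """
        centers = np.linspace(lo, hi, n_channels)
        spacing = centers[1] - centers[0]
        d = np.abs(np.asarray(x, dtype=float)[..., None] - centers) / spacing
        return np.where(d < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)

    # Example: encode a point coordinate before learning the mapping
    # between the channel vectors of the two views.
    print(channel_encode(0.37))
    ```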

  • 105.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Lenz, Reiner
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
Unbiased decoding of biologically motivated visual feature descriptors. 2015. In: Frontiers in Robotics and AI, ISSN 2296-9144, Vol. 2, no 20. Article in journal (Refereed)
    Abstract [en]

    Visual feature descriptors are essential elements in most computer and robot vision systems. They typically lead to an abstraction of the input data, images, or video, for further processing, such as clustering and machine learning. In clustering applications, the cluster center represents the prototypical descriptor of the cluster and estimates the corresponding signal value, such as color value or dominating flow orientation, by decoding the prototypical descriptor. Machine learning applications determine the relevance of respective descriptors and a visualization of the corresponding decoded information is very useful for the analysis of the learning algorithm. Thus, decoding of feature descriptors is a relevant problem, frequently addressed in recent work. Also, the human brain represents sensorimotor information at a suitable abstraction level through varying activation of neuron populations. In previous work, computational models have been derived that agree with findings of neurophysiological experiments on the representation of visual information by decoding the underlying signals. However, the represented variables have a bias toward centers or boundaries of the tuning curves. Despite the fact that feature descriptors in computer vision are motivated from neuroscience, the respective decoding methods have been derived largely independently. From first principles, we derive unbiased decoding schemes for biologically motivated feature descriptors with a minimum amount of redundancy and suitable invariance properties. These descriptors establish a non-parametric density estimation of the underlying stochastic process with a particular algebraic structure. Based on the resulting algebraic constraints, we show formally how the decoding problem is formulated as an unbiased maximum likelihood estimator and we derive a recurrent inverse diffusion scheme to infer the dominating mode of the distribution. These methods are evaluated in experiments, where stationary points and bias from noisy image data are compared to existing methods.
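
    To see the bias the paper addresses, consider the simplest possible decoder: a normalized weighted mean of channel centers. This naive sketch is illustrative only (not the paper's estimator); it is systematically biased toward the centers and boundaries of the tuning curves, which is exactly what the proposed unbiased maximum-likelihood decoding removes.

    ```python
    import numpy as np

    def naive_decode(enc, lo=0.0, hi=1.0):
        """Decode a channel vector by a normalized weighted mean of the
        channel centers. Simple, but biased for off-grid values near the
        channel centers and the ends of the encoded range."""
        centers = np.linspace(lo, hi, enc.shape[-1])
        return (enc * centers).sum(axis=-1) / enc.sum(axis=-1)
    ```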

  • 106.
    Flood, Katarina
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Danielsson, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Magnusson Seger, Maria
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
On 3D scanning, reconstruction, enhancement, and segmentation of logs. 2003. In: Image Analysis: 13th Scandinavian Conference, SCIA 2003, Halmstad, Sweden, June 29 – July 2, 2003, Proceedings / [ed] Josef Bigun and Tomas Gustavsson, Springer Berlin/Heidelberg, 2003, Vol. 2749, p. 733-740. Chapter in book (Refereed)
    Abstract [en]

    This paper presents novel results from an ongoing feasibility study of fully 3D X-ray scanning of Pinus sylvestris (Scots pine) logs. Logs are assumed to be translated through two identical and static cone beam systems with the beams rotated 90 degrees relative to each other, providing a dual set of 2D projections. For reasons of both cost and speed, each 2D detector in these two systems consists of a limited number of line detectors. The quality of the reconstructed images is far from perfect, due to sparse detector data and missing projection angles. In spite of this, we show that by employing a shape- and direction-discriminative technique based on second derivatives, we are able to enhance knot-like features in these data. In the enhanced images it is then possible to detect and localize the pith for each whorl of knots, and subsequently also to perform a full segmentation of the knots in the heartwood.
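
    The second-derivative enhancement can be sketched generically as Hessian eigenvalue filtering: bright elongated structures produce strongly negative principal curvatures. This is a toy version of the idea under assumed parameters (e.g. the scale `sigma`), not the authors' actual filter.

    ```python
    import numpy as np
    from scipy import ndimage

    def knot_enhance(vol, sigma=2.0):
        """Toy Hessian-based enhancement of bright elongated structures
        in a 3D volume, in the spirit of the shape- and direction-
        discriminative second-derivative filtering described above."""
        H = np.empty(vol.shape + (3, 3))
        for i in range(3):
            for j in range(3):
                order = [0, 0, 0]
                order[i] += 1
                order[j] += 1
                # Gaussian-smoothed second derivative d^2 / (dx_i dx_j)
                H[..., i, j] = ndimage.gaussian_filter(
                    vol.astype(float), sigma, order=order)
        eig = np.linalg.eigvalsh(H)  # per-voxel eigenvalues, ascending
        # bright knot-like voxels have two strongly negative eigenvalues
        return np.maximum(-eig[..., 0], 0) * np.maximum(-eig[..., 1], 0)
    ```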

  • 107.
    Forssen, Per-Erik
    et al.
    Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4 Canada.
    Moe, Anders
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
View matching with blob features. 2009. In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 27, no 1-2, p. 99-107. Article in journal (Refereed)
    Abstract [en]

    This article introduces a new region based feature for object recognition and image matching. In contrast to many other region based features, this one makes use of colour in the feature extraction stage. We perform experiments on the repeatability rate of the features across scale and inclination angle changes, and show that avoiding merging regions connected by only a few pixels improves the repeatability. We introduce two voting schemes that allow us to find correspondences automatically, and compare them with respect to the number of valid correspondences they give and their inlier ratios. We also demonstrate how the matching procedure can be applied to colour correction.

  • 108.
    Forssén, Per-Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
Rectifying rolling shutter video from hand-held devices. 2010. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, Los Alamitos, CA, USA: IEEE Computer Society, 2010, p. 507-514. Conference paper (Other academic)
    Abstract [en]

    This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts, we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.
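
    The core of the rolling-shutter model is that every image row has its own capture time, so the camera pose must be interpolated along the frame (the paper places spline knots at the last row of each frame). Below is a minimal sketch of that timing model; the frame period and readout time are assumed inputs, and the choice of interpolation between knots (e.g. SLERP between rotations) is the paper's design, not reproduced here.

    ```python
    import numpy as np

    def row_capture_times(n_frames, n_rows, frame_period, readout_time):
        """Per-row capture times for a rolling-shutter camera.

        Row r of frame f is exposed at t_f + (r / n_rows) * readout_time,
        so a moving camera has a slightly different pose for every row;
        rectification warps each row according to the pose at its time.
        """
        frame_starts = np.arange(n_frames)[:, None] * frame_period
        row_offsets = np.arange(n_rows)[None, :] / n_rows * readout_time
        return frame_starts + row_offsets  # shape (n_frames, n_rows)
    ```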

  • 109.
Åström, Freddie
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
Lenz, Reiner
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, The Institute of Technology.
Color Persistent Anisotropic Diffusion of Images. 2011. In: Image Analysis / [ed] Anders Heyden, Fredrik Kahl, Heidelberg: Springer, 2011, p. 262-272. Conference paper (Refereed)
    Abstract [en]

    Techniques from the theory of partial differential equations are often used to design filter methods that are locally adapted to the image structure. These techniques are usually used in the investigation of gray-value images. The extension to color images is non-trivial, where the choice of an appropriate color space is crucial. The RGB color space is often used although it is known that the space of human color perception is best described in terms of non-Euclidean geometry, which is fundamentally different from the structure of the RGB space. Instead of the standard RGB space, we use a simple color transformation based on the theory of finite groups. It is shown that this transformation reduces the color artifacts originating from the diffusion processes on RGB images. The developed algorithm is evaluated on a set of real-world images, and it is shown that our approach exhibits fewer color artifacts compared to state-of-the-art techniques. Also, our approach preserves details in the image for a larger number of iterations.
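
    For context, the PDE-based filtering the abstract refers to can be illustrated by one explicit Perona-Malik diffusion step on a gray-value image. This is the standard baseline scheme with illustrative parameters; the paper's contribution is the group-theoretical color transform applied on top of such diffusion to avoid color artifacts.

    ```python
    import numpy as np

    def perona_malik_step(u, kappa=0.1, lam=0.2):
        """One explicit Perona-Malik anisotropic diffusion step on a
        gray-value image u (values in [0, 1]); large gradients get a
        small conductivity c, so edges are preserved while flat regions
        are smoothed. Stable for lam <= 0.25."""
        def c(g):
            return np.exp(-(g / kappa) ** 2)
        gN = np.roll(u, -1, axis=0) - u
        gS = np.roll(u, 1, axis=0) - u
        gE = np.roll(u, -1, axis=1) - u
        gW = np.roll(u, 1, axis=1) - u
        return u + lam * (c(gN) * gN + c(gS) * gS + c(gE) * gE + c(gW) * gW)
    ```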

  • 110.
    Fridborn, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision.
Reading Barcodes with Neural Networks. 2017. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Barcodes are ubiquitous in modern society and have had industrial applications for decades. However, modern methods can underperform on noisy images: poor lighting conditions, occlusions and low resolution can be problematic in decoding. This thesis aims to solve this problem by using neural networks, which have enjoyed great success in many computer vision competitions in recent years. We investigate how three different networks perform on data sets with noisy images. The first network is a single classifier, the second network is an ensemble classifier and the third is based on a pre-trained feature extractor. For comparison, we also test two baseline methods that are used in industry today. We generate training data using software and modify it to ensure proper generalization. Testing data is created by photographing barcodes in different settings, creating six image classes: normal, dark, white, rotated, occluded and wrinkled. The proposed single classifier and ensemble classifier outperform the baseline as well as the pre-trained feature extractor by a large margin. The thesis work was performed at SICK IVP, a machine vision company in Linköping, in 2017.

  • 111.
    Fridman, Linnea
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Nordberg, Victoria
    Linköping University, Department of Electrical Engineering, Computer Vision.
Two Multimodal Image Registration Approaches for Positioning Purposes. 2019. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This report is the result of a master thesis carried out by two students at Linköping University. The aim was to find an image registration method for visual and infrared images and to find an error measure for grading the registration performance. In practice this could be used for position determination by registering the infrared image taken at the current position to a set of visual images with known positions and determining which visual image matches the best. Two methods were tried, using different image feature extractors and different ways to match the features. The first method used phase information in the images to generate soft features and then minimised the square error of the optical flow equation to estimate the transformation between the visual and infrared image. The second method used the Canny edge detector to extract hard features from the images and Chamfer distance as an error measure. Both methods were evaluated for registration as well as position determination and yielded promising results. However, the performance of both methods was image dependent. The soft edge method proved to be more robust and precise and worked better than the hard edge method for both registration and position determination.
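
    The hard-edge method's error measure, Chamfer distance, has a compact standard formulation via a distance transform. The sketch below is a generic version of that measure (function and argument names are illustrative), not the thesis code.

    ```python
    import numpy as np
    from scipy import ndimage

    def chamfer_score(edges_ref, edges_query):
        """Chamfer distance between two binary edge maps: the mean
        distance from each query edge pixel to the nearest reference
        edge pixel, read off a distance transform of the reference map."""
        dist_to_ref = ndimage.distance_transform_edt(~edges_ref.astype(bool))
        return dist_to_ref[edges_query.astype(bool)].mean()
    ```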

  • 112.
    Fridolfsson, Olle
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Machine Learning: for Barcode Detection and OCR. 2015. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Machine learning can be utilized in many different ways in the field of automatic manufacturing and logistics. In this thesis, supervised machine learning has been utilized to train classifiers for detection and recognition of objects in images. The techniques AdaBoost and Random forest have been examined; both are based on decision trees.

    The thesis has considered two applications: barcode detection and optical character recognition (OCR). Supervised machine learning methods are highly appropriate in both applications since both barcodes and printed characters are generally rather distinguishable.

    The first part of this thesis examines the use of machine learning for barcode detection in images, both traditional 1D barcodes and the more recent Maxi-codes, a type of two-dimensional barcode. In this part the focus has been to train classifiers with the AdaBoost technique. The Maxi-code detection is mainly done with Local binary pattern features. For detection of 1D codes, features are calculated from the structure tensor (see the sketch after this entry). The classifiers have been evaluated with around 200 real test images containing barcodes, and show promising results.

    The second part of the thesis involves optical character recognition. The focus in this part has been to train a Random forest classifier using point pair features. The performance has also been compared with the more proven and widely used Haar features. The results show that Haar features are superior in terms of accuracy; nevertheless, the conclusion is that point pairs can be utilized as features for Random forest in OCR.
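
    As referenced above, a minimal sketch of structure-tensor features: a 1D barcode region has one strongly dominant local orientation, which shows up as one large and one small eigenvalue of the smoothed tensor. The feature layout and parameters here are illustrative assumptions, not the thesis implementation.

    ```python
    import numpy as np
    from scipy import ndimage

    def structure_tensor_features(img, sigma=2.0):
        """Smoothed structure-tensor components per pixel, stacked as an
        (H, W, 3) array of [Jxx, Jxy, Jyy]."""
        f = img.astype(float)
        gx = ndimage.sobel(f, axis=1)
        gy = ndimage.sobel(f, axis=0)
        Jxx = ndimage.gaussian_filter(gx * gx, sigma)
        Jxy = ndimage.gaussian_filter(gx * gy, sigma)
        Jyy = ndimage.gaussian_filter(gy * gy, sigma)
        return np.stack([Jxx, Jxy, Jyy], axis=-1)
    ```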

  • 113.
    Gasslander, Maja
    Linköping University, Department of Electrical Engineering, Computer Vision.
Segmentation of Clouds in Satellite Images. 2016. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The usage of 3D modelling is increasing fast, both for civilian and military areas, such as navigation, targeting and urban planning. When creating a 3D model from satellite images, clouds can be problematic. Thus, automatic detection of clouds in the images is of great use. This master thesis was carried out at Vricon, who produces 3D models of the earth from satellite images. This thesis aimed to investigate if Support Vector Machines could classify pixels into cloud or non-cloud, with a combination of texture and color as features. To solve the stated goal, the task was divided into several subproblems, where the first part was to extract features from the images. Then the images were preprocessed before being fed to the classifier. After that, the classifier was trained, and finally evaluated. The two methods that gave the best results in this thesis had approximately 95 % correctly classified pixels. This result is better than the existing cloud segmentation method at Vricon, for the tested terrain and cloud types.
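
    A pixel-wise SVM cloud classifier of the kind described can be sketched in a few lines with scikit-learn. The color-plus-local-variance features below are illustrative stand-ins for the thesis's texture and color features, and `train_img`, `train_mask`, `test_img` are hypothetical inputs.

    ```python
    import numpy as np
    from scipy import ndimage
    from sklearn.svm import SVC

    def pixel_features(img):
        """img: float RGB array of shape (H, W, 3) in [0, 1].
        Returns an (H*W, 4) matrix: R, G, B plus the local variance of
        the gray image as a crude texture cue."""
        gray = img.mean(axis=2)
        local_var = (ndimage.uniform_filter(gray ** 2, size=5)
                     - ndimage.uniform_filter(gray, size=5) ** 2)
        return np.dstack([img, local_var]).reshape(-1, 4)

    # Training and prediction on labeled example images (cloud = 1):
    # clf = SVC(kernel="rbf").fit(pixel_features(train_img), train_mask.ravel())
    # pred = clf.predict(pixel_features(test_img)).reshape(test_img.shape[:2])
    ```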

  • 114.
    Gladh, Susanna
    Linköping University, Department of Electrical Engineering, Computer Vision.
Visual Tracking Using Deep Motion Features. 2016. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Generic visual tracking is a challenging computer vision problem, where the position of a specified target is estimated through a sequence of frames. The only given information is the initial location of the target. Therefore, the tracker has to adapt and learn any kind of object, which it describes through visual features used to differentiate target from background. Standard appearance features only capture momentary visual information. This master’s thesis investigates the use of deep features extracted through optical flow images processed in a deep convolutional network. The optical flow is calculated using two consecutive images, and thereby captures the dynamic nature of the scene. Results show that this information is complementary to the standard appearance features, and improves performance of the tracker. Deep features are typically very high dimensional. Employing dimensionality reduction can increase both the efficiency and performance of the tracker. As a second aim in this thesis, PCA and PLS were evaluated and compared. The evaluations show that the two methods are almost equal in performance, with PLS actually receiving slightly better score than the popular PCA. The final proposed tracker was evaluated on three challenging datasets, and was shown to outperform other state-of-the-art trackers.

  • 115.
    Gladh, Susanna
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Danelljan, Martin
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Khan, Fahad Shahbaz
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Felsberg, Michael
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
Deep motion features for visual tracking. 2016. In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1243-1248. Conference paper (Refereed)
    Abstract [en]

    Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
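
    As the abstract describes, deep motion features are obtained by rendering dense optical flow as an image and feeding it to a CNN trained on such flow images. A minimal sketch with OpenCV follows; the Farneback flow and the three-channel (x-flow, y-flow, magnitude) encoding are common conventions from the action-recognition literature, assumed here rather than taken from the paper.

    ```python
    import cv2
    import numpy as np

    def flow_image(prev_gray, curr_gray):
        """Render dense optical flow between two consecutive gray frames
        as a 3-channel uint8 image (x-flow, y-flow, magnitude), ready to
        be fed to a CNN trained on optical flow images."""
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)
        chans = [flow[..., 0], flow[..., 1], mag]
        scaled = [cv2.normalize(c, None, 0, 255, cv2.NORM_MINMAX)
                  for c in chans]
        return np.dstack(scaled).astype(np.uint8)
    ```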

  • 116.
    Grahn, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Nilsson, Kristian
    Linköping University, Department of Electrical Engineering, Computer Vision.
Object Detection in Domain Specific Stereo-Analysed Satellite Images. 2019. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification. One of these classification methods uses domain knowledge to classify objects while the other uses Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method in most of the evaluation metrics. The second approach is more conventional, with different types of Convolutional Neural Networks: a segmentation network was used as well as a few detection networks and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which potentially could have impacted the results of the Machine Learning models in a negative way.

  • 117.
    Grandell, Oscar
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
An iterative reconstruction algorithm for quantitative tissue decomposition using DECT. 2012. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The introduction of dual energy CT (DECT) in the field of medical healthcare has made it possible to extract more information from the scanned objects. This in turn has the potential to improve the accuracy in radiation therapy dose planning. One problem that remains before successful material decomposition can be achieved, however, is the presence of beam hardening and scatter artifacts that arise in a scan. Methods currently in clinical use for removal of beam hardening often bias the CT numbers. Hence, the possibility for an appropriate tissue decomposition is limited.

    Here, a method for successful decomposition as well as removal of the beam hardening artifact is presented. The method uses effective linear attenuations for the five base materials (water, protein, adipose, cortical bone and marrow) to perform the decomposition on reconstructed simulated data. This is performed inside an iterative loop together with the polychromatic x-ray spectra to remove the beam hardening.

  • 118.
    Grankvist, Ola
    Linköping University, Department of Electrical Engineering, Computer Vision.
Recognition and Registration of 3D Models in Depth Sensor Data. 2016. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Object Recognition is the art of localizing predefined objects in image sensor data. In this thesis a depth sensor was used which has the benefit that the 3D pose of the object can be estimated. This has applications in e.g. automatic manufacturing, where a robot picks up parts or tools with a robot arm.

    This master thesis presents an implementation and an evaluation of a system for object recognition of 3D models in depth sensor data. The system uses several depth images rendered from a 3D model and describes their characteristics using so-called feature descriptors. These are then matched with the descriptors of a scene depth image to find the 3D pose of the model in the scene. The pose estimate is then refined iteratively using a registration method. Different descriptors and registration methods are investigated.

    One of the main contributions of this thesis is that it compares two different types of descriptors, local and global, a comparison that has seen little attention in research. This is done for two different scene scenarios, and for different types of objects and depth sensors. The evaluation shows that global descriptors are fast and robust for objects with a smooth visible surface, whereas the local descriptors perform better for larger objects in clutter and occlusion. This thesis also presents a novel global descriptor, the CESF, which is observed to be more robust than other global descriptors. As for the registration methods, ICP is shown to perform most accurately, and ICP point-to-plane most robustly.

  • 119.
    Gratorp, Eric
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
Evaluation of online hardware video stabilization on a moving platform. 2013. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Recording a video sequence with a camera during movement often produces blurred results. This is mainly due to motion blur which is caused by rapid movement of objects in the scene or the camera during recording. By correcting for changes in the orientation of the camera, caused by e.g. uneven terrain, it is possible to minimize the motion blur and thus, produce a stabilized video.

    In order to do this, data gathered from a gyroscope and the camera itself can be used to measure the orientation of the camera. The raw data needs to be processed, synchronized and filtered to produce a robust estimate of the orientation. This estimate can then be used as input to an automatic control system in order to correct for changes in the orientation.
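
    A common minimal form of such gyroscope/vision fusion is a complementary filter: integrate the gyroscope rate for short-term accuracy and pull toward the drift-free but noisier vision-based estimate for long-term stability. This generic sketch illustrates the fusion step the thesis discusses and is not the author's implementation.

    ```python
    def fuse_orientation(angle_prev, gyro_rate, vision_angle, dt, alpha=0.98):
        """One step of a complementary filter. The gyro integral tracks
        fast motion; the (1 - alpha) pull toward the vision angle removes
        the slow gyroscope drift."""
        return alpha * (angle_prev + gyro_rate * dt) + (1.0 - alpha) * vision_angle

    # angle = fuse_orientation(angle, gyro_sample, camera_angle, dt=0.01)
    ```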

    This thesis focuses on examining the possibility of such a stabilization. The actual stabilization is left for future work. An evaluation of the hardware as well as the implemented methods is done with emphasis on speed, which is crucial in real-time computing.

  • 120.
    Grelsson, Bertil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
Global Pose Estimation from Aerial Images: Registration with Elevation Models. 2014. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Over the last decade, the use of unmanned aerial vehicles (UAVs) has increased drastically. Originally, the use of these aircraft was mainly military, but today many civil applications have emerged. UAVs are frequently the preferred choice for surveillance missions in disaster areas, after earthquakes or hurricanes, and in hazardous environments, e.g. for detection of nuclear radiation. The UAVs employed in these missions are often relatively small in size which implies payload restrictions.

    For navigation of the UAVs, continuous global pose (position and attitude) estimation is mandatory. Cameras can be fabricated both small in size and light in weight. This makes vision-based methods well suited for pose estimation onboard these vehicles. It is obvious that no single method can be used for pose estimation in all different phases throughout a flight. The image content will be very different on the runway, during ascent, during flight at low or high altitude, above urban or rural areas, etc. In total, a multitude of pose estimation methods is required to handle all these situations. Over the years, a large number of vision-based pose estimation methods for aerial images have been developed. But there are still open research areas within this field, e.g. the use of omnidirectional images for pose estimation is relatively unexplored.

    The contributions of this thesis are three vision-based methods for global ego-positioning and/or attitude estimation from aerial images. The first method for full 6DoF (degrees of freedom) pose estimation is based on registration of local height information with a geo-referenced 3D model. A dense local height map is computed using motion stereo. A pose estimate from navigation sensors is used as an initialization. The global pose is inferred from the 3D similarity transform between the local height map and the 3D model. Aligning height information is assumed to be more robust to season variations than feature matching in a single-view based approach.

    The second contribution is a method for attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods in the literature that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some knowledge on the probability density for the altitude and the attitude angles. The proposed method allows this prior information to be used to make the attitude estimation more robust.

    The third contribution is a further development of method two. It is the very first method presented where the attitude estimates from the detected horizon in omnidirectional images are refined through registration with the geometrically expected horizon from a digital elevation model. It is one of few methods where the ray refraction in the atmosphere is taken into account, which contributes to the highly accurate pose estimates. The attitude errors obtained are about one order of magnitude smaller than for any previous vision-based method for attitude estimation from horizon detection in aerial images.

    List of papers
    1. Efficient 7D Aerial Pose Estimation
2013 (English). In: 2013 IEEE Workshop on Robot Vision (WORV), IEEE, 2013, p. 88-95. Conference paper, Published paper (Refereed)
    Abstract [en]

    A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization, and our method is more robust than competing methods on similar datasets. The proposed matching error computed between the transformed patch and the map clearly indicates whether a reliable pose estimate has been obtained.

    Place, publisher, year, edition, pages
    IEEE, 2013
    Keywords
    Pose estimation, aerial images, registration, 3D model
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
urn:nbn:se:liu:diva-89477 (URN); 10.1109/WORV.2013.6521919 (DOI); 000325279400014; 978-1-4673-5646-6 (ISBN); 978-1-4673-5647-3 (ISBN)
    Conference
    IEEE Workshop on Robot Vision 2013, Clearwater Beach, Florida, USA, January 16-17, 2013
    Available from: 2013-02-26 Created: 2013-02-26 Last updated: 2019-04-12
    2. Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images
2013 (English). In: Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013, Proceedings / [ed] Joni-Kristian Kämäräinen and Markus Koskela, Springer Berlin/Heidelberg, 2013, p. 478-488. Conference paper, Published paper (Refereed)
    Abstract [en]

    For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.
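
    The probabilistic Hough voting can be sketched as weighted accumulation over a grid of attitude hypotheses: each edge pixel votes, with a weight derived from the altitude/attitude priors, for the (pitch, roll) cells whose predicted horizon curve passes near it. Everything below (the `predict_horizon_mask` callback, the bin grids, the weights) is an illustrative assumption about the structure of such a voter, not the paper's implementation.

    ```python
    import numpy as np

    def attitude_hough(edge_px, weights, pitch_bins, roll_bins,
                       predict_horizon_mask):
        """Weighted Hough voting over (pitch, roll) attitude hypotheses.

        edge_px: (N, 2) edge pixel coordinates; weights: (N,) prior-based
        vote weights; predict_horizon_mask(pitch, roll, edge_px) -> (N,)
        bool array saying which pixels lie near the horizon predicted by
        the fisheye camera model for that attitude."""
        acc = np.zeros((len(pitch_bins), len(roll_bins)))
        for i, pitch in enumerate(pitch_bins):
            for j, roll in enumerate(roll_bins):
                near = predict_horizon_mask(pitch, roll, edge_px)
                acc[i, j] = np.sum(weights * near)
        i, j = np.unravel_index(np.argmax(acc), acc.shape)
        return pitch_bins[i], roll_bins[j]
    ```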

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 7944
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
urn:nbn:se:liu:diva-98066 (URN); 10.1007/978-3-642-38886-6_45 (DOI); 000342988500045; 978-3-642-38885-9 (ISBN); 978-3-642-38886-6 (ISBN)
    Conference
    18th Scandinavian Conferences on Image Analysis (SCIA 2013), 17-20 June 2013, Espoo, Finland.
    Projects
    CIMSMAP
Available from: 2013-09-27 Created: 2013-09-27 Last updated: 2019-04-12. Bibliographically approved
    3. Highly Accurate Attitude Estimation via Horizon Detection
2016 (English). In: Journal of Field Robotics, ISSN 1556-4959, E-ISSN 1556-4967, Vol. 33, no 7, p. 967-993. Article in journal (Refereed). Published
    Abstract [en]

    Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.

    Place, publisher, year, edition, pages
    John Wiley & Sons, 2016
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
urn:nbn:se:liu:diva-108212 (URN); 10.1002/rob.21639 (DOI); 000387925400005
    Note

    At the date of the thesis presentation, this publication was a manuscript.

    Funding agencies: Swedish Governmental Agency for Innovation Systems, VINNOVA [NFFP5 2013-05243]; Swedish Foundation for Strategic Research [RIT10-0047]; Swedish Research Council within the Linnaeus environment CADICS; Knut and Alice Wallenberg Foundation

Available from: 2014-06-26 Created: 2014-06-26 Last updated: 2019-04-12. Bibliographically approved
  • 121.
    Grelsson, Bertil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Vision-based Localization and Attitude Estimation Methods in Natural Environments. 2019. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Over the last decade, the usage of unmanned systems such as Unmanned Aerial Vehicles (UAVs), Unmanned Surface Vessels (USVs) and Unmanned Ground Vehicles (UGVs) has increased drastically, and there is still a rapid growth. Today, unmanned systems are being deployed in many daily operations, e.g. for deliveries in remote areas, to increase efficiency of agriculture, and for environmental monitoring at sea. For safety reasons, unmanned systems are often the preferred choice for surveillance missions in hazardous environments, e.g. for detection of nuclear radiation, and in disaster areas after earthquakes, hurricanes, or during forest fires. For safe navigation of the unmanned systems during their missions, continuous and accurate global localization and attitude estimation is mandatory.

    Over the years, many vision-based methods for position estimation have been developed, primarily for urban areas. In contrast, this thesis is mainly focused on vision-based methods for accurate position and attitude estimates in natural environments, i.e. beyond the urban areas. Vision-based methods possess several characteristics that make them appealing as global position and attitude sensors. First, vision sensors can be realized and tailored for most unmanned vehicle applications. Second, geo-referenced terrain models can be generated worldwide from satellite imagery and can be stored onboard the vehicles. In natural environments, where the availability of geo-referenced images in general is low, registration of image information with terrain models is the natural choice for position and attitude estimation. This is the problem area that I addressed in the contributions of this thesis.

    The first contribution is a method for full 6DoF (degrees of freedom) pose estimation from aerial images. A dense local height map is computed using structure from motion. The global pose is inferred from the 3D similarity transform between the local height map and a digital elevation model. Aligning height information is assumed to be more robust to season variations than feature-based matching.

    The second contribution is a method for accurate attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. The method allows prior knowledge of the attitude angles to be exploited to make the initial attitude estimates more robust. The estimates are then refined through registration with the geometrically expected horizon line from a digital elevation model. To the best of our knowledge, it is the first method where the ray refraction in the atmosphere is taken into account, which enables the highly accurate attitude estimates.

    The third contribution is a method for position estimation based on horizon detection in an omnidirectional panoramic image around a surface vessel. Two convolutional neural networks (CNNs) are designed and trained to estimate the camera orientation and to segment the horizon line in the image. The MOSSE correlation filter, normally used in visual object tracking, is adapted to horizon line registration with geometric data from a digital elevation model. Comprehensive field trials conducted in the archipelago demonstrate the GPS-level accuracy of the method, and that the method can be trained on images from one region and then applied to images from a previously unvisited test area.
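
    The MOSSE correlation filter mentioned here has a compact closed form in the Fourier domain. The sketch below is the generic MOSSE construction (training FFTs F_i and a desired response G); its adaptation to horizon-line registration against elevation-model geometry is the thesis contribution and is not reproduced here.

    ```python
    import numpy as np

    def mosse_filter(train_ffts, goal_fft, eps=1e-3):
        """Closed-form MOSSE filter:
        H* = sum_i G * conj(F_i) / (sum_i F_i * conj(F_i) + eps).
        Apply by multiplying the FFT of a new signal with H* and
        locating the peak of the inverse-transformed response."""
        num = sum(goal_fft * np.conj(F) for F in train_ffts)
        den = sum(F * np.conj(F) for F in train_ffts) + eps
        return num / den

    # resp = np.fft.ifft2(np.fft.fft2(test_patch) * mosse_filter(Fs, G))
    ```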

    The CNNs in the third contribution apply the typical scheme of convolutions, activations, and pooling. The fourth contribution focuses on the activations and suggests a new formulation to tune and optimize a piecewise linear activation function during training of CNNs. Improved classification results from experiments when tuning the activation function led to the introduction of a new activation function, the Shifted Exponential Linear Unit (ShELU).

    List of papers
    1. Efficient 7D Aerial Pose Estimation
2013 (English). In: 2013 IEEE Workshop on Robot Vision (WORV), IEEE, 2013, p. 88-95. Conference paper, Published paper (Refereed)
    Abstract [en]

    A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization, and our method is more robust than competing methods on similar datasets. The proposed matching error computed between the transformed patch and the map clearly indicates whether a reliable pose estimate has been obtained.

    Place, publisher, year, edition, pages
    IEEE, 2013
    Keywords
    Pose estimation, aerial images, registration, 3D model
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
urn:nbn:se:liu:diva-89477 (URN); 10.1109/WORV.2013.6521919 (DOI); 000325279400014; 978-1-4673-5646-6 (ISBN); 978-1-4673-5647-3 (ISBN)
    Conference
    IEEE Workshop on Robot Vision 2013, Clearwater Beach, Florida, USA, January 16-17, 2013
    Available from: 2013-02-26 Created: 2013-02-26 Last updated: 2019-04-12
    2. Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images
2013 (English). In: Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013, Proceedings / [ed] Joni-Kristian Kämäräinen and Markus Koskela, Springer Berlin/Heidelberg, 2013, p. 478-488. Conference paper, Published paper (Refereed)
    Abstract [en]

    For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 7944
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
urn:nbn:se:liu:diva-98066 (URN); 10.1007/978-3-642-38886-6_45 (DOI); 000342988500045; 978-3-642-38885-9 (ISBN); 978-3-642-38886-6 (ISBN)
    Conference
    18th Scandinavian Conferences on Image Analysis (SCIA 2013), 17-20 June 2013, Espoo, Finland.
    Projects
    CIMSMAP
Available from: 2013-09-27 Created: 2013-09-27 Last updated: 2019-04-12. Bibliographically approved
    3. Highly Accurate Attitude Estimation via Horizon Detection
2016 (English). In: Journal of Field Robotics, ISSN 1556-4959, E-ISSN 1556-4967, Vol. 33, no 7, p. 967-993. Article in journal (Refereed). Published
    Abstract [en]

    Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.

    Place, publisher, year, edition, pages
    John Wiley & Sons, 2016
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-108212 (URN)10.1002/rob.21639 (DOI)000387925400005 ()
    Note

    At the date of the thesis presentation, this publication was a manuscript.

    Funding agencies: Swedish Governmental Agency for Innovation Systems, VINNOVA [NFFP5 2013-05243]; Swedish Foundation for Strategic Research [RIT10-0047]; Swedish Research Council within the Linnaeus environment CADICS; Knut and Alice Wallenberg Foundation

    Available from: 2014-06-26 Created: 2014-06-26 Last updated: 2019-04-12Bibliographically approved
    4. Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs)
    Open this publication in new window or tab >>Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs)
    2018 (English)In: 2018 24th International Conference on Pattern Recognition (ICPR), IEEE, 2018, p. 517-522Conference paper, Published paper (Refereed)
    Abstract [en]

    The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs, and it is continuously differentiable. However, it remains open whether ELU has the optimal shape, and we address the quest for a superior activation function. We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting learned activation function proves to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function. Experiments on Cifar-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.
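
    One plausible reading of the ShELU, sketched below, is an ELU translated by a learned per-neuron shift so that it need not pass through the origin; the exact parameterization is defined in the paper, and the shift value here is arbitrary:

        import numpy as np

        def elu(x, alpha=1.0):
            # Standard ELU: identity for x > 0, saturating exponential below.
            return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

        def shelu(x, shift):
            # Shifted ELU: in training, shift would be one trainable
            # parameter per neuron (an assumption of this sketch).
            return elu(x + shift)

        x = np.linspace(-3.0, 3.0, 7)
        print(shelu(x, shift=0.2))  # nonzero at x = 0, unlike a plain ELU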

    Place, publisher, year, edition, pages
    IEEE, 2018
    Series
    International Conference on Pattern Recognition
    Keywords
    CNN, activation function
    National Category
    Other Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-151606 (URN)10.1109/ICPR.2018.8545104 (DOI)000455146800087 ()978-1-5386-3787-6 (ISBN)
    Conference
    24th International Conference on Pattern Recognition, ICPR 2018, Beijing, China, 20-24 Aug. 2018
    Funder
    Wallenberg Foundations
    Note

    Funding agencies:  Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation; Swedish Research Council [2014-6227]

    Available from: 2018-09-27 Created: 2018-09-27 Last updated: 2019-10-31
  • 122.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs)2018In: 2018 24th International Conference on Pattern Recognition (ICPR), IEEE, 2018, p. 517-522Conference paper (Refereed)
    Abstract [en]

    The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs, and it is continuously differentiable. However, it remains open whether ELU has the optimal shape, and we address the quest for a superior activation function. We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting learned activation function proves to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function. Experiments on Cifar-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.

  • 123.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Performance boost in Convolutional Neural Networks by tuning shifted activation functions2017Report (Other academic)
    Abstract [en]

    The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs and it is continuously differentiable. However, it remains open whether ELU has the optimal shape and we address the quest for a superior activation function.

    We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting learned activation function proves to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function.

    Experiments on Cifar-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.

  • 124.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images2013In: Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013. Proceedings / [ed] Joni-Kristian Kämäräinen and Markus Koskela, Springer Berlin/Heidelberg, 2013, p. 478-488Conference paper (Refereed)
    Abstract [en]

    For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme.  In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.

  • 125.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Isaksson, Folke
    Efficient 7D Aerial Pose Estimation2013In: 2013 IEEE Workshop on Robot Vision (WORV), IEEE , 2013, p. 88-95Conference paper (Refereed)
    Abstract [en]

    A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization, and our method is more robust than competing methods on similar datasets. The proposed matching error computed between the transformed patch and the map clearly indicates whether a reliable pose estimate has been obtained.
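
    As an example of the kind of 2D subproblem involved, the sketch below fits a least-squares 2D similarity transform (scale, rotation, translation) between matched point sets; the paper's actual decomposition of the 3D transform is its own and is not reproduced here:

        import numpy as np

        def fit_similarity_2d(src, dst):
            # Least-squares 2D similarity between matched point sets,
            # via the complex-number formulation of 2D rotation + scale.
            mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
            zs = (src - mu_s) @ np.array([1.0, 1j])   # points as complex numbers
            zd = (dst - mu_d) @ np.array([1.0, 1j])
            a = np.vdot(zs, zd) / np.vdot(zs, zs)     # = scale * exp(i * angle)
            scale, angle = np.abs(a), np.angle(a)
            c, s = np.cos(angle), np.sin(angle)
            R = np.array([[c, -s], [s, c]])
            t = mu_d - scale * (R @ mu_s)
            return scale, angle, t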

  • 126.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Saab Dynamics, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Isaksson, Folke
    Vricon Systems, Saab, Linköping, Sweden.
    Highly Accurate Attitude Estimation via Horizon Detection2016In: Journal of Field Robotics, ISSN 1556-4959, E-ISSN 1556-4967, Vol. 33, no 7, p. 967-993Article in journal (Refereed)
    Abstract [en]

    Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the Earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.

  • 127.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Robinson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    HorizonNet for visual terrain navigation2018In: Proceedings of 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS), Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 149-155Conference paper (Refereed)
    Abstract [en]

    This paper investigates the problem of position estimation of unmanned surface vessels (USVs) operating in coastal areas or in the archipelago. We propose a position estimation method where the horizon line is extracted in a 360 degree panoramic image around the USV. We design a CNN architecture to determine an approximate horizon line in the image and implicitly determine the camera orientation (the pitch and roll angles). The panoramic image is warped to compensate for the camera orientation and to generate an image from an approximately level camera. A second CNN architecture is designed to extract the pixelwise horizon line in the warped image. The extracted horizon line is correlated with digital elevation model (DEM) data in the Fourier domain using a MOSSE correlation filter. Finally, we determine the location of the maximum correlation score over the search area to estimate the position of the USV. Comprehensive experiments are performed in a field trial in the archipelago. Our approach provides promising results by achieving position estimates with GPS-level accuracy.
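
    The core of the matching step is correlation in the Fourier domain; the sketch below shows plain FFT-based circular cross-correlation of 1D horizon signatures, the operation a MOSSE filter builds on (the filter's learned regularization is omitted):

        import numpy as np

        def fft_correlate(query, reference):
            # Circular cross-correlation of two equal-length 1D signatures;
            # the peak gives the best cyclic shift (e.g. heading hypothesis).
            Q = np.fft.fft(query - query.mean())
            R = np.fft.fft(reference - reference.mean())
            return np.fft.ifft(Q * np.conj(R)).real

        # Position search: keep the DEM-generated horizon with the highest
        # peak over the search area, e.g.
        # best = max(dem_horizons, key=lambda h: fft_correlate(observed, h).max())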

  • 128.
    Grundström, Tobias
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Automated Measurements of Liver Fat Using Machine Learning2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The purpose of the thesis was to investigate the possibility of using machine learning for automation of liver fat measurements in fat-water magnetic resonance imaging (MRI). The thesis presents methods for texture based liver classification and Proton Density Fat Fraction (PDFF) regression using multi-layer perceptrons utilizing 2D and 3D textural image features. The first proposed method was a data classification method with the goal to distinguish between suitable and unsuitable regions to measure PDFF in. The second proposed method was a combined classification and regression method where the classification distinguishes between liver and non-liver tissue. The goal of the regression model was to predict the difference d = PDFF_mean − PDFF_ROI between the manual ground truth mean and the fat fraction of the active Region of Interest (ROI). Tests were performed on varying sizes of Image Feature Regions (froi) and combinations of image features on both of the proposed methods. The tests showed that 3D measurements using image features from discrete wavelet transforms produced measurements similar to the manual fat measurements. The first method resulted in lower relative errors while the second method had a higher method agreement compared to manual measurements.
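
    A literal transcription of the regression target from the abstract, with illustrative array names:

        import numpy as np

        def regression_target(pdff_map, liver_mask, roi_mask):
            # d = PDFF_mean - PDFF_ROI: manual ground-truth liver mean minus
            # the fat fraction of the active ROI (masks are boolean arrays).
            pdff_mean = pdff_map[liver_mask].mean()
            pdff_roi = pdff_map[roi_mask].mean()
            return pdff_mean - pdff_roi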

  • 129.
    Grönlund, Jakob
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Angelina
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Defect Detection and OCR on Steel2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In large scale productions of metal sheets, it is important to maintain an effective way to continuously inspect the products passing through the production line. The inspection mainly consists of detection of defects and tracking of ID numbers. This thesis investigates the possibilities to create an automatic inspection system by evaluating different machine learning algorithms for defect detection and optical character recognition (OCR) on metal sheet data. Digit recognition and defect detection are solved separately, where the former compares the object detection algorithm Faster R-CNN and the classical machine learning algorithm NCGF, and the latter is based on unsupervised learning using a convolutional autoencoder (CAE).

    The advantage of the feature extraction method is that it only needs a couple of samples to be able to classify new digits, which is desirable in this case due to the lack of training data. Faster R-CNN, on the other hand, needs much more training data to solve the same problem. NCGF does however fail to classify noisy images and images of metal sheets containing an alloy, while Faster R-CNN seems to be a more promising solution with a final mean average precision of 98.59%.

    The CAE approach for defect detection showed promising result. The algorithm learned how to only reconstruct images without defects, resulting in reconstruction errors whenever a defect appears. The errors are initially classified using a basic thresholding approach, resulting in a 98.9% accuracy. However, this classifier requires supervised learning, which is why the clustering algorithm Gaussian mixture model (GMM) is investigated as well. The result shows that it should be possible to use GMM, but that it requires a lot of GPU resources to use it in an end-to-end solution with a CAE.
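
    A minimal sketch of the reconstruction-error classifier described above, assuming the autoencoder was trained on defect-free images only; the threshold is a free parameter chosen on validation data:

        import numpy as np

        def defect_scores(images, reconstructions):
            # Mean squared reconstruction error per image; defects were never
            # seen during training, so they reconstruct poorly and score high.
            return np.mean((images - reconstructions) ** 2, axis=(1, 2))

        def is_defect(scores, threshold):
            # The thesis reports 98.9% accuracy with a simple threshold of
            # this kind (the value itself is data-dependent).
            return scores > threshold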

  • 130.
    Gustafsson, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Linder-Norén, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Automotive 3D Object Detection Without Target Domain Annotations2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In this thesis we study a perception problem in the context of autonomous driving. Specifically, we study the computer vision problem of 3D object detection, in which objects should be detected from various sensor data and their position in the 3D world should be estimated. We also study the application of Generative Adversarial Networks in domain adaptation techniques, aiming to improve the 3D object detection model's ability to transfer between different domains.

    The state-of-the-art Frustum-PointNet architecture for LiDAR-based 3D object detection was implemented and found to closely match its reported performance when trained and evaluated on the KITTI dataset. The architecture was also found to transfer reasonably well from the synthetic SYN dataset to KITTI, and is thus believed to be usable in a semi-automatic 3D bounding box annotation process. The Frustum-PointNet architecture was also extended to explicitly utilize image features, which surprisingly degraded its detection performance. Furthermore, an image-only 3D object detection model was designed and implemented, which was found to compare quite favourably with current state-of-the-art in terms of detection performance.

    Additionally, the PixelDA approach was adopted and successfully applied to the MNIST to MNIST-M domain adaptation problem, which validated the idea that unsupervised domain adaptation using Generative Adversarial Networks can improve the performance of a task network for a dataset lacking ground truth annotations. Surprisingly, the approach did however not significantly improve upon the performance of the image-based 3D object detection models when trained on the SYN dataset and evaluated on KITTI.

  • 131.
    Habrman, David
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Face Recognition with Preprocessing and Neural Networks2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Face recognition is the problem of identifying individuals in images. This thesis evaluates two methods used to determine if pairs of face images belong to the same individual or not. The first method is a combination of principal component analysis and a neural network and the second method is based on state-of-the-art convolutional neural networks. They are trained and evaluated using two different data sets. The first set contains many images with large variations in, for example, illumination and facial expression. The second consists of fewer images with small variations.

    Principal component analysis allowed the use of smaller networks. The largest network has 1.7 million parameters compared to the 7 million used in the convolutional network. The use of smaller networks lowered the training time and evaluation time significantly. Principal component analysis proved to be well suited for the data set with small variations, outperforming the convolutional network, which needs larger data sets to avoid overfitting. The reduction in data dimensionality, however, led to difficulties classifying the data set with large variations. The generous amount of images in this set allowed the convolutional method to reach higher accuracies than the principal component method.
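
    A sketch of the dimensionality-reduction step that makes the smaller network possible, using a plain SVD-based PCA (the thesis pipeline details may differ):

        import numpy as np

        def pca_project(X, n_components):
            # X: one flattened face image per row. SVD of the centered data;
            # rows of Vt are the principal directions (eigenfaces).
            Xc = X - X.mean(axis=0)
            U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
            return Xc @ Vt[:n_components].T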

  • 132.
    Hanning, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Video Stabilization and Rolling Shutter Correction using Inertial Measurement Sensors2011Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a rolling shutter camera. A rolling shutter camera captures video by recording every frame line-by-line from top to bottom of the image, leading to image distortions in situations where either the device or the target is moving. Recording video by hand also leads to visible frame-to-frame jitter.

    In this thesis, methods to decrease distortion caused by the motion of a video-recording device with a rolling shutter camera are presented. The methods are based on estimating the orientation of the camera from gyroscope and accelerometer measurements.

    The algorithms are implemented on the iPod Touch 4, and the resulting videos are compared to those of competing stabilization software, both commercial and free, in a series of blind experiments. The results from this user study show that the methods presented in the thesis perform as well as or better than the others.

  • 133.
    Hanning, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forslöw, Nicklas
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Törnqvist, David
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Callmer, Jonas
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Stabilizing Cell Phone Video using Inertial Measurement Sensors2011In: The Second IEEE International Workshop on Mobile Vision, Barcelona Spain, 2011, p. 1-8Conference paper (Other academic)
    Abstract [en]

    We present a system that rectifies and stabilizes video sequences on mobile devices with rolling-shutter cameras. The system corrects for rolling-shutter distortions using measurements from accelerometer and gyroscope sensors, and a 3D rotational distortion model. In order to obtain a stabilized video, and at the same time keep most content in view, we propose an adaptive low-pass filter algorithm to obtain the output camera trajectory. The accuracy of the orientation estimates has been evaluated experimentally using ground truth data from a motion capture system. We have conducted a user study, where the output from our system, implemented in iOS, has been compared to that of three other applications, as well as to the uncorrected video. The study shows that users prefer our sensor-based system.
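
    The adaptation idea can be illustrated on a scalar orientation signal: smooth gently while the stabilized trajectory stays close to the measured one, and track faster when it drifts, so that the content stays in view. The real system filters 3D rotations; the thresholds below are made up:

        import numpy as np

        def smooth_trajectory(angles, alpha_min=0.02, alpha_max=0.3, err_tol=0.05):
            # Adaptive first-order low-pass filter: the smoothing coefficient
            # grows when the smoothed path drifts too far from the input.
            out = np.empty_like(angles, dtype=float)
            out[0] = angles[0]
            for t in range(1, len(angles)):
                err = abs(out[t - 1] - angles[t])
                alpha = alpha_max if err > err_tol else alpha_min
                out[t] = (1.0 - alpha) * out[t - 1] + alpha * angles[t]
            return out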

  • 134.
    Hansson, Niklas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Color Features for Boosted Pedestrian Detection2015Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The car has become more and more intelligent throughout the years. Today's radar and vision based safety systems can warn a driver and brake the vehicle automatically if obstacles are detected. Research projects such as the Google Car have even succeeded in creating fully autonomous cars.

    The demands to obtain the highest rating in safety tests such as Euro NCAP are also steadily increasing, and as a result, the development of these systems has become more attractive for car manufacturers. In the near future, a car must have a system for detecting pedestrians, and braking for them automatically, to receive the highest safety rating of five stars. The prospect is that the volume of active safety systems will increase drastically when the car manufacturers start installing them in not only luxury cars, but also in the regularly priced ones. The use of automatic braking comes with a high demand on the performance of active safety systems: false positives must be avoided at all costs.

    Dollár et al. [2014] introduced Aggregated Channel Features (ACF), which is based on a 10-channel LUV+HOG feature map. The method uses decision trees learned from boosting and has been shown to outperform previous algorithms in object detection tasks. The rediscovery of neural networks, and especially Convolutional Neural Networks (CNN), has increased the performance in almost every field of machine learning, including pedestrian detection. Recently, Yang et al. [2015] combined the two approaches by using the feature maps from a CNN as input to a decision tree based boosting framework. This resulted in state-of-the-art performance on the challenging Caltech pedestrian data set.

    This thesis presents an approach to improve the performance of a cascade of boosted classifiers by investigating the impact of using color information for pedestrian detection. The color self-similarity feature introduced by Walk et al. [2010] was used to create a version better adapted for boosting. This feature is then used in combination with a gradient based feature at the last step of a cascade.

    The presented feature increases the performance compared to currently used classifiers at Autoliv, on data recorded by Autoliv and on the benchmark Caltech pedestrian data set.
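
    A simplified version of a boosting-friendly color self-similarity feature: pairwise similarities between block color histograms become one feature vector. Walk et al. use HSV histograms and histogram intersection; the sketch below substitutes normalized dot products and assumes a color image already quantized to 16 integer levels:

        import numpy as np

        def color_self_similarity(quantized, block=8):
            # quantized: (H, W) integer image with values in [0, 16).
            H, W = quantized.shape
            hists = []
            for by in range(H // block):
                for bx in range(W // block):
                    patch = quantized[by*block:(by+1)*block, bx*block:(bx+1)*block]
                    h = np.bincount(patch.ravel(), minlength=16).astype(float)
                    hists.append(h / (h.sum() + 1e-9))
            hists = np.array(hists)
            sim = hists @ hists.T          # pairwise block similarities
            # Upper triangle (each pair once) as the feature vector.
            return sim[np.triu_indices(len(hists), k=1)]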

  • 135.
    Hatami, Sepehr
    et al.
    Swerea IVF AB, Mölndal, Sweden.
    Dahl-Jendelin, Anton
    Swerea IVF AB, Mölndal, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Nelsson, Claes
    Termisk Systemteknik AB, Linköping, Sweden.
    Selective Laser Melting Process Monitoring by Means of Thermography2018In: Proceedings of Euro Powder Metallurgy Congress (Euro PM), European Powder Metallurgy Association (EPMA) , 2018, article id 3957771Conference paper (Refereed)
    Abstract [en]

    Selective laser melting (SLM) enables production of highly intricate components. From this point of view, the capabilities of this technology are known to the industry and have been demonstrated in numerous applications. Nonetheless, for serial production purposes the manufacturing industry has so far been reluctant in substituting its conventional methods with SLM. One underlying reason is the lack of simple and reliable process monitoring methods. This study examines the feasibility of using thermography for process monitoring. To this end, an infra-red (IR) camera was mounted off-axis to monitor and record the temperature of every layer. The recorded temperature curves are analysed and interpreted with respect to different stages of the process. Furthermore, the possibility of detecting variations in laser settings by means of thermography is demonstrated. The results show that once thermal patterns are identified, this data can be utilized for in-process and post-process monitoring of SLM production.

  • 136.
    He, Linbo
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Semantic segmentation is a key approach to comprehensive image data analysis. It can be applied to analyze 2D images, videos, and even point clouds that contain 3D data points. On the first two problems, CNNs have achieved remarkable progress, but on point cloud segmentation, the results are less satisfactory due to challenges such as limited memory resources and difficulties in 3D point annotation. One of the research studies carried out by the Computer Vision Lab at Linköping University aimed to ease the semantic segmentation of 3D point clouds. The idea is that by first projecting 3D data points to 2D space and then focusing only on the analysis of 2D images, we can reduce the overall workload for the segmentation process as well as exploit the existing well-developed 2D semantic segmentation techniques. In order to improve the performance of CNNs for 2D semantic segmentation, the study has used input data derived from different modalities. However, how different modalities can be optimally fused is still an open question. Based on the above-mentioned study, this thesis aims to improve the multistream framework architecture. More concretely, we investigate how different singlestream architectures impact the multistream framework with a given fusion method, and how different fusion methods contribute to the overall performance of a given multistream framework. As a result, our proposed fusion architecture outperformed all the investigated traditional fusion methods. Along with the best singlestream candidate and a few additional training techniques, our final proposed multistream framework obtained a relative gain of 7.3% mIoU compared to the baseline on the Semantic3D point cloud test set, increasing the ranking from 12th to 5th position on the benchmark leaderboard.

  • 137.
    Hedborg, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Motion and Structure Estimation From Video2012Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Digital camera equipped cell phones were introduced in Japan in 2001; they quickly became popular and by 2003 outsold the entire stand-alone digital camera market. In 2010 sales passed one billion units and the market is still growing. Another trend is the rising popularity of smartphones, which has led to a rapid development of the processing power on a phone, and many units sold today bear close resemblance to a personal computer. The combination of a powerful processor and a camera which is easily carried in your pocket opens up a large field of interesting computer vision applications.

    The core contribution of this thesis is the development of methods that allow an imaging device such as the cell phone camera to estimate its own motion and to capture the observed scene structure. One of the main focuses of this thesis is real-time performance, where a real-time constraint does not only result in shorter processing times, but also allows for user interaction.

    In computer vision, structure from motion refers to the process of estimating camera motion and 3D structure by exploring the motion in the image plane caused by the moving camera. This thesis presents several methods for estimating camera motion. Given the assumption that a set of images has known camera poses associated to them, we train a system to solve the camera pose very fast for a new image. For the cases where no a priori information is available, a fast minimal case solver is developed. The solver uses five points in two camera views to estimate the cameras' relative position and orientation. This type of minimal case solver is usually used within a RANSAC framework. In order to increase accuracy and performance, a refinement to the random sampling strategy of RANSAC is proposed. It is shown that the new scheme doubles the performance for the five point solver used on video data. For larger systems of cameras, a new Bundle Adjustment method is developed which is able to handle video from cell phones.

    Demands for reduction in size, power consumption and price have led to a redesign of the image sensor. As a consequence the sensors have changed from a global shutter to a rolling shutter, where a rolling shutter image is acquired row by row. Classical structure from motion methods are modeled on the assumption of a global shutter, and a rolling shutter can severely degrade their performance. One of the main contributions of this thesis is a new Bundle Adjustment method for cameras with a rolling shutter. The method accurately models the camera motion during image exposure with an interpolation scheme for both position and orientation.

    The developed methods are not restricted to cell phones only, but are rather applicable to any type of mobile platform that is equipped with cameras, such as an autonomous car or a robot. The domestic robot comes in many flavors, everything from vacuum cleaners to service and pet robots. A robot equipped with a camera that is capable of estimating its own motion while sensing its environment, like the human eye, can provide an effective means of navigation for the robot. Many of the presented methods are well suited for robots, where low latency and real-time constraints are crucial in order to allow them to interact with their environment.

    List of papers
    1. Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization
    Open this publication in new window or tab >>Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization
    2007 (English)In: Journal of Real-Time Image Processing, ISSN 1861-8200, E-ISSN 1861-8219, Vol. 2, no 2-3, p. 103-115Article in journal (Refereed) Published
    Abstract [en]

    In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases in video real-time. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted in each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between the frame-by-frame tracking and the pose recognition.
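
    The channel idea can be illustrated with the classical cos² basis: each value spreads its weight over a few overlapping channels instead of one hard histogram bin (P-channels additionally store local linear moments, which this sketch omits):

        import numpy as np

        def channel_encode(values, n_channels=8, lo=0.0, hi=1.0):
            # Overlapping cos^2 kernels, spaced so that three channels are
            # active per value; away from the borders each value contributes
            # a constant total weight of 1.5.
            centers = np.linspace(lo, hi, n_channels)
            spacing = (hi - lo) / (n_channels - 1)
            d = (values[:, None] - centers[None, :]) / spacing
            return np.where(np.abs(d) < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)

        enc = channel_encode(np.array([0.2, 0.8]))
        print(enc.sum(axis=1))  # ~1.5 per sample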

    Keywords
    computer vision
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-39505 (URN)10.1007/s11554-007-0044-y (DOI)49062 (Local ID)49062 (Archive number)49062 (OAI)
    Note
    Original Publication: Michael Felsberg and Johan Hedborg, Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization, 2007, Journal of real-time image processing, (2), 2-3, 103-115. http://dx.doi.org/10.1007/s11554-007-0044-y Copyright: Springer Science Business Media. Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2017-12-13Bibliographically approved
    2. KLT Tracking Implementation on the GPU
    Open this publication in new window or tab >>KLT Tracking Implementation on the GPU
    2007 (English)In: Proceedings SSBA 2007 / [ed] Magnus Borga, Anders Brun and Michael Felsberg;, 2007Conference paper, Oral presentation only (Other academic)
    Abstract [en]

    The GPU is the main processing unit on a graphics card. A modern GPU typically provides more than ten times the computational power of an ordinary PC processor. This is a result of the high demands for speed and image quality in computer games. This paper investigates the possibility of exploiting this computational power for tracking points in image sequences. Tracking points is used in many computer vision tasks, such as tracking moving objects, structure from motion, face tracking etc. The algorithm was successfully implemented on the GPU and a large speed up was achieved.

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-21602 (URN)
    Conference
    SSBA, Swedish Symposium in Image Analysis 2007, 14-15 March, Linköping, Sweden
    Available from: 2009-10-05 Created: 2009-10-05 Last updated: 2016-05-04
    3. Fast and Accurate Structure and Motion Estimation
    Open this publication in new window or tab >>Fast and Accurate Structure and Motion Estimation
    2009 (English)In: International Symposium on Visual Computing / [ed] George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Yoshinori Kuno, Junxian Wang, Jun-Xuan Wang, Junxian Wang, Renato Pajarola and Peter Lindstrom et al., Berlin Heidelberg: Springer-Verlag , 2009, p. 211-222Conference paper, Oral presentation only (Refereed)
    Abstract [en]

    This paper describes a system for structure-and-motion estimation for real-time navigation and obstacle avoidance. We demonstrate a technique to increase the efficiency of the 5-point solution to the relative pose problem. This is achieved by a novel sampling scheme, where we add a distance constraint on the sampled points inside the RANSAC loop, before calculating the 5-point solution. Our setup uses the KLT tracker to establish point correspondences across time in live video. We also demonstrate how an early outlier rejection in the tracker improves performance in scenes with plenty of occlusions. This outlier rejection scheme is well suited to implementation on graphics hardware. We evaluate the proposed algorithms using real camera sequences with fine-tuned bundle adjusted data as ground truth. To strengthen our results we also evaluate using sequences generated by a state-of-the-art rendering software. On average we are able to reduce the number of RANSAC iterations by half and thereby double the speed.
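
    The distance constraint amounts to rejecting minimal samples whose image points cluster together before the 5-point solver is run; a sketch, with an assumed pixel threshold:

        import numpy as np

        def sample_with_min_distance(points, k=5, min_dist=30.0, tries=100, rng=None):
            # Draw k correspondences whose image locations are mutually at
            # least min_dist pixels apart; clustered samples give poorly
            # conditioned 5-point problems.
            if rng is None:
                rng = np.random.default_rng()
            for _ in range(tries):
                idx = rng.choice(len(points), size=k, replace=False)
                p = points[idx]
                d = np.linalg.norm(p[:, None] - p[None, :], axis=-1)
                if d[np.triu_indices(k, 1)].min() >= min_dist:
                    return idx
            return idx  # fall back to the last draw if none qualified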

    Place, publisher, year, edition, pages
    Berlin Heidelberg: Springer-Verlag, 2009
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743 ; Volume 5875
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-50624 (URN)10.1007/978-3-642-10331-5_20 (DOI)000278937300020 ()
    Conference
    5th International Symposium, ISVC 2009, November 30 - December 2, Las Vegas, NV, USA
    Projects
    DIPLECS
    Available from: 2009-10-13 Created: 2009-10-13 Last updated: 2016-05-04Bibliographically approved
    4. Fast Iterative Five point Relative Pose Estimation
    Open this publication in new window or tab >>Fast Iterative Five point Relative Pose Estimation
    (English)Manuscript (preprint) (Other academic)
    Abstract [en]

    Robust estimation of the relative pose between two cameras is a fundamental part of Structure and Motion methods. For calibrated cameras, the five point method together with a robust estimator such as RANSAC gives the best result in most cases. The current state-of-the-art method for solving the relative pose problem from five points is due to Nistér [1], because it is faster than other methods and in the RANSAC scheme one can improve precision by increasing the number of iterations.

    In this paper, we propose a new iterative method, which is based on Powell's Dog Leg algorithm. The new method has the same precision and is approximately twice as fast as Nistér's algorithm. The proposed algorithm is systematically evaluated on two types of datasets with known ground truth.
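
    For reference, one Powell dog-leg step on a generic least-squares problem: blend the Gauss-Newton step with the steepest-descent (Cauchy) step inside a trust region. This is the textbook scheme the paper builds on, not the paper's specialized five-point version:

        import numpy as np

        def dogleg_step(J, r, radius):
            # One dog-leg step for residual vector r with Jacobian J,
            # approximately minimizing 0.5 * ||r + J @ step||^2.
            g = J.T @ r                                  # gradient of 0.5*||r||^2
            gn = -np.linalg.lstsq(J, r, rcond=None)[0]   # Gauss-Newton step
            if np.linalg.norm(gn) <= radius:
                return gn
            sd = -(g @ g) / (g @ (J.T @ (J @ g))) * g    # Cauchy point step
            if np.linalg.norm(sd) >= radius:
                return radius * sd / np.linalg.norm(sd)
            # Otherwise walk from the Cauchy point toward the GN step until
            # the trust-region boundary is hit.
            diff = gn - sd
            a, b, c = diff @ diff, 2.0 * (sd @ diff), sd @ sd - radius ** 2
            t = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
            return sd + t * diff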

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-76902 (URN)
    Available from: 2012-04-24 Created: 2012-04-24 Last updated: 2016-05-04Bibliographically approved
    5. Structure and Motion Estimation from Rolling Shutter Video
    Open this publication in new window or tab >>Structure and Motion Estimation from Rolling Shutter Video
    2011 (English)In: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011, IEEE Xplore, 2011, p. 17-23Conference paper, Published paper (Refereed)
    Abstract [en]

    The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion.
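
    The continuity assumption boils down to giving every image row its own camera pose by interpolating between poses at the frame boundaries; a sketch using linear interpolation for position and quaternion slerp for orientation (the paper's representation choices may differ):

        import numpy as np

        def row_pose(pose0, pose1, row, n_rows):
            # pose0/pose1: (unit quaternion, translation) at the start of
            # this frame and of the next; each row is exposed at a different
            # time, so it gets its own interpolated pose.
            (q0, p0), (q1, p1) = pose0, pose1
            a = row / float(n_rows)              # exposure fraction of the row
            p = (1.0 - a) * p0 + a * p1          # linear in position
            d = float(np.dot(q0, q1))
            if d < 0.0:                          # take the shorter arc
                q1, d = -q1, -d
            th = np.arccos(np.clip(d, -1.0, 1.0))
            if th < 1e-8:                        # nearly identical rotations
                q = (1.0 - a) * q0 + a * q1
            else:                                # slerp
                q = (np.sin((1.0 - a) * th) * q0 + np.sin(a * th) * q1) / np.sin(th)
            return q / np.linalg.norm(q), p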

    Place, publisher, year, edition, pages
    IEEE Xplore, 2011
    Keywords
    Structure and Motion, Rolling Shutter, Bundle Adjustment
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-75258 (URN)10.1109/ICCVW.2011.6130217 (DOI)978-1-4673-0062-9 (ISBN)
    Conference
    2nd IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011, 6-13 November, Barcelona, Spain
    Available from: 2012-03-01 Created: 2012-02-23 Last updated: 2018-01-12Bibliographically approved
    6. Rolling Shutter Bundle Adjustment
    Open this publication in new window or tab >>Rolling Shutter Bundle Adjustment
    2012 (English)Conference paper, Published paper (Refereed)
    Abstract [en]

    This paper introduces a bundle adjustment (BA) method that obtains accurate structure and motion from rolling shutter (RS) video sequences: RSBA. When a classical BA algorithm processes a rolling shutter video, the resultant camera trajectory is brittle, and complete failures are not uncommon. We exploit the temporal continuity of the camera motion to define residuals of image point trajectories with respect to the camera trajectory. We compare the camera trajectories from RSBA to those from classical BA, and from classical BA on rectified videos. The comparisons are done on real video sequences from an iPhone 4, with ground truth obtained from a global shutter camera, rigidly mounted to the iPhone 4. Compared to classical BA, the rolling shutter model requires just six extra parameters. It also degrades the sparsity of the system Jacobian slightly, but as we demonstrate, the increase in computation time is moderate. Decisive advantages are that RSBA succeeds in cases where competing methods diverge, and consistently produces more accurate results.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2012
    Series
    Computer Vision and Pattern Recognition, ISSN 1063-6919
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-76903 (URN)10.1109/CVPR.2012.6247831 (DOI)000309166201074 ()978-1-4673-1227-1 (ISBN)
    Conference
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
    Projects
    VPS
    Available from: 2012-04-24 Created: 2012-04-24 Last updated: 2017-06-01Bibliographically approved
  • 138.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Fast Iterative Five point Relative Pose Estimation2013Conference paper (Refereed)
    Abstract [en]

    Robust estimation of the relative pose between two cameras is a fundamental part of Structure and Motion methods. For calibrated cameras, the five point method together with a robust estimator such as RANSAC gives the best result in most cases. The current state-of-the-art method for solving the relative pose problem from five points is due to Nistér [9], because it is faster than other methods and in the RANSAC scheme one can improve precision by increasing the number of iterations. In this paper, we propose a new iterative method, which is based on Powell's Dog Leg algorithm. The new method has the same precision and is approximately twice as fast as Nistér's algorithm. The proposed method is easily extended to more than five points while retaining an efficient error metric. This also makes it very suitable as a refinement step. The proposed algorithm is systematically evaluated on three types of datasets with known ground truth.

  • 139.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Fast Iterative Five point Relative Pose EstimationManuscript (preprint) (Other academic)
    Abstract [en]

    Robust estimation of the relative pose between two cameras is a fundamental part of Structure and Motion methods. For calibrated cameras, the five point method together with a robust estimator such as RANSAC gives the best result in most cases. The current state-of-the-art method for solving the relative pose problem from five points is due to Nistér [1], because it is faster than other methods and in the RANSAC scheme one can improve precision by increasing the number of iterations.

    In this paper, we propose a new iterative method, which is based on Powell's Dog Leg algorithm. The new method has the same precision and is approximately twice as fast as Nistér's algorithm. The proposed algorithm is systematically evaluated on two types of datasets with known ground truth.

  • 140.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Rolling Shutter Bundle Adjustment2012Conference paper (Refereed)
    Abstract [en]

    This paper introduces a bundle adjustment (BA) method that obtains accurate structure and motion from rolling shutter (RS) video sequences: RSBA. When a classical BA algorithm processes a rolling shutter video, the resultant camera trajectory is brittle, and complete failures are not uncommon. We exploit the temporal continuity of the camera motion to define residuals of image point trajectories with respect to the camera trajectory. We compare the camera trajectories from RSBA to those from classical BA, and from classical BA on rectified videos. The comparisons are done on real video sequences from an iPhone 4, with ground truth obtained from a global shutter camera, rigidly mounted to the iPhone 4. Compared to classical BA, the rolling shutter model requires just six extra parameters. It also degrades the sparsity of the system Jacobian slightly, but as we demonstrate, the increase in computation time is moderate. Decisive advantages are that RSBA succeeds in cases where competing methods diverge, and consistently produces more accurate results.

  • 141.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Structure and Motion Estimation from Rolling Shutter Video2011In: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011, IEEE Xplore, 2011, p. 17-23Conference paper (Refereed)
    Abstract [en]

    The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion.

  • 142.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Robinson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Robust Three-View Triangulation Done Fast2014In: Proceedings: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014, IEEE , 2014, p. 152-157Conference paper (Refereed)
    Abstract [en]

    Estimating the position of a 3-dimensional world point given its 2-dimensional projections in a set of images is a key component in numerous computer vision systems. There are several methods dealing with this problem, ranging from sub-optimal, linear least square triangulation in two views, to finding the world point that minimizes the L2-reprojection error in three views. The latter leads to the statistically optimal estimate under the assumption of Gaussian noise. In this paper we present a solution to the optimal triangulation in three views. The standard approach for solving the three-view triangulation problem is to find a closed-form solution. In contrast to this, we propose a new method based on an iterative scheme. The method is rigorously tested on both synthetic and real image data with corresponding ground truth, on a midrange desktop PC and a Raspberry Pi, a low-end mobile platform. We are able to improve the precision achieved by the closed-form solvers and reach a speed-up of two orders of magnitude compared to the current state-of-the-art solver. In numbers, this amounts to around 300K triangulations per second on the PC and 30K triangulations per second on the Raspberry Pi.
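
    The flavor of such an iterative scheme can be shown with plain Gauss-Newton on the reprojection error of one 3D point seen in three (or more) views; the paper's actual solver and its speed tricks are not reproduced here:

        import numpy as np

        def triangulate_gn(X0, cams, obs, iters=5):
            # cams: list of 3x4 projection matrices; obs: matching 2D points.
            X = np.asarray(X0, dtype=float)
            for _ in range(iters):
                res, jac = [], []
                for P, u in zip(cams, obs):
                    h = P @ np.append(X, 1.0)
                    proj = h[:2] / h[2]
                    res.append(proj - u)
                    # 2x3 Jacobian of the projection wrt X (quotient rule).
                    jac.append((P[:2, :3] - np.outer(proj, P[2, :3])) / h[2])
                r, J = np.concatenate(res), np.vstack(jac)
                X = X - np.linalg.lstsq(J, r, rcond=None)[0]
            return X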

  • 143.
    Heggenes, Jan
    et al.
    Department of Environmental and Health Sciences, University College of Southeast Norway, Bø i Telemark, Norway.
    Odland, Arvid
    Department of Environmental and Health Sciences, University College of Southeast Norway, Bø i Telemark, Norway.
    Chevalier, Tomas
    Scienvisic AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bjerketvedt, Dag
    Department of Environmental and Health Sciences, University College of Southeast Norway, Bø i Telemark, Norway.
    Herbivore grazing—or trampling? Trampling effects by a large ungulate in cold high-latitude ecosystems2017In: Ecology and Evolution, ISSN 2045-7758, Vol. 7, no 16, p. 6423-6431Article in journal (Refereed)
    Abstract [en]

    Mammalian herbivores have important top-down effects on ecological processes and landscapes by generating vegetation changes through grazing and trampling. For free-ranging herbivores on large landscapes, trampling is an important ecological factor. However, whereas grazing is widely studied, low-intensity trampling is rarely studied and quantified. The cold-adapted northern tundra reindeer (Rangifer tarandus) is a wide-ranging keystone herbivore in large open alpine and Arctic ecosystems. Reindeer may largely subsist on different species of slow-growing ground lichens, particularly in winter. Lichen grows in dry, snow-poor habitats with frost. Their varying elasticity makes them suitable for studying trampling. In replicated factorial experiments, high-resolution 3D laser scanning was used to quantify lichen volume loss from trampling by a reindeer hoof. Losses were substantial, that is, about 0.3 dm3 per imprint in dry thick lichen, but depended on type of lichen mat and humidity. Immediate trampling volume loss was about twice as high in dry compared to humid thin (2–3 cm) lichen mats, and about three times as high in dry vs. humid thick (6–8 cm) lichen mats. There was no significant difference in volume loss between 100% and 50% wetted lichen. Regained volume with time was insignificant for dry lichen, whereas 50% humid lichen regained substantial volumes, and 100% humid lichen regained almost all lost volume, mostly within 10–20 min. Reindeer trampling may have from near none to devastating effects on exposed lichen forage. During a normal week of foraging, daily moving 5 km across dry 6- to 8-cm-thick continuous lichen mats, one adult reindeer may trample a lichen volume corresponding to about a year's supply of lichen. However, the lichen humidity appears to be an important factor for trampling loss, in addition to the extent of reindeer movement.

  • 144.
    Heinemann, Christian
    et al.
    Forschungszentrum Jülich, Germany.
    Åström, Freddie
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Baravdish, George
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Krajsek, Kai
    Forschungszentrum Jülich, Germany.
    Felsberg, Michael
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Scharr, Hanno
    Forschungszentrum Jülich, Germany.
    Using Channel Representations in Regularization Terms: A Case Study on Image Diffusion2014In: Proceedings of the 9th International Conference on Computer Vision Theory and Applications, SciTePress, 2014, Vol. 1, p. 48-55Conference paper (Refereed)
    Abstract [en]

    In this work we propose a novel non-linear diffusion filtering approach for images based on their channel representation. To derive the diffusion update scheme we formulate a novel energy functional using a soft-histogram representation of image pixel neighborhoods obtained from the channel encoding. The resulting Euler-Lagrange equation yields a non-linear robust diffusion scheme with additional weighting terms stemming from the channel representation, which steer the diffusion process. We apply this novel energy formulation to image reconstruction problems, showing good performance in the presence of mixtures of Gaussian and impulse-like noise, e.g. missing data. In denoising experiments on common scalar-valued images our approach performs competitively compared to other diffusion schemes as well as state-of-the-art denoising methods for the considered noise types.
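
    For readers unfamiliar with channel representations, the following sketch encodes scalar values into a soft histogram using the commonly used cos² basis; the channel count, value range and kernel support are illustrative assumptions, not the paper's exact configuration.

    ```python
    import numpy as np

    def channel_encode(values, n_channels=8, vmin=0.0, vmax=1.0):
        """Encode each scalar into n_channels soft-histogram coefficients."""
        centers = np.linspace(vmin, vmax, n_channels)    # channel centers
        width = (vmax - vmin) / (n_channels - 1)         # channel spacing
        d = np.abs(values[..., None] - centers) / width  # normalized distance
        # cos^2 kernel with a support of three channel widths (a standard choice)
        return np.where(d < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)

    # Averaging the channel vectors of a pixel neighborhood gives a soft
    # histogram that is robust to outliers such as impulse noise.
    patch = np.array([0.50, 0.52, 0.49, 0.51, 0.95])  # last value: outlier
    hist = channel_encode(patch).mean(axis=0)
    ```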

  • 145.
    Heyden, Anders
    et al.
    Centre for Mathematical Sciences, Faculty of Engineering, LTH, Lund University.
    Laurendeau, DenisDépartement de génie électrique et de génie informatique, Université Laval, Québec, Canada.Felsberg, MichaelLinköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.Borga, MagnusLinköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Proceedings. 22nd International Conferenceon Pattern Recognition ICPR 2014, 24-28 August 2014, Stockholm, Sweden2014Conference proceedings (editor) (Refereed)
    Abstract [en]

    On behalf of the Organizing Committee, it is my honor and privilege to present the scientific program of the 22nd International Conference on Pattern Recognition. ICPR 2014 is hosted by the Swedish Society for Automated Image Analysis (SSBA) and supported by the universities of Linköping, Lund and Uppsala.

    ICPR 2014 has five scientific tracks: Computer Vision; Pattern Recognition and Machine Learning; Image, Speech, Signal and Video Processing; Document Analysis, Biometrics and Pattern Recognition Applications; and Biomedical Image Analysis. Each track has an Invited Speaker who will share their deep knowledge and experience with us. Perhaps the most apparent novelty of this ICPR is the change from four-page to six-page papers, which amounts to significantly more than a 50% increase in actual content once the title, abstract and reference list are disregarded. Our hope and belief is that this has made it easier for reviewers to give well-justified evaluations of the manuscripts, improved the readability of the final papers and, as a consequence, raised the general quality of the accepted papers.

    The organization of ICPR 2014 would not have been possible without the generous contributions of our major partners: The City of Stockholm, SSBA, eSSENCE and SeRC. The financial contributions of our other partners and exhibitors, the technical co-sponsorship of the IEEE Computer Society, and the support and advice from IAPR and the ICPR Liaison Committee are also gratefully acknowledged. I also want to express my sincere gratitude to the Program and Publication Chairs, the Track Chairs, the Area Chairs and all reviewers for their great efforts in putting this scientific program together. And, perhaps most of all, I want to thank all the contributing authors, who filled it with content of the highest scientific quality. Finally, I would like to express my gratitude to all attendees. Without your presence, there simply would not be any conference.

  • 146.
    Hillgren, Patrik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Geometric Scene Labeling for Long-Range Obstacle Detection2015Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Autonomous driving, or self-driving vehicles, refers to vehicles that know their environment and perform driving manoeuvres without instructions from a driver. The concept has been around for decades but has improved significantly in recent years as research in this area has made substantial progress. Benefits of autonomous driving include the possibility of decreasing the number of traffic accidents and thereby saving lives.

    A major challenge in autonomous driving is to acquire 3D information about, and relations between, all objects in the surrounding traffic. This is referred to as spatial perception. Stereo camera systems have become a central sensor module for advanced driver assistance systems and autonomous driving. For object detection and measurement at large distances, however, stereo vision encounters difficulties: objects are small, have low contrast, and are affected by image noise. An accurate perception of the environment at large distances is nevertheless of high interest for many applications, especially autonomous driving.

    This thesis proposes a method that tries to increase the range at which generic objects are first detected using a given stereo camera setup. Objects are represented by planes in 3D space. The input image is segmented into the various objects and the 3D plane parameters are estimated jointly, directly from the stereo image pairs. In particular, this thesis investigates methods to introduce geometric constraints into the segmentation or labeling task, i.e. assigning each considered pixel in the image to a plane, as sketched below.
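
    A minimal sketch of the planar-labeling idea: in a rectified stereo pair, a 3D plane induces a disparity that is affine in the pixel coordinates, d(u, v) = a·u + b·v + c, so labeling can be posed as assigning each pixel to the plane whose predicted disparity best matches the measured one. The plane parameters and disparity map are hypothetical inputs, and the thesis adds geometric constraints beyond this simple nearest-plane assignment.

    ```python
    import numpy as np

    def label_pixels(disparity, planes):
        """disparity: HxW measured disparities; planes: list of (a, b, c)."""
        H, W = disparity.shape
        v, u = np.mgrid[0:H, 0:W]            # pixel coordinate grids
        # Residual between each plane's predicted and the measured disparity
        residuals = np.stack(
            [np.abs(a * u + b * v + c - disparity) for (a, b, c) in planes]
        )
        return residuals.argmin(axis=0)      # per-pixel plane label
    ```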

    The methods presented in this thesis show that, despite the difficulties at large distances, it is possible to exploit planar primitives in 3D space for obstacle detection at distances where other methods fail.

  • 147.
    Holm Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Emilsson, Erika
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Missile approach warning using multi-spectral imagery2010Independent thesis Advanced level (professional degree), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Man-portable air defence systems (MANPADS) pose a serious threat to civilian and military aircraft. This thesis aims to find methods that could be used in a missile approach warning system based on infrared cameras.

    The two main tasks of the completed system are to classify the type of missile, and also to estimate its position and velocity from a sequence of images.

    The classification is based on hidden Markov models, one-class classifiers, and multi-class classifiers.

    Position and velocity estimation uses a model of the observed intensity as a function of the real intensity, image coordinates, distance and missile orientation. The estimation is made by an extended Kalman filter, sketched below.
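
    A minimal sketch of one extended Kalman filter step, assuming a constant-velocity state (position and velocity) and a generic nonlinear measurement function; the thesis' actual intensity-based measurement model is more involved, and h, H_jac and the noise covariances here are placeholders, not the authors' values.

    ```python
    import numpy as np

    def ekf_step(x, P, z, F, Q, R, h, H_jac):
        """One EKF predict/update: state x, covariance P, measurement z."""
        # Predict with the (linear) constant-velocity motion model F
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # Update with the nonlinear measurement, linearized at the prediction
        H = H_jac(x_pred)
        y = z - h(x_pred)                        # innovation
        S = H @ P_pred @ H.T + R                 # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new
    ```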

    We show that fast classification of missiles based on radiometric data and a hidden Markov model is possible and works well, although more data would be needed to verify the results.

    Estimating the position and velocity works fairly well if the initial parameters are known. Unfortunately, some of these parameters cannot be computed from the available sensor data.

  • 148.
    Holmquist, Karl
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    SLAMIt A Sub-Map Based SLAM System: On-line creation of multi-leveled map2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In many situations after a major catastrophe, such as the one in Fukushima, the disaster area is highly dangerous for humans to enter. It is in such environments that a semi-autonomous robot could limit the risks to humans by exploring and mapping the area on its own. This thesis sets out to design and implement a software-based SLAM system with the potential to run in real time, using a Kinect 2 sensor as input.

    The focus of the thesis has been to create a system that allows efficient storage and representation of the map, in order to be able to explore large environments. This is done by separating the map into different abstraction levels, corresponding to local maps connected by a global map, as sketched below.
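
    A minimal sketch of such a two-level map, assuming local sub-maps that store geometry in their own coordinate frames while the global map keeps only the sub-map poses; class and field names are illustrative, not the thesis' actual implementation.

    ```python
    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class SubMap:
        pose: np.ndarray                  # 4x4 pose of sub-map in global frame
        points: list = field(default_factory=list)  # local geometry, e.g. points

    @dataclass
    class GlobalMap:
        submaps: list = field(default_factory=list)

        def add_submap(self, pose):
            sm = SubMap(pose=pose)
            self.submaps.append(sm)
            return sm

        def point_in_global(self, submap, p_local):
            """Lift a local 3D point into the global frame."""
            ph = np.append(p_local, 1.0)
            return (submap.pose @ ph)[:3]
    ```

    Keeping dense data local means loop-closure corrections only need to update the handful of sub-map poses, not every stored point.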

    During the implementation, this structure has been kept in mind in order to allow modularity. This makes it possible for each sub-component in the system to be exchanged if needed.

    The thesis is broad in the sense that it uses techniques from distinct areas to solve the sub-problems that arise, for example object detection and classification, point-cloud registration, and efficient 3D occupancy trees.

  • 149.
    Holmquist, Karl
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Senel, Deniz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Computing a Collision-Free Path using the monogenic scale space2018In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2018, p. 8097-8102Conference paper (Refereed)
    Abstract [en]

    Mobile robots are used for a variety of purposes and with different functionalities, which requires them to move freely in environments containing both static and dynamic obstacles in order to accomplish their tasks. One of the most relevant capabilities for navigating a mobile robot in such an environment is finding a safe path to a goal position. This paper shows that there exists an accurate solution to the Laplace equation which allows finding a collision-free path, and that it can be computed efficiently for a rectangular bounded domain such as a map represented as an image. This is accomplished by the use of the monogenic scale space, resulting in a vector field which describes the attracting and repelling forces from the obstacles and the goal. The method is shown to work in reasonably convex domains and, by tessellating the environment map, in non-convex environments.
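
    For intuition, the following sketch solves the Laplace equation on an occupancy grid with the classic harmonic potential-field boundary conditions (goal clamped to low potential, obstacles and borders to high) using slow Jacobi iteration; the paper itself obtains the field via the monogenic scale space, which is precisely what makes the computation efficient, so this only illustrates the kind of field being computed.

    ```python
    import numpy as np

    def harmonic_field(occupancy, goal, iters=5000):
        """occupancy: HxW bool obstacle map; goal: (row, col) index tuple."""
        u = np.ones_like(occupancy, dtype=float)   # start at high potential
        for _ in range(iters):
            # Jacobi update: each cell becomes the mean of its 4 neighbours
            avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                          np.roll(u, 1, 1) + np.roll(u, -1, 1))
            u = np.where(occupancy, 1.0, avg)      # obstacles clamped high
            u[goal] = 0.0                          # goal clamped low
            u[0, :] = u[-1, :] = u[:, 0] = u[:, -1] = 1.0   # domain border
        return u   # follow the negative gradient of u from start to goal
    ```

    Because a harmonic function has no interior local minima, gradient descent on this field cannot get stuck before reaching the goal, which is what makes Laplace-based fields attractive for path planning.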

  • 150.
    Hultberg, Johanna
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Dehazing of Satellite Images2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The aim of this work is to find a method for removing haze from satellite imagery. This is done by taking two algorithms developed for images taken from the surface of the earth and adapting them for satellite images: Single Image Haze Removal Using Dark Channel Prior by He et al. and Color Image Dehazing Using the Near-Infrared by Schaul et al. Both algorithms, altered to fit satellite images, as well as their combination, are applied to four sets of satellite images. The results are compared with each other and with the unaltered images. The evaluation is both qualitative, i.e. looking at the images, and quantitative, using three properties: colorfulness, contrast and saturated pixels. Both the qualitative and the quantitative evaluation determined that using only the altered version of the Dark Channel Prior gives the result with the least amount of haze and whose colors look most like reality.
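
    As a reference for the first of the two methods, a minimal sketch of the dark channel and transmission steps from He et al.'s dark channel prior; patch size and omega are the commonly cited defaults, and the atmospheric light estimation as well as the thesis' satellite-specific alterations are omitted.

    ```python
    import numpy as np
    from scipy.ndimage import minimum_filter

    def dark_channel(img, patch=15):
        """img: HxWx3 RGB in [0, 1]; returns the HxW dark channel."""
        # Per-pixel minimum over color channels, then a local patch minimum
        return minimum_filter(img.min(axis=2), size=patch)

    def transmission(img, atmosphere, omega=0.95, patch=15):
        """Estimate transmission t(x) = 1 - omega * dark_channel(img / A)."""
        return 1.0 - omega * dark_channel(img / atmosphere, patch)
    ```

    The prior rests on the observation that haze-free outdoor patches almost always contain some pixel that is dark in at least one color channel, so any residual brightness in the dark channel is attributed to haze.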
