Search for publications in DiVA (liu.se) — results 101-150 of 467
  • 101.
    Estgren, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Bone Fragment Segmentation Using Deep Interactive Object Selection (2019). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    In recent years semantic segmentation models utilizing Convolutional Neural Networks (CNN) have seen significant success for multiple different segmentation problems. Models such as U-Net have produced promising results within the medical field for both regular 2D and volumetric imaging, rivalling some of the best classical segmentation methods.

    In this thesis we examined the possibility of using a convolutional neural network-based model to perform segmentation of discrete bone fragments in CT-volumes with segmentation-hints provided by a user. We additionally examined different classical segmentation methods used in a post-processing refinement stage and their effect on the segmentation quality. We compared the performance of our model to similar approaches and provided insight into how the interactive aspect of the model affected the quality of the result.

    We found that the combined approach of interactive segmentation and deep learning produced results on par with some of the best methods presented, provided there was an adequate amount of annotated training data. We additionally found that the number of segmentation hints provided to the model by the user significantly affected the quality of the result, with the result converging at around 8 provided hints.
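
    A minimal sketch (not the thesis code) of the hint-encoding idea behind deep interactive object selection: user clicks are converted into distance maps that are stacked with the image as extra input channels for a segmentation CNN such as U-Net. The array shapes, click positions and clipping value below are illustrative assumptions.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def clicks_to_distance_map(shape, clicks, clip=255.0):
        """Euclidean distance from every pixel/voxel to the nearest user click."""
        mask = np.ones(shape, dtype=bool)
        for idx in clicks:              # clicks: list of (row, col) or (z, y, x)
            mask[idx] = False           # zeros mark the click positions
        dist = distance_transform_edt(mask)
        return np.minimum(dist, clip).astype(np.float32)

    # Toy 2D example: one CT slice plus foreground/background hint channels.
    ct_slice = np.random.rand(128, 128).astype(np.float32)   # stand-in for real data
    fg_clicks = [(40, 60), (40, 70)]     # user marks the fragment of interest
    bg_clicks = [(100, 20)]              # user marks surrounding tissue

    fg_map = clicks_to_distance_map(ct_slice.shape, fg_clicks)
    bg_map = clicks_to_distance_map(ct_slice.shape, bg_clicks)

    # 3-channel input (image, fg hints, bg hints) for a segmentation network.
    net_input = np.stack([ct_slice, fg_map, bg_map], axis=0)
    print(net_input.shape)   # (3, 128, 128)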

  • 102.
    Estgren, Martin
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems.
    Lightweight User Agents (2016). Independent thesis, Basic level (degree of Bachelor), 10.5 credits / 16 HE credits. Student thesis.
    Abstract [en]

    The unit for information security and IT architecture at the Swedish Defence Research Agency (FOI) conducts work with a cyber range called CRATE (Cyber Range and Training Environment). Currently, simulation of user activity involves scripts inside the simulated network. This solution is not ideal because of the traces it leaves in the system and the general lack of a standardised GUI API across different operating systems. FOI is interested in testing the use of an artificial user agent located outside the virtual environment, using computer vision and the virtualisation API to execute actions and extract information from the system.

    This paper focuses on analysing the reliability of template matching, a computer vision algorithm used to localise objects in images, using already identified images of said objects as templates. The analysis evaluates both the reliability of localising objects and the algorithm's ability to correctly identify whether an object is present in the virtual environment.

    Analysis of template matching is performed by first creating a prototype of the agent's sensory system and then simulating scenarios that the agent might encounter. By simulating the environment, testing parameters can be manipulated and monitored in a reliable way. The manipulated parameters include the amount and type of image noise in the template and the screenshot, the agent's discrimination threshold for what constitutes a positive match, and information about the template, such as template generality.

    This paper presents the performance and reliability of the agent with regard to which types of image noise affect the result, the number of correctly identified objects given different discrimination thresholds, and the computational time of template matching when different image filters are applied. Furthermore, the best case for each study is presented as a point of comparison for the other results.

    At the end of the thesis we show that, for screenshots with objects very similar to the templates used by the agent, template matching can achieve a high degree of accuracy in both object localization and object identification, and that even a small reduction in similarity between template and screenshot reduces the agent's ability to reliably identify specific objects in the environment.
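
    A minimal OpenCV-based sketch of the template-matching step analysed in the thesis: localise a GUI element in a screenshot and accept the match only if the normalised correlation exceeds a discrimination threshold. The synthetic screenshot, template and threshold value are illustrative assumptions, not the FOI agent code.

    import cv2
    import numpy as np

    def find_template(screenshot, template, threshold=0.8):
        """Return (top_left, score) if the template is judged present, else None."""
        result = cv2.matchTemplate(screenshot, template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if max_val >= threshold:        # discrimination threshold for a positive match
            return max_loc, max_val
        return None

    # Synthetic example: paste a "button" into a noisy screenshot and find it again.
    screenshot = (np.random.rand(480, 640) * 50).astype(np.uint8)
    template = np.tile(np.linspace(50, 220, 60, dtype=np.uint8), (20, 1))
    screenshot[100:120, 300:360] = template
    hit = find_template(screenshot, template, threshold=0.8)
    print(hit)   # e.g. ((300, 100), 1.0): (x, y) of the top-left corner and the score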

  • 103.
    Fanani, Nolang
    et al.
    Goethe University of Frankfurt, Germany.
    Barnada, Marc
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Motion Priors Estimation for Robust Matching Initialization in Automotive Applications (2015). In: Advances in Visual Computing: 11th International Symposium, ISVC 2015, Las Vegas, NV, USA, December 14-16, 2015, Proceedings, Part I, SPRINGER INT PUBLISHING AG, 2015, Vol. 9474, p. 115-126. Conference paper (Refereed).
    Abstract [en]

    Tracking keypoints through a video sequence is a crucial first step in the processing chain of many visual SLAM approaches. This paper presents a robust initialization method to provide the initial match for a keypoint tracker, from the first frame where a keypoint is detected to the second frame, that is, when no depth information is yet available. We deal explicitly with the case of long displacements. The starting position is obtained through an optimization that employs a distribution of motion priors based on pyramidal phase correlation, and epipolar geometry constraints. Experiments on the KITTI dataset demonstrate the significant impact of applying a motion prior to the matching. We provide detailed comparisons to the state-of-the-art methods.
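
    A minimal NumPy sketch of the phase-correlation building block behind the motion prior described above; the paper applies it pyramidally and combines it with epipolar constraints, whereas this sketch only recovers a single global integer shift between two frames.

    import numpy as np

    def phase_correlation(frame0, frame1):
        """Estimate the integer shift d such that frame0 is approximately np.roll(frame1, d)."""
        F0 = np.fft.fft2(frame0)
        F1 = np.fft.fft2(frame1)
        cross_power = F0 * np.conj(F1)
        cross_power /= np.abs(cross_power) + 1e-12     # keep phase only
        correlation = np.fft.ifft2(cross_power).real
        peak = np.unravel_index(np.argmax(correlation), correlation.shape)
        # Map the peak position to a signed shift (the FFT grid wraps around).
        shifts = [p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape)]
        return tuple(shifts)

    # Toy example: shift an image by (5, -8) pixels and recover the displacement.
    rng = np.random.default_rng(0)
    img = rng.random((128, 128))
    shifted = np.roll(img, shift=(5, -8), axis=(0, 1))
    print(phase_correlation(shifted, img))   # expected (5, -8)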

  • 104.
    Fanani, Nolang
    et al.
    Goethe University, Germany.
    Ochs, Matthias
    Goethe University, Germany.
    Bradler, Henry
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    Keypoint Trajectory Estimation Using Propagation Based Tracking (2016). In: 2016 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), IEEE, 2016, p. 933-939. Conference paper (Refereed).
    Abstract [en]

    One of the major steps in visual environment perception for automotive applications is to track keypoints and to subsequently estimate egomotion and environment structure from the trajectories of these keypoints. This paper presents a propagation based tracking method to obtain the 2D trajectories of keypoints from a sequence of images in a monocular camera setup. Instead of relying on the classical RANSAC to obtain accurate keypoint correspondences, we steer the search for keypoint matches by means of propagating the estimated 3D position of the keypoint into the next frame and verifying the photometric consistency. In this process, we continuously predict, estimate and refine the frame-to-frame relative pose which induces the epipolar relation. Experiments on the KITTI dataset as well as on the synthetic COnGRATS dataset show promising results on the estimated courses and accurate keypoint trajectories.

  • 105.
    Fanani, Nolang
    et al.
    Goethe Univ, Germany.
    Stuerck, Alina
    Goethe Univ, Germany.
    Barnada, Marc
    Goethe Univ, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe Univ, Germany.
    Multimodal Scale Estimation for Monocular Visual Odometry (2017). In: 2017 28TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV 2017), IEEE, 2017, p. 1714-1721. Conference paper (Refereed).
    Abstract [en]

    Monocular visual odometry / SLAM requires the ability to deal with the scale ambiguity problem, or equivalently to transform the estimated unscaled poses into correctly scaled poses. While propagating the scale from frame to frame is possible, it is very prone to the scale drift effect. We address the problem of monocular scale estimation by proposing a multimodal mechanism of prediction, classification, and correction. Our scale correction scheme combines cues from both dense and sparse ground plane estimation; this makes the proposed method robust towards varying availability and distribution of trackable ground structure. Instead of optimizing the parameters of the ground plane related homography, we parametrize and optimize the underlying motion parameters directly. Furthermore, we employ classifiers to detect scale outliers based on various features (e.g. moments on residuals). We test our method on the challenging KITTI dataset and show that the proposed method is capable of providing scale estimates that are on par with current state-of-the-art monocular methods without using bundle adjustment or RANSAC.

  • 106.
    Fanani, Nolang
    et al.
    Goethe University, Germany.
    Stuerck, Alina
    Goethe University, Germany.
    Ochs, Matthias
    Goethe University, Germany.
    Bradler, Henry
    Goethe University, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University, Germany.
    Predictive monocular odometry (PMO): What is possible without RANSAC and multiframe bundle adjustment? (2017). In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 68. Article in journal (Refereed).
    Abstract [en]

    Visual odometry using only a monocular camera faces more algorithmic challenges than stereo odometry. We present a robust monocular visual odometry framework for automotive applications. An extended propagation-based tracking framework is proposed which yields highly accurate (unscaled) pose estimates. Scale is supplied by ground plane pose estimation employing street pixel labeling using a convolutional neural network (CNN). The proposed framework has been extensively tested on the KITTI dataset and achieves a higher rank than current published state-of-the-art monocular methods in the KITTI odometry benchmark. Unlike other VO/SLAM methods, this result is achieved without a loop closing mechanism, without RANSAC, and without multiframe bundle adjustment. Thus, we challenge the common belief that robust systems can only be built using iterative robustification tools like RANSAC.

  • 107.
    Fanelli, Gabriele
    Linköping University, Department of Electrical Engineering.
    Facial Features Tracking using Active Appearance Models (2006). Independent thesis, Advanced level (degree of Magister), 20 points / 30 hp. Student thesis.
    Abstract [en]

    This thesis aims at building a system capable of automatically extracting and parameterizing the position of a face and its features in images acquired from a low-end monocular camera. Such a challenging task is justified by the importance and variety of its possible applications, ranging from face and expression recognition to animation of virtual characters using video depicting real actors. The implementation includes the construction of Active Appearance Models of the human face from training images. The existing face model Candide-3 is used as a starting point, making the translation of the tracking parameters to standard MPEG-4 Facial Animation Parameters easy.

    The Inverse Compositional Algorithm is employed to adapt the models to new images, working on a subspace where the appearance is "projected out" and thus focusing only on shape.

    The algorithm is tested on a generic model, aiming at tracking different people’s faces, and on a specific model, considering one person only. In the former case, the need for improvements in the robustness of the system is highlighted. By contrast, the latter case gives good results regarding both quality and speed, with real time performance being a feasible goal for future developments.

  • 108.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Autocorrelation-Driven Diffusion Filtering (2011). Data set.
    Abstract [en]

    In this paper, we present a novel scheme for anisotropic diffusion driven by the image autocorrelation function. We show the equivalence of this scheme to a special case of iterated adaptive filtering. By determining the diffusion tensor field from an autocorrelation estimate, we obtain an evolution equation that is computed from a scalar product of the diffusion tensor and the image Hessian. We further propose a set of filters to approximate the Hessian on a minimized spatial support. On standard benchmarks, the resulting method performs favorably in many cases, in particular at low noise levels. In a GPU implementation, video real-time performance is easily achieved.
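
    A minimal NumPy sketch of the kind of update described in the abstract: one explicit diffusion step driven by the scalar (Frobenius) product of a diffusion tensor and the image Hessian. For simplicity the tensor is a fixed anisotropic matrix here; in the paper it is estimated from the local autocorrelation.

    import numpy as np

    def hessian(u):
        """Finite-difference Hessian components u_xx, u_xy, u_yy of a 2D image."""
        uy, ux = np.gradient(u)
        uyy, uyx = np.gradient(uy)
        uxy, uxx = np.gradient(ux)
        return uxx, 0.5 * (uxy + uyx), uyy

    def diffusion_step(u, D, tau=0.1):
        """One step of u <- u + tau * <D, Hess(u)> with a spatially constant 2x2 tensor D."""
        uxx, uxy, uyy = hessian(u)
        return u + tau * (D[0, 0] * uxx + 2.0 * D[0, 1] * uxy + D[1, 1] * uyy)

    # Toy example: smooth a noisy image mostly along the x-direction.
    rng = np.random.default_rng(0)
    img = rng.normal(size=(64, 64))
    D = np.array([[1.0, 0.0],
                  [0.0, 0.1]])          # strong diffusion in x, weak in y
    for _ in range(20):
        img = diffusion_step(img, D, tau=0.1)
    print(img.std())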

  • 109.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Five years after the Deep Learning revolution of computer vision: State of the art methods for online image and video analysis (2017). Report (Other academic).
    Abstract [en]

    The purpose of this document is to reflect on novel and upcoming methods for computer vision that might have relevance for application in robot vision and video analytics. The document covers many different sub-fields of computer vision, most of which have been addressed by our research activity at the computer vision laboratory. The report has been written based on a request of, and supported by, FOI.

  • 110.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Kristan, Matej
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, United Kingdom.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute of Technology, Austria.
    Vojır, Tomas
    Czech Technical University, Czech Republic.
    Nebehay, Georg
    Austrian Institute of Technology, Austria.
    Pflugfelder, Roman
    Austrian Institute of Technology, Austria.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Garcia-Martin, Alvaro
    Universidad Autonoma de Madrid, Spain.
    Saffari, Amir
    Affectv, United Kingdom.
    Li, Ang
    Xi’an Jiaotong University.
    Solıs Montero, Andres
    University of Ottawa, Canada.
    Zhao, Baojun
    Beijing Institute of Technology, China.
    Schmid, Cordelia
    INRIA Grenoble Rhône-Alpes, France.
    Chen, Dapeng
    Xi’an Jiaotong University.
    Du, Dawei
    University at Albany, USA.
    Shahbaz Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Porikli, Fatih
    Australian National University, Australia.
    Zhu, Gao
    Australian National University, Australia.
    Zhu, Guibo
    NLPR, Chinese Academy of Sciences, China.
    Lu, Hanqing
    NLPR, Chinese Academy of Sciences, China.
    Kieritz, Hilke
    Fraunhofer IOSB, Germany.
    Li, Hongdong
    Australian National University, Australia.
    Qi, Honggang
    University at Albany, USA.
    Jeong, Jae-chan
    Electronics and Telecommunications Research Institute, Korea.
    Cho, Jae-il
    Electronics and Telecommunications Research Institute, Korea.
    Lee, Jae-Yeong
    Electronics and Telecommunications Research Institute, Korea.
    Zhu, Jianke
    Zhejiang University, China.
    Li, Jiatong
    University of Technology, Australia.
    Feng, Jiayi
    Institute of Automation, Chinese Academy of Sciences, China.
    Wang, Jinqiao
    NLPR, Chinese Academy of Sciences, China.
    Kim, Ji-Wan
    Electronics and Telecommunications Research Institute, Korea.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    Universidad Autónoma de Madrid, Spain.
    Xue, Kai
    INRIA Grenoble Rhône-Alpes, France.
    Alahari, Karteek
    INRIA Grenoble Rhône-Alpes, France.
    Ma, Liang
    Harbin Engineering University, China.
    Ke, Lipeng
    University at Albany, USA.
    Wen, Longyin
    University at Albany, USA.
    Bertinetto, Luca
    Oxford University, United Kingdom.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Tang, Ming
    Institute of Automation, Chinese Academy of Sciences, China.
    Chang, Ming-Ching
    University at Albany, USA.
    Miksik, Ondrej
    Oxford University, United Kingdom.
    Torr, Philip H S
    Oxford University, United Kingdom.
    Martin-Nieto, Rafael
    Universidad Autónoma de Madrid, Spain.
    Laganiere, Robert
    University of Ottawa, Canada.
    Hare, Sam
    Obvious Engineering, United Kingdom.
    Lyu, Siwei
    University at Albany, USA.
    Zhu, Song-Chun
    University of California, USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Hicks, Stephen L
    Oxford University, United Kingdom.
    Golodetz, Stuart
    Oxford University, United Kingdom.
    Choi, Sunglok
    Electronics and Telecommunications Research Institute, Korea.
    Wu, Tianfu
    University of California, USA.
    Hubner, Wolfgang
    Fraunhofer IOSB, Germany.
    Zhao, Xu
    Institute of Automation, Chinese Academy of Sciences, China.
    Hua, Yang
    INRIA Grenoble Rhône-Alpes, France.
    Li, Yang
    Zhejiang University, China.
    Lu, Yang
    University of California, USA.
    Li, Yuezun
    University at Albany, USA.
    Yuan, Zejian
    Xi’an Jiaotong University.
    Hong, Zhibin
    University of Technology, Australia.
    The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results (2015). In: Proceedings of the IEEE International Conference on Computer Vision, Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 639-651. Conference paper (Refereed).
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.

  • 111.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Heyden, Anders
    Lund University, Lund, Sweden.
    Krüger, Norbert
    University of Southern Denmark, Odense, Denmark.
    Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I (2017). Conference proceedings (editor) (Refereed).
    Abstract [en]

    The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017.

    The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.

  • 112.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Heyden, Anders
    Lund University, Lund, Sweden.
    Krüger, Norbert
    University of Southern Denmark, Odense, Denmark.
    Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II (2017). Conference proceedings (editor) (Refereed).
    Abstract [en]

    The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017. The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.

  • 113.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Kristan, Matej
    University of Ljubljana, Slovenia.
    Matas, Jiri
    Czech Technical University, Czech Republic.
    Leonardis, Ales
    University of Birmingham, England.
    Pflugfelder, Roman
    Austrian Institute Technology, Austria.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Berg, Amanda
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Syst Tekn AB, Linkoping, Sweden.
    Eldesokey, Abdelrahman
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Syst Tekn AB, Linkoping, Sweden.
    Cehovin, Luka
    University of Ljubljana, Slovenia.
    Vojir, Tomas
    Czech Technical University, Czech Republic.
    Lukezic, Alan
    University of Ljubljana, Slovenia.
    Fernandez, Gustavo
    Austrian Institute Technology, Austria.
    Petrosino, Alfredo
    Parthenope University of Naples, Italy.
    Garcia-Martin, Alvaro
    University of Autonoma Madrid, Spain.
    Solis Montero, Andres
    University of Ottawa, Canada.
    Varfolomieiev, Anton
    Kyiv Polytech Institute, Ukraine.
    Erdem, Aykut
    Hacettepe University, Turkey.
    Han, Bohyung
    POSTECH, South Korea.
    Chang, Chang-Ming
    University of Albany, GA USA.
    Du, Dawei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Erdem, Erkut
    Hacettepe University, Turkey.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Porikli, Fatih
    ARC Centre Excellence Robot Vis, Australia; CSIRO, Australia.
    Zhao, Fei
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bunyak, Filiz
    University of Missouri, MO 65211 USA.
    Battistone, Francesco
    Parthenope University of Naples, Italy.
    Zhu, Gao
    University of Missouri, Columbia, USA.
    Seetharaman, Guna
    US Navy, DC 20375 USA.
    Li, Hongdong
    ARC Centre Excellence Robot Vis, Australia.
    Qi, Honggang
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Bischof, Horst
    Graz University of Technology, Austria.
    Possegger, Horst
    Graz University of Technology, Austria.
    Nam, Hyeonseob
    NAVER Corp, South Korea.
    Valmadre, Jack
    University of Oxford, England.
    Zhu, Jianke
    Zhejiang University, Peoples R China.
    Feng, Jiayi
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Lang, Jochen
    University of Ottawa, Canada.
    Martinez, Jose M.
    University of Autonoma Madrid, Spain.
    Palaniappan, Kannappan
    University of Missouri, MO 65211 USA.
    Lebeda, Karel
    University of Surrey, England.
    Gao, Ke
    University of Missouri, MO 65211 USA.
    Mikolajczyk, Krystian
    Imperial Coll London, England.
    Wen, Longyin
    University of Albany, GA USA.
    Bertinetto, Luca
    University of Oxford, England.
    Poostchi, Mahdieh
    University of Missouri, MO 65211 USA.
    Maresca, Mario
    Parthenope University of Naples, Italy.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arens, Michael
    Fraunhofer IOSB, Germany.
    Tang, Ming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Baek, Mooyeol
    POSTECH, South Korea.
    Fan, Nana
    Harbin Institute Technology, Peoples R China.
    Al-Shakarji, Noor
    University of Missouri, MO 65211 USA.
    Miksik, Ondrej
    University of Oxford, England.
    Akin, Osman
    Hacettepe University, Turkey.
    Torr, Philip H. S.
    University of Oxford, England.
    Huang, Qingming
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Martin-Nieto, Rafael
    University of Autonoma Madrid, Spain.
    Pelapur, Rengarajan
    University of Missouri, MO 65211 USA.
    Bowden, Richard
    University of Surrey, England.
    Laganiere, Robert
    University of Ottawa, Canada.
    Krah, Sebastian B.
    Fraunhofer IOSB, Germany.
    Li, Shengkun
    University of Albany, GA USA.
    Yao, Shizeng
    University of Missouri, MO 65211 USA.
    Hadfield, Simon
    University of Surrey, England.
    Lyu, Siwei
    University of Albany, GA USA.
    Becker, Stefan
    Fraunhofer IOSB, Germany.
    Golodetz, Stuart
    University of Oxford, England.
    Hu, Tao
    Australian National University, Australia; Chinese Academic Science, Peoples R China.
    Mauthner, Thomas
    Graz University of Technology, Austria.
    Santopietro, Vincenzo
    Parthenope University of Naples, Italy.
    Li, Wenbo
    Lehigh University, PA 18015 USA.
    Huebner, Wolfgang
    Fraunhofer IOSB, Germany.
    Li, Xin
    Harbin Institute Technology, Peoples R China.
    Li, Yang
    Zhejiang University, Peoples R China.
    Xu, Zhan
    Zhejiang University, Peoples R China.
    He, Zhenyu
    Harbin Institute Technology, Peoples R China.
    The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results (2016). In: Computer Vision – ECCV 2016 Workshops. ECCV 2016 / [ed] Hua G., Jégou H., SPRINGER INT PUBLISHING AG, 2016, p. 824-849. Conference paper (Refereed).
    Abstract [en]

    The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge; the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, the VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensates for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.

  • 114.
    Felsberg, Michael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Lenz, Reiner
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Unbiased decoding of biologically motivated visual feature descriptors (2015). In: Frontiers in Robotics and AI, ISSN 2296-9144, Vol. 2, no. 20. Article in journal (Refereed).
    Abstract [en]

    Visual feature descriptors are essential elements in most computer and robot vision systems. They typically lead to an abstraction of the input data, images, or video, for further processing, such as clustering and machine learning. In clustering applications, the cluster center represents the prototypical descriptor of the cluster and estimates the corresponding signal value, such as color value or dominating flow orientation, by decoding the prototypical descriptor. Machine learning applications determine the relevance of respective descriptors, and a visualization of the corresponding decoded information is very useful for the analysis of the learning algorithm. Thus, decoding of feature descriptors is a relevant problem, frequently addressed in recent work. Also, the human brain represents sensorimotor information at a suitable abstraction level through varying activation of neuron populations. In previous work, computational models have been derived that agree with findings of neurophysiological experiments on the representation of visual information by decoding the underlying signals. However, the represented variables have a bias toward centers or boundaries of the tuning curves. Despite the fact that feature descriptors in computer vision are motivated from neuroscience, the respective decoding methods have been derived largely independently. From first principles, we derive unbiased decoding schemes for biologically motivated feature descriptors with a minimum amount of redundancy and suitable invariance properties. These descriptors establish a non-parametric density estimation of the underlying stochastic process with a particular algebraic structure. Based on the resulting algebraic constraints, we show formally how the decoding problem is formulated as an unbiased maximum likelihood estimator, and we derive a recurrent inverse diffusion scheme to infer the dominating mode of the distribution. These methods are evaluated in experiments, where stationary points and bias from noisy image data are compared to existing methods.

  • 115.
    Feng, Bin
    et al.
    Information Networking Institute, Carnegie Mellon University, Pittsburgh, 15213, USA.
    Liu, Yang
    School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, China.
    An Improved RRT Based Path Planning with Safe Navigation (2014). In: Proceedings from the 2013 International Symposium on Vehicle, Mechanical, and Electrical Engineering, ISVMEE 2013, Taiwan, China, 21-22 December, 2014, Vol. 494-495, p. 1080-1083. Conference paper (Refereed).
    Abstract [en]

    Path planning has been a crucial problem in robotics research. Some algorithms, such as Probabilistic Roadmaps (PRM) and Rapidly-exploring Random Trees (RRT), have been proposed to tackle this problem. However, these algorithms have their own limitations when applied in a real-world domain, such as the RoboCup Small-Size League (SSL) competition. This paper proposes a novel improvement to the existing RRT algorithm to make it more applicable in the real world.
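
    A minimal sketch of the textbook RRT loop that the paper improves upon (the paper's contribution is a safety-oriented modification for the RoboCup SSL domain, which is not reproduced here). Obstacles are circles and all numeric parameters are illustrative assumptions.

    import math
    import random

    def rrt(start, goal, obstacles, step=0.5, goal_tol=0.5, max_iters=5000, bounds=(0, 10)):
        """Return a list of points from start towards goal, or None if no path is found."""
        nodes = [start]
        parents = {0: None}
        for _ in range(max_iters):
            # Sample a random point (with a small bias towards the goal).
            target = goal if random.random() < 0.1 else (
                random.uniform(*bounds), random.uniform(*bounds))
            # Find the nearest tree node and step towards the sample.
            i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
            near = nodes[i_near]
            d = math.dist(near, target)
            if d == 0:
                continue
            new = (near[0] + step * (target[0] - near[0]) / d,
                   near[1] + step * (target[1] - near[1]) / d)
            # Reject the extension if it lands inside an obstacle.
            if any(math.dist(new, c) < r for c, r in obstacles):
                continue
            parents[len(nodes)] = i_near
            nodes.append(new)
            if math.dist(new, goal) < goal_tol:
                # Walk back through the parent pointers to recover the path.
                path, i = [], len(nodes) - 1
                while i is not None:
                    path.append(nodes[i])
                    i = parents[i]
                return path[::-1]
        return None

    obstacles = [((5.0, 5.0), 1.5)]          # one circular obstacle: (centre, radius)
    print(rrt((1.0, 1.0), (9.0, 9.0), obstacles))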

  • 116.
    Feng, Louis
    et al.
    University of California, Davis.
    Hotz, Ingrid
    Zuse Institute Berlin.
    Hamann, Bernd
    University of California, Davis, USA.
    Joy, Ken
    University of California, Davis, USA.
    Dense Glyph Sampling for Visualization (2008). In: Visualization and Processing of Tensor Fields: Advances and Perspectives / [ed] David Laidlaw, Joachim Weickert, Springer, 2008, p. 177-193. Chapter in book (Refereed).
    Abstract [en]

    We present a simple and efficient approach to generate a dense set of anisotropic, spatially varying glyphs over a two-dimensional domain. Such glyph samples are useful for many visualization and graphics applications. The glyphs are embedded in a set of nonoverlapping ellipses whose size and density match a given anisotropic metric. An additional parameter controls the arrangement of the ellipses on lines, which can be favorable for some applications, for example vector fields, and distracting for others. To generate samples with the desired properties, we combine ideas from sampling theory and mesh generation. We start with constructing a first set of nonoverlapping ellipses whose distribution closely matches the underlying metric. This set of samples is used as input for a generalized anisotropic Lloyd relaxation to distribute samples more evenly.

  • 117.
    Foroughi Mobarakeh, Taraneh
    Linköping University, Department of Science and Technology.
    Analysis of RED ONE Digital Cinema Camera and RED Workflow (2009). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    RED Digital Cinema is a rather new company that has developed a camera that has shaken the world of the film industry, the RED One camera. RED One is a digital cinema camera with the characteristics of a 35mm film camera. With a custom made 12 megapixel CMOS sensor it offers images with a filmic look that cannot be achieved with many other digital cinema cameras.

    With a new camera comes a new set of media files to work with, which brings new software applications supporting them. RED Digital Cinema has developed several applications of their own, but there are also a few other software packages that support RED. However, as of today, workflows for handling RED media files with these applications are still a work in progress.

    During the short time that RED One has existed, many questions have arisen about which workflow is best to use. This thesis presents a theoretical background of the RED camera and some software applications supporting RED media files. The main objective is to analyze RED material as well as existing workflows and find the optimal option.

  • 118.
    Forsberg, Daniel
    Linköping University, Department of Electrical Engineering.
    An efficient wavelet representation for large medical image stacks (2007). Independent thesis, Advanced level (degree of Magister), 20 points / 30 hp. Student thesis.
    Abstract [en]

    Like the rest of society, modern health care has to deal with an ever-increasing flow of information. Imaging modalities such as CT, MRI, US, SPECT and PET keep producing more and more data. CT and MRI in particular, with their 3D image stacks, cause problems in terms of how to handle these data sets effectively. Usually a PACS is used to manage the information flow. Since a PACS is often implemented with a server-client setup, the management of these large data sets requires an efficient representation of medical image stacks that minimizes the amount of data transmitted between server and client and that efficiently supports the workflow of a practitioner.

    In this thesis an efficient wavelet representation for large medical image stacks is proposed for the use in a PACS. The representation supports features such as lossless viewing, random access, ROI-viewing, scalable resolution, thick slab viewing and progressive transmission. All of these features are believed to be essential to form an efficient tool for navigation and reconstruction of an image stack.

    The proposed wavelet representation has also been implemented and found to be better than prior solutions in terms of memory allocation and the amount of data transmitted between server and client. Performance tests of the implementation have also shown that the proposed wavelet representation has good computational performance.
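
    A minimal PyWavelets-based sketch of the scalable-resolution idea in the abstract: a multilevel wavelet decomposition lets a client fetch a coarse approximation first and add detail bands progressively. The thesis representation additionally covers ROI access, thick-slab viewing and lossless transfer, which this sketch does not.

    import numpy as np
    import pywt

    slice_img = np.random.rand(256, 256)                 # stand-in for a CT slice
    coeffs = pywt.wavedec2(slice_img, wavelet='haar', level=3)

    # coeffs[0] is the low-resolution approximation; coeffs[1:] are detail bands.
    low_res = coeffs[0]
    print(low_res.shape)                                  # (32, 32): cheap preview

    # Progressive refinement: zero the detail bands not yet transmitted and invert.
    partial = [coeffs[0]] + [tuple(np.zeros_like(d) for d in band) for band in coeffs[1:]]
    preview = pywt.waverec2(partial, wavelet='haar')
    print(preview.shape)                                  # (256, 256) approximation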

  • 119.
    Forslöw, Nicklas
    Linköping University, Department of Electrical Engineering, Automatic Control.
    Estimation and Adaptive Smoothing of Camera Orientations for Video Stabilization and Rolling Shutter Correction (2011). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a rolling shutter camera. The camera captures video by recording every frame line by line from top to bottom, leading to image distortion when either the target or the camera is moving. Capturing video by hand also leads to visible frame-to-frame jitter.

    This thesis presents algorithms for estimating camera orientations using the accelerometer and gyroscope. These estimates can be used to reduce the image distortion caused by camera motion using image processing. In addition, an adaptive low-pass filtering algorithm used to produce a smooth camera motion is presented. Using the smoothed motion, the frame-to-frame jitter can be reduced.

    The algorithms are implemented on the iPod 4, and two output videos are evaluated in a blind experiment with 30 participants. Here, the videos are compared to those of competing video stabilization software. The results indicate that the iPod 4 application performs as well as or better than its competitors. The iPod 4 accelerometer and gyroscope are also compared to high-end reference sensors in terms of bias and variance.
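
    A minimal one-dimensional sketch of the two steps the abstract describes: integrating gyroscope rates into a camera orientation and low-pass filtering it to obtain the smooth virtual camera used for stabilization. The thesis works with full 3D orientations and an adaptive filter; the sampling rate and filter constant below are illustrative assumptions.

    import numpy as np

    def integrate_gyro(rates, dt):
        """Cumulative orientation angle (rad) from angular-rate samples (rad/s)."""
        return np.cumsum(rates) * dt

    def low_pass(angles, alpha=0.1):
        """Exponential moving average; a smaller alpha gives a smoother camera path."""
        smooth = np.empty_like(angles)
        smooth[0] = angles[0]
        for i in range(1, len(angles)):
            smooth[i] = (1 - alpha) * smooth[i - 1] + alpha * angles[i]
        return smooth

    dt = 1.0 / 100.0                                   # 100 Hz gyroscope
    rng = np.random.default_rng(0)
    rates = 0.2 * np.sin(np.linspace(0, 10, 1000)) + 0.5 * rng.normal(size=1000)

    raw = integrate_gyro(rates, dt)                    # jittery hand-held motion
    smooth = low_pass(raw, alpha=0.05)                 # desired smooth camera motion
    correction = smooth - raw                          # rotation to apply to each frame
    print(correction[:5])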

  • 120.
    Åström, Freddie
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Lenz, Reiner
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, The Institute of Technology.
    Color Persistent Anisotropic Diffusion of Images (2011). In: Image Analysis / [ed] Anders Heyden, Fredrik Kahl, Heidelberg: Springer, 2011, p. 262-272. Conference paper (Refereed).
    Abstract [en]

    Techniques from the theory of partial differential equations are often used to design filter methods that are locally adapted to the image structure. These techniques are usually used in the investigation of gray-value images. The extension to color images is non-trivial, where the choice of an appropriate color space is crucial. The RGB color space is often used although it is known that the space of human color perception is best described in terms of non-Euclidean geometry, which is fundamentally different from the structure of the RGB space. Instead of the standard RGB space, we use a simple color transformation based on the theory of finite groups. It is shown that this transformation reduces the color artifacts originating from the diffusion processes on RGB images. The developed algorithm is evaluated on a set of real-world images, and it is shown that our approach exhibits fewer color artifacts compared to state-of-the-art techniques. Also, our approach preserves details in the image for a larger number of iterations.

  • 121.
    Fridborn, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Reading Barcodes with Neural Networks (2017). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    Barcodes are ubiquitous in modern society and have had industrial applications for decades. However, modern methods can underperform on noisy images: poor lighting conditions, occlusions and low resolution can be problematic in decoding. This thesis aims to address this problem by using neural networks, which have enjoyed great success in many computer vision competitions in recent years. We investigate how three different networks perform on data sets with noisy images. The first network is a single classifier, the second network is an ensemble classifier, and the third is based on a pre-trained feature extractor. For comparison, we also test two baseline methods that are used in industry today. We generate training data using software and modify it to ensure proper generalization. Testing data is created by photographing barcodes in different settings, creating six image classes: normal, dark, white, rotated, occluded and wrinkled. The proposed single classifier and ensemble classifier outperform the baseline as well as the pre-trained feature extractor by a large margin. The thesis work was performed at SICK IVP, a machine vision company in Linköping, in 2017.

  • 122.
    Gasslander, Maja
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Segmentation of Clouds in Satellite Images (2016). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    The usage of 3D modelling is increasing fast, both for civilian and military areas, such as navigation, targeting and urban planning. When creating a 3D model from satellite images, clouds can be problematic. Thus, automatic detection of clouds in the images is of great use. This master thesis was carried out at Vricon, which produces 3D models of the earth from satellite images. The thesis aimed to investigate whether Support Vector Machines could classify pixels into cloud or non-cloud, with a combination of texture and color as features. To solve the stated goal, the task was divided into several subproblems, where the first part was to extract features from the images. Then the images were preprocessed before being fed to the classifier. After that, the classifier was trained, and finally evaluated. The two methods that gave the best results in this thesis had approximately 95 % correctly classified pixels. This result is better than the existing cloud segmentation method at Vricon, for the tested terrain and cloud types.
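
    A minimal scikit-learn sketch of the classification step in the abstract: an SVM labels each pixel as cloud or non-cloud from a per-pixel feature vector. Only colour features and synthetic training data are used here; the thesis also uses texture features and real satellite imagery.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Stand-in training data: bright, low-saturation pixels act as "cloud".
    cloud_pixels = rng.normal(loc=[0.9, 0.9, 0.9], scale=0.05, size=(500, 3))
    ground_pixels = rng.normal(loc=[0.3, 0.4, 0.2], scale=0.1, size=(500, 3))
    X = np.vstack([cloud_pixels, ground_pixels])
    y = np.concatenate([np.ones(500), np.zeros(500)])   # 1 = cloud, 0 = non-cloud

    clf = SVC(kernel='rbf', C=1.0, gamma='scale')
    clf.fit(X, y)

    # Classify every pixel of a (toy) satellite image.
    image = rng.random((64, 64, 3))
    labels = clf.predict(image.reshape(-1, 3)).reshape(64, 64)
    print(labels.mean())        # fraction of pixels classified as cloud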

  • 123.
    Gladh, Susanna
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Danelljan, Martin
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Khan, Fahad Shahbaz
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Felsberg, Michael
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Deep motion features for visual tracking (2016). In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR), 2016, Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1243-1248. Conference paper (Refereed).
    Abstract [en]

    Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
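
    A minimal OpenCV/NumPy sketch of the fusion idea in the abstract: compute dense optical flow between consecutive frames and stack appearance features with motion-derived features before the detection stage. The paper feeds flow images through a pretrained action-recognition CNN; here a simple Sobel response and the raw flow magnitude stand in for deep appearance and motion features.

    import cv2
    import numpy as np

    prev = (np.random.rand(120, 160) * 255).astype(np.uint8)   # stand-in frames
    curr = np.roll(prev, shift=3, axis=1)                       # simple horizontal motion

    # Farneback dense optical flow; arguments are
    # (prev, next, flow, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags).
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)                    # motion cue per pixel

    appearance = cv2.Sobel(curr, cv2.CV_32F, 1, 0)              # crude appearance feature
    fused = np.stack([appearance, magnitude.astype(np.float32)], axis=0)
    print(fused.shape)                                          # (2, 120, 160) feature map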

  • 124.
    Granlund, Gösta H.
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    A Nonlinear, Image-content Dependent Measure of Image Quality (1977). Report (Other academic).
    Abstract [en]

    In recent years, considerable research effort has been devoted to the development of useful descriptors for image quality. The attempts have been hampered by incomplete understanding of the operation of the human visual system. This has made it difficult to relate physical measures and perceptual traits.

    A new model for determination of image quality is proposed. Its main feature is that it tries to take image content into consideration. The model builds upon a theory of image linearization, which means that the information in an image can be represented well enough using linear segments or structures within local spatial regions and frequency ranges. This also suggests that information in an image has to do with one-dimensional correlations. This makes it possible to separate image content from noise in images, and to measure them both.

    It is also hypothesized that the human visual system does in fact perform such a linearization.

  • 125.
    Gratorp, Christina
    Linköping University, Department of Electrical Engineering.
    Bitrate smoothing: a study on traffic shaping and analysis in data networks (2007). Independent thesis, Basic level (professional degree), 20 points / 30 hp. Student thesis.
    Abstract [sv]

    The thesis work behind this report is an investigative study of how the transmission of media data over networks can be made more efficient. This can be achieved by adding certain supplementary information, intended to even out the data rate, to the real-time protocol (Real Time Protocol) used for streaming media. By attempting to send equal amounts of data in every consecutive time interval of the session, the data rate at an arbitrary point in time is more likely to be the same as at earlier points in time. A streaming server can interpret, handle and forward data according to the instructions in the protocol header. The data rate is evened out by sending later parts of the stream ahead of time, during intervals that contain less data. The result is a smoothed data-rate curve, which in turn leads to more even use of the network capacity.

    The work includes an overview analysis of the behaviour of streaming media, background theory on file construction and network technologies, and a proposal for how media files can be modified to fulfil the purpose of the thesis. The results and discussion can hopefully serve as a basis for a future implementation of an application intended to improve traffic flows over networks.
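
    A minimal sketch of the smoothing idea in the (translated) abstract: if the whole media file is known in advance, the sender can transmit a constant amount per interval and track how far ahead of playout it runs. The schedule and bookkeeping below are illustrative only; the thesis realises the smoothing by adding pacing information to the RTP stream.

    def constant_rate_schedule(interval_bytes):
        """Send the same amount every interval; report how far ahead of playout we run."""
        n = len(interval_bytes)
        target = sum(interval_bytes) / n
        sent_ahead = []
        cum_sent = cum_play = 0.0
        for b in interval_bytes:
            cum_sent += target
            cum_play += b
            sent_ahead.append(cum_sent - cum_play)   # >0: data delivered before it is needed
        return [target] * n, sent_ahead

    sizes = [10, 80, 20, 90, 0, 40]            # bytes per interval before smoothing
    schedule, ahead = constant_rate_schedule(sizes)
    print(schedule)   # flat data rate, same total amount of data
    print(ahead)      # negative values mark intervals whose data must be moved
                      # even earlier in the stream (or pre-buffered) to stay on time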

  • 126.
    Grelsson, Bertil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Global Pose Estimation from Aerial Images: Registration with Elevation Models (2014). Licentiate thesis, comprehensive summary (Other academic).
    Abstract [en]

    Over the last decade, the use of unmanned aerial vehicles (UAVs) has increased drastically. Originally, the use of these aircraft was mainly military, but today many civil applications have emerged. UAVs are frequently the preferred choice for surveillance missions in disaster areas, after earthquakes or hurricanes, and in hazardous environments, e.g. for detection of nuclear radiation. The UAVs employed in these missions are often relatively small in size, which implies payload restrictions.

    For navigation of the UAVs, continuous global pose (position and attitude) estimation is mandatory. Cameras can be fabricated both small in size and light in weight. This makes vision-based methods well suited for pose estimation onboard these vehicles. It is obvious that no single method can be used for pose estimation in all different phases throughout a flight. The image content will be very different on the runway, during ascent, during flight at low or high altitude, above urban or rural areas, etc. In total, a multitude of pose estimation methods is required to handle all these situations. Over the years, a large number of vision-based pose estimation methods for aerial images have been developed. But there are still open research areas within this field, e.g. the use of omnidirectional images for pose estimation is relatively unexplored.

    The contributions of this thesis are three vision-based methods for global ego-positioning and/or attitude estimation from aerial images. The first method for full 6DoF (degrees of freedom) pose estimation is based on registration of local height information with a geo-referenced 3D model. A dense local height map is computed using motion stereo. A pose estimate from navigation sensors is used as an initialization. The global pose is inferred from the 3D similarity transform between the local height map and the 3D model. Aligning height information is assumed to be more robust to season variations than feature matching in a single-view based approach.

    The second contribution is a method for attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods in the literature that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some knowledge on the probability density for the altitude and the attitude angles. The proposed method allows this prior information to be used to make the attitude estimation more robust.

    The third contribution is a further development of method two. It is the very first method presented where the attitude estimates from the detected horizon in omnidirectional images are refined through registration with the geometrically expected horizon from a digital elevation model. It is one of few methods where the ray refraction in the atmosphere is taken into account, which contributes to the highly accurate pose estimates. The attitude errors obtained are about one order of magnitude smaller than for any previous vision-based method for attitude estimation from horizon detection in aerial images.

    List of papers
    1. Efficient 7D Aerial Pose Estimation
    2013 (English). In: 2013 IEEE Workshop on Robot Vision (WORV), IEEE, 2013, p. 88-95. Conference paper, Published paper (Refereed).
    Abstract [en]

    A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization, and our method is more robust than competing methods on similar datasets. The proposed matching error computed between the transformed patch and the map clearly indicates whether a reliable pose estimate has been obtained.

    Place, publisher, year, edition, pages
    IEEE, 2013
    Keywords
    Pose estimation, aerial images, registration, 3D model
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-89477 (URN), 10.1109/WORV.2013.6521919 (DOI), 000325279400014 (), 978-1-4673-5646-6 (ISBN), 978-1-4673-5647-3 (ISBN)
    Conference
    IEEE Workshop on Robot Vision 2013, Clearwater Beach, Florida, USA, January 16-17, 2013
    Available from: 2013-02-26 Created: 2013-02-26 Last updated: 2019-04-12
    2. Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images
    2013 (English). In: Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013. Proceedings / [ed] Joni-Kristian Kämäräinen and Markus Koskela, Springer Berlin/Heidelberg, 2013, p. 478-488. Conference paper, Published paper (Refereed).
    Abstract [en]

    For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme.  In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 7944
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-98066 (URN), 10.1007/978-3-642-38886-6_45 (DOI), 000342988500045 (), 978-3-642-38885-9 (ISBN), 978-3-642-38886-6 (ISBN)
    Conference
    18th Scandinavian Conferences on Image Analysis (SCIA 2013), 17-20 June 2013, Espoo, Finland.
    Projects
    CIMSMAP
    Available from: 2013-09-27 Created: 2013-09-27 Last updated: 2019-04-12Bibliographically approved
    3. Highly Accurate Attitude Estimation via Horizon Detection
    2016 (English)In: Journal of Field Robotics, ISSN 1556-4959, E-ISSN 1556-4967, Vol. 33, no 7, p. 967-993Article in journal (Refereed) Published
    Abstract [en]

    Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.
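
    For intuition about why refraction matters at these accuracy levels, a small numpy sketch of the horizon dip angle with and without a standard refraction correction; the refraction coefficient value and the effective-radius model are textbook approximations, not necessarily the exact atmospheric model used in the paper.

        import numpy as np

        R_EARTH = 6.371e6       # mean Earth radius [m]
        K_REFRACTION = 0.13     # typical terrestrial refraction coefficient (assumed value)

        def horizon_dip_deg(altitude_m, k=0.0):
            # dip of the sea-level horizon below the local horizontal;
            # refraction is modelled with an effective Earth radius R / (1 - k)
            r_eff = R_EARTH / (1.0 - k)
            return np.degrees(np.sqrt(2.0 * altitude_m / r_eff))

        # at 600 m: roughly 0.79 deg geometrically and 0.73 deg with k = 0.13,
        # i.e. the correction is far larger than the reported 0.02 deg mean errors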

    Place, publisher, year, edition, pages
    John Wiley & Sons, 2016
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-108212 (URN)10.1002/rob.21639 (DOI)000387925400005 ()
    Note

    At the date of the thesis presentation, this publication was a manuscript.

    Funding agencies: Swedish Governmental Agency for Innovation Systems, VINNOVA [NFFP5 2013-05243]; Swedish Foundation for Strategic Research [RIT10-0047]; Swedish Research Council within the Linnaeus environment CADICS; Knut and Alice Wallenberg Foundation

    Available from: 2014-06-26 Created: 2014-06-26 Last updated: 2019-04-12Bibliographically approved
  • 127.
    Grelsson, Bertil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Vision-based Localization and Attitude Estimation Methods in Natural Environments2019Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Over the last decade, the usage of unmanned systems such as Unmanned Aerial Vehicles (UAVs), Unmanned Surface Vessels (USVs) and Unmanned Ground Vehicles (UGVs) has increased drastically, and there is still a rapid growth. Today, unmanned systems are being deployed in many daily operations, e.g. for deliveries in remote areas, to increase efficiency of agriculture, and for environmental monitoring at sea. For safety reasons, unmanned systems are often the preferred choice for surveillance missions in hazardous environments, e.g. for detection of nuclear radiation, and in disaster areas after earthquakes, hurricanes, or during forest fires. For safe navigation of the unmanned systems during their missions, continuous and accurate global localization and attitude estimation is mandatory.

    Over the years, many vision-based methods for position estimation have been developed, primarily for urban areas. In contrast, this thesis is mainly focused on vision-based methods for accurate position and attitude estimates in natural environments, i.e. beyond the urban areas. Vision-based methods possess several characteristics that make them appealing as global position and attitude sensors. First, vision sensors can be realized and tailored for most unmanned vehicle applications. Second, geo-referenced terrain models can be generated worldwide from satellite imagery and can be stored onboard the vehicles. In natural environments, where the availability of geo-referenced images in general is low, registration of image information with terrain models is the natural choice for position and attitude estimation. This is the problem area that I addressed in the contributions of this thesis.

    The first contribution is a method for full 6DoF (degrees of freedom) pose estimation from aerial images. A dense local height map is computed using structure from motion. The global pose is inferred from the 3D similarity transform between the local height map and a digital elevation model. Aligning height information is assumed to be more robust to seasonal variations than feature-based matching.

    The second contribution is a method for accurate attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. The method allows prior knowledge of the attitude angles to be exploited to make the initial attitude estimates more robust. The estimates are then refined through registration with the geometrically expected horizon line from a digital elevation model. To the best of our knowledge, it is the first method where the ray refraction in the atmosphere is taken into account, which enables the highly accurate attitude estimates.

    The third contribution is a method for position estimation based on horizon detection in an omnidirectional panoramic image around a surface vessel. Two convolutional neural networks (CNNs) are designed and trained to estimate the camera orientation and to segment the horizon line in the image. The MOSSE correlation filter, normally used in visual object tracking, is adapted to horizon line registration with geometric data from a digital elevation model. Comprehensive field trials conducted in the archipelago demonstrate the GPS-level accuracy of the method, and that the method can be trained on images from one region and then applied to images from a previously unvisited test area.
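
    The MOSSE filter referred to above is a standard closed-form correlation filter (Bolme et al.); below is a minimal numpy sketch of how such a filter could be trained and applied to register a horizon-line image against a response map rendered from elevation data. The training-pair format and regularization value are assumptions, not the thesis' exact adaptation.

        import numpy as np

        def train_mosse(templates, targets, lam=1e-2):
            # templates: list of 2D training images; targets: desired correlation response maps
            num = np.zeros_like(np.fft.fft2(templates[0]))
            den = np.zeros_like(num)
            for f, g in zip(templates, targets):
                F, G = np.fft.fft2(f), np.fft.fft2(g)
                num += G * np.conj(F)
                den += F * np.conj(F) + lam
            return num / den                      # conjugate filter H* in the Fourier domain

        def register(h_conj, image):
            # the correlation peak location gives the registration offset
            response = np.real(np.fft.ifft2(np.fft.fft2(image) * h_conj))
            return np.unravel_index(np.argmax(response), response.shape)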

    The CNNs in the third contribution apply the typical scheme of convolutions, activations, and pooling. The fourth contribution focuses on the activations and suggests a new formulation to tune and optimize a piecewise linear activation function during training of CNNs. Improved classification results from experiments when tuning the activation function led to the introduction of a new activation function, the Shifted Exponential Linear Unit (ShELU).

    List of papers
    1. Efficient 7D Aerial Pose Estimation
    2013 (English)In: 2013 IEEE Workshop on Robot Vision (WORV), IEEE , 2013, p. 88-95Conference paper, Published paper (Refereed)
    Abstract [en]

    A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization and our method is more robust than competing methods on similar datasets. The proposed matching error computed between the transformed patch and the map clearly indicates whether a reliable pose estimate has been obtained.

    Place, publisher, year, edition, pages
    IEEE, 2013
    Keywords
    Pose estimation, aerial images, registration, 3D model
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-89477 (URN)10.1109/WORV.2013.6521919 (DOI)000325279400014 ()978-1-4673-5646-6 (ISBN)978-1-4673-5647-3 (ISBN)
    Conference
    IEEE Workshop on Robot Vision 2013, Clearwater Beach, Florida, USA, January 16-17, 2013
    Available from: 2013-02-26 Created: 2013-02-26 Last updated: 2019-04-12
    2. Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images
    2013 (English)In: Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013. Proceedings / [ed] Joni-Kristian Kämäräinen and Markus Koskela, Springer Berlin/Heidelberg, 2013, p. 478-488Conference paper, Published paper (Refereed)
    Abstract [en]

    For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme.  In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 7944
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-98066 (URN)10.1007/978-3-642-38886-6_45 (DOI)000342988500045 ()978-3-642-38885-9 (ISBN)978-3-642-38886-6 (ISBN)
    Conference
    18th Scandinavian Conferences on Image Analysis (SCIA 2013), 17-20 June 2013, Espoo, Finland.
    Projects
    CIMSMAP
    Available from: 2013-09-27 Created: 2013-09-27 Last updated: 2019-04-12Bibliographically approved
    3. Highly Accurate Attitude Estimation via Horizon Detection
    2016 (English)In: Journal of Field Robotics, ISSN 1556-4959, E-ISSN 1556-4967, Vol. 33, no 7, p. 967-993Article in journal (Refereed) Published
    Abstract [en]

    Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.

    Place, publisher, year, edition, pages
    John Wiley & Sons, 2016
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-108212 (URN)10.1002/rob.21639 (DOI)000387925400005 ()
    Note

    At the date of the thesis presentation, this publication was a manuscript.

    Funding agencies: Swedish Governmental Agency for Innovation Systems, VINNOVA [NFFP5 2013-05243]; Swedish Foundation for Strategic Research [RIT10-0047]; Swedish Research Council within the Linnaeus environment CADICS; Knut and Alice Wallenberg Foundation

    Available from: 2014-06-26 Created: 2014-06-26 Last updated: 2019-04-12Bibliographically approved
    4. Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs)
    2018 (English)In: 2018 24th International Conference on Pattern Recognition (ICPR), IEEE, 2018, p. 517-522Conference paper, Published paper (Refereed)
    Abstract [en]

    The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs and it is continuously differentiable. However, it remains open whether ELU has the optimal shape and we address the quest for a superior activation function. We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting, learned activation function turns out to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function. Experiments on Cifar-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.
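
    A minimal numpy sketch of a shifted ELU of the kind described: an ELU evaluated at an input offset by a learned per-neuron bias shift. The exact parameterisation used in the paper may differ.

        import numpy as np

        def shelu(x, shift, alpha=1.0):
            # shift: learned per-neuron offset, here simply a given array broadcastable to x
            z = x + shift
            return np.where(z > 0.0, z, alpha * (np.exp(z) - 1.0))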

    Place, publisher, year, edition, pages
    IEEE, 2018
    Series
    International Conference on Pattern Recognition
    Keywords
    CNN, activation function
    National Category
    Other Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-151606 (URN)10.1109/ICPR.2018.8545104 (DOI)000455146800087 ()978-1-5386-3787-6 (ISBN)
    Conference
    24th International Conference on Pattern Recognition, ICPR 2018, Beijing, China, 20-24 Aug. 2018
    Funder
    Wallenberg Foundations
    Note

    Funding agencies:  Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation; Swedish Research Council [2014-6227]

    Available from: 2018-09-27 Created: 2018-09-27 Last updated: 2019-04-12
  • 128.
    Grelsson, Bertil
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Robinson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Incept Inst Artificial Intelligence, U Arab Emirates.
    Signal Reconstruction Performance under Quantized Noisy Compressed Sensing2018In: 2019 DATA COMPRESSION CONFERENCE (DCC), IEEE , 2018, p. 149-155Conference paper (Refereed)
    Abstract [en]

    n/a

  • 129.
    Grönlund, Jakob
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Angelina
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Defect Detection and OCR on Steel2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In large scale productions of metal sheets, it is important to maintain an effective way to continuously inspect the products passing through the production line. The inspection mainly consists of detection of defects and tracking of ID numbers. This thesis investigates the possibilities to create an automatic inspection system by evaluating different machine learning algorithms for defect detection and optical character recognition (OCR) on metal sheet data. Digit recognition and defect detection are solved separately, where the former compares the object detection algorithm Faster R-CNN and the classical machine learning algorithm NCGF, and the latter is based on unsupervised learning using a convolutional autoencoder (CAE).

    The advantage of the feature extraction method is that it only needs a couple of samples to be able to classify new digits, which is desirable in this case due to the lack of training data. Faster R-CNN, on the other hand, needs much more training data to solve the same problem. NCGF does however fail to classify noisy images and images of metal sheets containing an alloy, while Faster R-CNN seems to be a more promising solution with a final mean average precision of 98.59%.

    The CAE approach for defect detection showed promising results. The algorithm learned how to only reconstruct images without defects, resulting in reconstruction errors whenever a defect appears. The errors are initially classified using a basic thresholding approach, resulting in a 98.9% accuracy. However, this classifier requires supervised learning, which is why the clustering algorithm Gaussian mixture model (GMM) is investigated as well. The result shows that it should be possible to use GMM, but that it requires a lot of GPU resources to use it in an end-to-end solution with a CAE.
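
    A minimal sketch, assuming numpy and scikit-learn, of the two classification alternatives on top of autoencoder reconstruction errors: a supervised threshold, and an unsupervised two-component Gaussian mixture where the high-error component is taken as the defect class. The autoencoder itself is assumed given.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        def defect_scores(images, reconstructions):
            # per-image mean squared reconstruction error from the CAE
            return np.mean((images - reconstructions) ** 2, axis=(1, 2))

        def classify_by_threshold(scores, threshold):
            return scores > threshold                       # supervised: True = defect

        def classify_by_gmm(scores):
            # unsupervised: fit two Gaussians to the error distribution
            s = scores.reshape(-1, 1)
            gmm = GaussianMixture(n_components=2).fit(s)
            defect_component = np.argmax(gmm.means_.ravel())
            return gmm.predict(s) == defect_component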

  • 130.
    Grönwall, Christina
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering.
    Gustafsson, David
    FOI.
    Tolt, Gustav
    FOI.
    Steinvall, Ove
    FOI.
    Experiences from long-range passive and active imaging2015In: Proceedings of SPIE, 2015, Vol. 9649, p. 96490J-1-96490J-13Conference paper (Refereed)
    Abstract [en]

    We present algorithm evaluations for ATR of small sea vessels. The targets are at km distance from the sensors, which means that the algorithms have to deal with images affected by turbulence and mirage phenomena. We evaluate previously developed algorithms for registration of 3D-generating laser radar data. The evaluations indicate that some robustness to turbulence and mirage induced uncertainties can be handled by our probabilistic-based registration method.

    We also assess methods for target classification and target recognition on these new 3D data. An algorithm for detecting moving vessels in infrared image sequences is presented; it is based on optical flow estimation. Detection of a moving target with an unknown spectral signature in a maritime environment is a challenging problem due to camera motion, background clutter, turbulence and the presence of mirage. First, the optical flow caused by the camera motion is eliminated by estimating the global flow in the image. Second, connected regions containing significant motions that differ from the camera motion are extracted. It is assumed that motion caused by a moving vessel is more temporally stable than motion caused by mirage or turbulence. Furthermore, it is assumed that the motion caused by the vessel is more homogeneous with respect to both magnitude and orientation, than motion caused by mirage and turbulence. Sufficiently large connected regions with a flow of acceptable magnitude and orientation are considered target regions. The method is evaluated on newly collected sequences of SWIR and MWIR images, with varying targets, target ranges and background clutter.

    Finally we discuss a concept for combining passive and active imaging in an ATR process. The main steps are passive imaging for target detection, active imaging for target/background segmentation and a fusion of passive and active imaging for target recognition.
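
    A simplified numpy/scipy sketch of the flow-based detection step described above: remove the global camera motion, threshold the residual flow magnitude, and keep sufficiently large connected regions. The temporal-stability and homogeneity checks mentioned in the abstract are omitted, and the threshold values are placeholders.

        import numpy as np
        from scipy import ndimage

        def moving_regions(flow, mag_thresh=1.0, min_area=50):
            # flow: (H, W, 2) optical flow; the global (camera) motion is approximated by the median
            residual = flow - np.median(flow.reshape(-1, 2), axis=0)
            magnitude = np.linalg.norm(residual, axis=2)
            labels, n = ndimage.label(magnitude > mag_thresh)
            regions = [np.argwhere(labels == i) for i in range(1, n + 1)]
            return [r for r in regions if len(r) >= min_area]   # candidate target regions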

  • 131.
    Grönwall, Christina
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering. Swedish Def Res Agcy FOI, Div C4ISR, Linkoping, Sweden.
    Rydell, Joakim
    Swedish Def Res Agcy FOI, Div C4ISR, Linkoping, Sweden.
    Tulldahl, Michael
    Swedish Def Res Agcy FOI, Div C4ISR, Linkoping, Sweden.
    Zhang, Erik
    Saab Aeronaut, Linkoping, Sweden.
    Bissmarck, Fredrik
    Swedish Def Res Agcy FOI, Div C4ISR, Linkoping, Sweden.
    Bilock, Erika
    Swedish Def Res Agcy FOI, Div C4ISR, Linkoping, Sweden.
    Two Imaging Systems for Positioning and Navigation2017In: 2017 WORKSHOP ON RESEARCH, EDUCATION AND DEVELOPMENT OF UNMANNED AERIAL SYSTEMS (RED-UAS), IEEE , 2017, p. 120-125Conference paper (Refereed)
    Abstract [en]

    We present two approaches for using imaging sensors on-board small unmanned aerial systems (UAS) for positioning and navigation. Two types of sensors are used; laser scanners and a camera operating in the visual wavelengths. The laser scanners produce sparse 3D data that are registered to produce a local map. For the images from the video camera the optical flow and height estimates are fused and then matched with a geo-referenced aerial image. Both approaches include data from the inertial navigation system. The approaches can be used for accurate ego-positioning, and thus for navigation. The approaches are GPS independent and can work in GPS denied conditions, for example urban canyons, indoor environments, forest areas or while jammed. Applications are primarily within societal security and military defense.

  • 132.
    Grönwall, Christina
    et al.
    FOI.
    Tolt, Gustav
    FOI.
    Chevalier, Tomas
    FOI.
    Larsson, Håkan
    FOI.
    Spatial filtering for detection of partly occluded targets2011In: Optical Engineering: The Journal of SPIE, ISSN 0091-3286, E-ISSN 1560-2303, Vol. 50, no 4, p. 047201-1-047201-13Article in journal (Refereed)
    Abstract [en]

    A Bayesian approach for data reduction based on spatial filtering is proposed that enables detection of targets partly occluded by natural forest. The framework aims at creating a synergy between terrain mapping and target detection. It is demonstrated how spatial features can be extracted and combined in order to detect target samples in cluttered environments. In particular, it is illustrated how a priori scene information and assumptions about targets can be translated into algorithms for feature extraction. We also analyze the coupling between features and assumptions because it gives knowledge about which features are general enough to be useful in other environments and which are tailored for a specific situation. Two types of features are identified: nontarget indicators and target indicators. The filtering approach is based on a combination of several features. A theoretical framework for combining the features into a maximum likelihood classification scheme is presented. The approach is evaluated using data collected with a laser-based 3-D sensor in various forest environments with vehicles as targets. Over 70% of the target points are detected at a false-alarm rate of <1%. We also demonstrate how selecting different feature subsets influences the results.
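
    A minimal numpy sketch of a maximum likelihood combination of several per-point features under a conditional-independence assumption; the per-feature likelihood functions for the target and non-target classes are assumed given (e.g. learned from training data), and this is only the generic pattern, not the paper's specific feature set.

        import numpy as np

        def classify_points(features, target_pdfs, background_pdfs):
            # features: (N, K) array; *_pdfs: one likelihood function per feature column
            log_t = sum(np.log(pdf(features[:, k]) + 1e-12) for k, pdf in enumerate(target_pdfs))
            log_b = sum(np.log(pdf(features[:, k]) + 1e-12) for k, pdf in enumerate(background_pdfs))
            return log_t > log_b                            # True = target sample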

  • 133.
    Gustavson, Stefan
    et al.
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, The Institute of Technology.
    Strand, Robin
    Uppsala University, Sweden.
    Anti-aliased Euclidean distance transform2011In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 32, no 2, p. 252-257Article in journal (Refereed)
    Abstract [en]

    We present a modified distance measure for use with distance transforms of anti-aliased, area sampled grayscale images of arbitrary binary contours. The modified measure can be used in any vector-propagation Euclidean distance transform. Our test implementation in the traditional SSED8 algorithm shows a considerable improvement in accuracy and homogeneity of the distance field compared to a traditional binary image transform. At the expense of a 10x slowdown for a particular image resolution, we achieve an accuracy comparable to a binary transform on a supersampled image with 16 × 16 higher resolution, which would require 256 times more computations and memory.
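
    The core of the modified measure can be sketched as follows (numpy, under the simplifying assumption of a locally straight edge): an area-sampled pixel with coverage a in [0, 1] lies roughly 0.5 - a pixels from the true edge, measured along the image gradient direction. The published measure also handles curved edges and degenerate gradients, which are ignored here.

        import numpy as np

        def edge_offset_vector(a, grad):
            # a: anti-aliased coverage value of the pixel; grad: local image gradient (2-vector)
            g = np.asarray(grad, dtype=float)
            g = g / (np.linalg.norm(g) + 1e-12)
            return (0.5 - np.clip(a, 0.0, 1.0)) * g         # vector from the pixel centre to the edge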

  • 134.
    Günther, David
    et al.
    Zuse Institute Berlin.
    Reininghaus, Jan
    Zuse Institute Berlin.
    Wagner, J
    Jagiellonian University, Krakow, Poland.
    Hotz, Ingrid
    Zuse Institute Berlin.
    Memory-Efficient Computation of Persistent Homology for 3D Images using Discrete Morse Theory2011Conference paper (Refereed)
    Abstract [en]

    We propose a memory-efficient method that computes persistent homology for 3D gray-scale images. The basic idea is to compute the persistence of the induced Morse-Smale complex. Since in practice this complex is much smaller than the input data, significantly less memory is required for the subsequent computations. We propose a novel algorithm that efficiently extracts the Morse-Smale complex based on algorithms from discrete Morse theory. The proposed algorithm is thereby optimal with a computational complexity of O(n²). The persistence is then computed using the Morse-Smale complex by applying an existing algorithm with a good practical running time. We demonstrate that our method allows for the computation of persistent homology for large data on commodity hardware.

  • 135.
    Hallenberg, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Robot Tool Center Point Calibration using Computer Vision2007Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Today, tool center point calibration is mostly done by a manual procedure. The method is very time consuming and the result may vary depending on how skilled the operator is.

    This thesis proposes a new automated iterative method for tool center point calibration of industrial robots, by making use of computer vision and image processing techniques. The new method has several advantages over the manual calibration method. Experimental verifications have shown that the proposed method is much faster, still delivering a comparable or even better accuracy. The setup of the proposed method is very easy, only one USB camera connected to a laptop computer is needed and no contact with the robot tool is necessary during the calibration procedure.

    The method can be split into three different parts. Initially, the transformation between the robot wrist and the tool is determined by solving a closed loop of homogeneous transformations. Second, an image segmentation procedure is described for finding point correspondences on a rotation symmetric robot tool. The image segmentation part is necessary for performing a measurement of the camera-to-tool transformation with six degrees of freedom. The last part of the proposed method is an iterative procedure which automates an ordinary four point tool center point calibration algorithm. The iterative procedure ensures that the accuracy of the tool center point calibration only depends on the accuracy of the camera when registering a movement between two positions.

  • 136.
    Hanning, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Video Stabilization and Rolling Shutter Correction using Inertial Measurement Sensors2011Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a rolling shutter camera. A rolling shutter camera captures video by recording every frame line-by-line from top to bottom of the image, leading to image distortions in situations where either the device or the target is moving. Recording video by hand also leads to visible frame-to-frame jitter.

    In this thesis, methods to decrease distortion caused by the motion of a video-recording device with a rolling shutter camera are presented. The methods are based on estimating the orientation of the camera from gyroscope and accelerometer measurements.

    The algorithms are implemented on the iPod Touch 4, and the resulting videos are compared to those of competing stabilization software, both commercial and free, in a series of blind experiments. The results from this user study show that the methods presented in the thesis perform on par with or better than the others.
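
    A minimal numpy sketch of the orientation-estimation core such methods rely on: integrating gyroscope angular velocities into per-sample rotation matrices via Rodrigues' formula. Accelerometer fusion, time synchronisation and the per-row rolling-shutter correction are omitted.

        import numpy as np

        def integrate_gyro(omega, dt):
            # omega: (N, 3) angular velocity samples [rad/s]; dt: sample period [s]
            R = np.eye(3)
            rotations = []
            for w in omega:
                theta = np.linalg.norm(w) * dt
                if theta > 0.0:
                    k = w / np.linalg.norm(w)
                    K = np.array([[0, -k[2], k[1]],
                                  [k[2], 0, -k[0]],
                                  [-k[1], k[0], 0]])
                    R = R @ (np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K))
                rotations.append(R.copy())
            return rotations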

  • 137.
    Haque, Muhammad Fahim Ul
    et al.
    Department of Electronic Engineering, NED University of Engineering and Technology, Karachi, Pakistan.
    Pasha, Muhammad Touqir
    Linköping University, Department of Electrical Engineering, Integrated Circuits and Systems. Linköping University, Faculty of Science & Engineering.
    Johansson, Ted
    Linköping University, Department of Electrical Engineering, Integrated Circuits and Systems. Linköping University, Faculty of Science & Engineering.
    Power-efficient aliasing-free PWM transmitter2019In: IET Circuits, Devices & Systems, ISSN 1751-858X, E-ISSN 1751-8598, Vol. 13, no 3, p. 273-278Article in journal (Refereed)
    Abstract [en]

    Linearity and efficiency are important parameters in determining the performance of any wireless transmitter. Pulse-width modulation (PWM) based transmitters offer high efficiency but suffer from low linearity due to image and aliasing distortions. Although the problem of linearity can be addressed by using an aliasing-free PWM (AF-PWM), these transmitters have a lower efficiency as they can only use linear power amplifiers (PAs). Moreover, an all-digital implementation of such transmitters is not possible. The aliasing-compensated PWM transmitter (AC-PWMT) has a higher efficiency due to the use of switch-mode PAs (SMPAs) but uses outphasing to eliminate image and aliasing distortions and requires a larger silicon area. In this study, the authors propose a novel transmitter that eliminates both aliasing and image distortions while using a single SMPA. The transmitter can be implemented using all-digital techniques and achieves a higher efficiency as compared to both AF-PWM and AC-PWM based transmitters. Measurement results show an improvement of 11.3, 7.2, and 4.3 dBc in the ACLR as compared to the carrier-based PWM transmitter (C-PWMT), AF-PWMT, and AC-PWMT, respectively. The efficiency of the proposed transmitter is similar to that of C-PWMT, which is an improvement of 5% over AF-PWMT.

  • 138.
    He, Linbo
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data: Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Semantic segmentation is a key approach to comprehensive image data analysis. It can be applied to analyze 2D images, videos, and even point clouds that contain 3D data points. On the first two problems, CNNs have achieved remarkable progress, but on point cloud segmentation, the results are less satisfactory due to challenges such as limited memory resources and difficulties in 3D point annotation. One of the research studies carried out by the Computer Vision Lab at Linköping University aimed to ease the semantic segmentation of 3D point clouds. The idea is that by first projecting 3D data points to 2D space and then focusing only on the analysis of 2D images, we can reduce the overall workload for the segmentation process as well as exploit the existing well-developed 2D semantic segmentation techniques. In order to improve the performance of CNNs for 2D semantic segmentation, the study has used input data derived from different modalities. However, how different modalities can be optimally fused is still an open question. Based on the above-mentioned study, this thesis aims to improve the multistream framework architecture. More concretely, we investigate how different singlestream architectures impact the multistream framework with a given fusion method, and how different fusion methods contribute to the overall performance of a given multistream framework. As a result, our proposed fusion architecture outperformed all the investigated traditional fusion methods. Along with the best singlestream candidate and a few additional training techniques, our final proposed multistream framework obtained a relative gain of 7.3% mIoU compared to the baseline on the Semantic3D point cloud test set, increasing the ranking from 12th to 5th position on the benchmark leaderboard.

  • 139.
    Hedborg, Johan
    Linköping University, Department of Mathematics.
    GPGPU: Bildbehandling på grafikkort2006Independent thesis Basic level (professional degree), 20 points / 30 hpStudent thesis
    Abstract [en]

    GPGPU is a collective term for research involving general computation on graphics cards. A modern graphics card typically provides more than ten times the computational power of an ordinary PC processor. This is a result of the high demands for speed and image quality in computer games.

    This thesis investigates the possibility of exploiting this computational power for image processing purposes. Three well known methods were implemented on a graphics card: FFT (Fast Fourier Transform), KLT (Kanade Lucas Tomasi point tracking) and the generation of scale pyramids. All algorithms were successfully implemented and they are three to ten times faster than corresponding optimized CPU implementations.

  • 140.
    Hedborg, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Pose Estimation and Structure Analysis of Image Sequences2009Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Autonomous navigation for ground vehicles has many challenges. Autonomous systems must be able to self-localise, avoid obstacles and determine navigable surfaces. This thesis studies several aspects of autonomous navigation with a particular emphasis on vision, motivated by it being a primary component for navigation in many high-level biological organisms. The key problem of self-localisation or pose estimation can be solved through analysis of the changes in appearance of rigid objects observed from different view points. We therefore describe a system for structure and motion estimation for real-time navigation and obstacle avoidance. With the explicit assumption of a calibrated camera, we have studied several schemes for increasing accuracy and speed of the estimation. The basis of most structure and motion pose estimation algorithms is a good point tracker. However, point tracking is computationally expensive and can occupy a large portion of the CPU resources. In this thesis we show how a point tracker can be implemented efficiently on the graphics processor, which results in faster tracking of points and the CPU being available to carry out additional processing tasks. In addition we propose a novel view interpolation approach that can be used effectively for pose estimation given previously seen views. In this way, a vehicle will be able to estimate its location by interpolating previously seen data. Navigation and obstacle avoidance may be carried out efficiently using structure and motion, but only within a limited range from the camera. In order to increase this effective range, additional information needs to be incorporated, more specifically the location of objects in the image. For this, we propose a real-time object recognition method based on P-channel matching, which may be used for improving navigation accuracy at distances where structure estimation is unreliable.

    List of papers
    1. Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization
    2007 (English)In: Journal of Real-Time Image Processing, ISSN 1861-8200, E-ISSN 1861-8219, Journal of real-time image processing, ISSN 1861-8200, Vol. 2, no 2-3, p. 103-115Article in journal (Refereed) Published
    Abstract [en]

    In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases video real-time. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted in each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between the frame-by-frame tracking and the pose recognition.
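
    For intuition, a minimal numpy sketch of channel encoding with a cos^2 basis, i.e. the soft-histogram part of the representation; full P-channels additionally carry local linear models (offsets within each channel), which are omitted here, and unit channel spacing is assumed.

        import numpy as np

        def channel_encode(values, centers):
            # values: scalars to encode; centers: evenly spaced channel centres (unit spacing)
            d = np.abs(np.asarray(values, dtype=float)[..., None] - np.asarray(centers))
            return np.where(d < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)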

    Keywords
    computer vision
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-39505 (URN)10.1007/s11554-007-0044-y (DOI)49062 (Local ID)49062 (Archive number)49062 (OAI)
    Note
    Original Publication: Michael Felsberg and Johan Hedborg, Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization, 2007, Journal of real-time image processing, (2), 2-3, 103-115. http://dx.doi.org/10.1007/s11554-007-0044-y Copyright: Springer Science Business Media. Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2017-12-13Bibliographically approved
    2. Real-Time Visual Recognition of Objects and Scenes Using P-Channel Matching
    2007 (English)In: Proceedings 15th Scandinavian Conference on Image Analysis / [ed] Bjarne K. Ersboll and Kim S. Pedersen, Berlin, Heidelberg: Springer, 2007, Vol. 4522, p. 908-917Conference paper, Published paper (Refereed)
    Abstract [en]

    In this paper we propose a new approach to real-time view-based object recognition and scene registration. Object recognition is an important sub-task in many applications, e.g. robotics, retrieval, and surveillance. Scene registration is particularly useful for identifying camera views in databases or video sequences. All of these applications require a fast recognition process and the possibility to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm extracts a number of basic, intensity invariant image features, encodes them into P-channels, and compares the query P-channels to a set of prototype P-channels in a database. The algorithm is applied in a cross-validation experiment on the COIL database, resulting in nearly ideal ROC curves. Furthermore, results from scene registration with a fish-eye camera are presented.

    Place, publisher, year, edition, pages
    Berlin, Heidelberg: Springer, 2007
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 4522
    Keywords
    Object recognition - scene registration - P-channels - real-time processing - view-based computer vision
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-21618 (URN)10.1007/978-3-540-73040-8 (DOI)978-3-540-73039-2 (ISBN)
    Conference
    15th Scandinavian Conference, SCIA 2007, June 10-24, Aalborg, Denmark
    Note

    Original Publication: Michael Felsberg and Johan Hedborg, Real-Time Visual Recognition of Objects and Scenes Using P-Channel Matching, 2007, Proc. 15th Scandinavian Conference on Image Analysis, 908-917. http://dx.doi.org/10.1007/978-3-540-73040-8 Copyright: Springer

    Available from: 2009-10-05 Created: 2009-10-05 Last updated: 2017-03-23Bibliographically approved
    3. Fast and Accurate Structure and Motion Estimation
    2009 (English)In: International Symposium on Visual Computing / [ed] George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Yoshinori Kuno, Junxian Wang, Jun-Xuan Wang, Junxian Wang, Renato Pajarola and Peter Lindstrom et al., Berlin Heidelberg: Springer-Verlag , 2009, p. 211-222Conference paper, Oral presentation only (Refereed)
    Abstract [en]

    This paper describes a system for structure-and-motion estimation for real-time navigation and obstacle avoidance. We demonstrate a technique to increase the efficiency of the 5-point solution to the relative pose problem. This is achieved by a novel sampling scheme, where we add a distance constraint on the sampled points inside the RANSAC loop, before calculating the 5-point solution. Our setup uses the KLT tracker to establish point correspondences across time in live video. We also demonstrate how an early outlier rejection in the tracker improves performance in scenes with plenty of occlusions. This outlier rejection scheme is well suited to implementation on graphics hardware. We evaluate the proposed algorithms using real camera sequences with fine-tuned bundle adjusted data as ground truth. To strengthen our results we also evaluate using sequences generated by state-of-the-art rendering software. On average we are able to reduce the number of RANSAC iterations by half and thereby double the speed.
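
    A minimal numpy sketch of the constrained sampling step described above: minimal samples for the 5-point solver are rejected whenever two image points lie too close together, which tends to give better-conditioned pose estimates. The distance threshold is a placeholder value.

        import numpy as np

        def sample_with_distance_constraint(points, n=5, min_dist=20.0, max_tries=100, rng=None):
            # points: (N, 2) image coordinates; returns indices of one accepted minimal sample
            rng = np.random.default_rng() if rng is None else rng
            for _ in range(max_tries):
                idx = rng.choice(len(points), size=n, replace=False)
                s = points[idx]
                d = np.linalg.norm(s[:, None] - s[None, :], axis=-1)
                if np.all(d[np.triu_indices(n, k=1)] >= min_dist):
                    return idx
            return idx                                      # fall back to the last draw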

    Place, publisher, year, edition, pages
    Berlin Heidelberg: Springer-Verlag, 2009
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743 ; Volume 5875
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-50624 (URN)10.1007/978-3-642-10331-5_20 (DOI)000278937300020 ()
    Conference
    5th International Symposium, ISVC 2009, November 30 - December 2, Las Vegas, NV, USA
    Projects
    DIPLECS
    Available from: 2009-10-13 Created: 2009-10-13 Last updated: 2016-05-04Bibliographically approved
    4. Real time camera ego-motion compensation and lens undistortion on GPU
    2007 (English)Manuscript (preprint) (Other academic)
    Abstract [en]

    This paper describes a GPU implementation for simultaneous camera ego-motion compensation and lens undistortion. The main idea is to transform the image under an ego-motion constraint so that tracked points in the image, that are assumed to come from the ego-motion, map as close as possible to their average position in time. The lens undistortion is computed simultaneously. We compare the performance with and without compensation using two measures: mean time difference and mean statistical background subtraction.

    Publisher
    p. 8
    Keywords
    GPU, camera ego-motion compensation, lens undistortion
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-58547 (URN)
    Available from: 2010-08-18 Created: 2010-08-13 Last updated: 2011-01-25Bibliographically approved
    5. KLT Tracking Implementation on the GPU
    2007 (English)In: Proceedings SSBA 2007 / [ed] Magnus Borga, Anders Brun and Michael Felsberg;, 2007Conference paper, Oral presentation only (Other academic)
    Abstract [en]

    The GPU is the main processing unit on a graphics card. A modern GPU typically provides more than ten times the computational power of an ordinary PC processor. This is a result of the high demands for speed and image quality in computer games. This paper investigates the possibility of exploiting this computational power for tracking points in image sequences. Tracking points is used in many computer vision tasks, such as tracking moving objects, structure from motion, face tracking etc. The algorithm was successfully implemented on the GPU and a large speed up was achieved.

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-21602 (URN)
    Conference
    SSBA, Swedish Symposium in Image Analysis 2007, 14-15 March, Linköping, Sweden
    Available from: 2009-10-05 Created: 2009-10-05 Last updated: 2016-05-04
    6. Synthetic Ground Truth for Feature Trackers
    2008 (English)In: Swedish Symposium on Image Analysis 2008, 2008Conference paper, Published paper (Other academic)
    Abstract [en]

    Good data sets for evaluation of computer vision algorithms are important for the continued progress of the field. There exist good evaluation sets for many applications, but there are others for which good evaluation sets are harder to come by. One such example is feature tracking, where there is an obvious difficulty in the collection of data. Good evaluation data is important both for comparisons of different algorithms, and to detect weaknesses in a specific method. All image data is a result of light interacting with its environment. These interactions are so well modelled in rendering software that sometimes not even the sharpest human eye can tell the difference between reality and simulation. In this paper we thus propose to use a high quality rendering system to create evaluation data for sparse point correspondence trackers.

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-58548 (URN)
    Conference
    Swedish Symposium on Image Analysis 2008, 13-14 March, Lund, Sweden
    Available from: 2010-08-18 Created: 2010-08-13 Last updated: 2015-12-10Bibliographically approved
  • 141.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Structure and Motion Estimation from Rolling Shutter Video2011In: IEEE International Conference onComputer Vision Workshops (ICCV Workshops), 2011, IEEE Xplore , 2011, p. 17-23Conference paper (Refereed)
    Abstract [en]

    The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion.
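
    Two small numpy helpers illustrating the rolling-shutter model that such methods build on: each row has its own capture time, and the camera pose is interpolated over the frame. Real implementations typically use spline or SLERP interpolation of rotations; linear interpolation of axis-angle vectors is used here only as a simplification.

        import numpy as np

        def row_times(frame_start, readout_time, height):
            # capture time of every image row, read out top to bottom
            return frame_start + (np.arange(height) / float(height)) * readout_time

        def interpolate_rotation(rvec_a, rvec_b, t):
            # crude pose interpolation between two key rotations (axis-angle vectors), t in [0, 1]
            return (1.0 - t) * np.asarray(rvec_a) + t * np.asarray(rvec_b)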

  • 142.
    Hedborg, Johan
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Robinson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Robust Three-View Triangulation Done Fast2014In: Proceedings: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014, IEEE , 2014, p. 152-157Conference paper (Refereed)
    Abstract [en]

    Estimating the position of a 3-dimensional world point given its 2-dimensional projections in a set of images is a key component in numerous computer vision systems. There are several methods dealing with this problem, ranging from sub-optimal, linear least square triangulation in two views, to finding the world point that minimizes the L2-reprojection error in three views, which leads to the statistically optimal estimate under the assumption of Gaussian noise. In this paper we present a solution to the optimal triangulation in three views. The standard approach for solving the three-view triangulation problem is to find a closed-form solution. In contrast to this, we propose a new method based on an iterative scheme. The method is rigorously tested on both synthetic and real image data with corresponding ground truth, on a midrange desktop PC and a Raspberry Pi, a low-end mobile platform. We are able to improve the precision achieved by the closed-form solvers and reach a speed-up of two orders of magnitude compared to the current state-of-the-art solver. In numbers, this amounts to around 300K triangulations per second on the PC and 30K triangulations per second on the Raspberry Pi.
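
    A generic numpy sketch of iterative three-view triangulation: a linear DLT initialisation followed by Gauss-Newton refinement of the summed squared reprojection error. This is the standard pattern such solvers build on, not the specific solver of the paper; a numerical Jacobian is used for brevity.

        import numpy as np

        def triangulate_dlt(Ps, xs):
            # Ps: three 3x4 camera matrices; xs: three (u, v) image points
            A = []
            for P, (u, v) in zip(Ps, xs):
                A.append(u * P[2] - P[0])
                A.append(v * P[2] - P[1])
            _, _, Vt = np.linalg.svd(np.asarray(A))
            X = Vt[-1]
            return X[:3] / X[3]

        def refine_gauss_newton(Ps, xs, X, iters=10):
            # minimise the summed squared reprojection error over the three views
            def residuals(X):
                r = []
                for P, (u, v) in zip(Ps, xs):
                    p = P @ np.append(X, 1.0)
                    r += [p[0] / p[2] - u, p[1] / p[2] - v]
                return np.asarray(r)
            for _ in range(iters):
                r = residuals(X)
                J = np.zeros((len(r), 3))
                for k in range(3):
                    dX = np.zeros(3)
                    dX[k] = 1e-6
                    J[:, k] = (residuals(X + dX) - r) / 1e-6
                X = X - np.linalg.solve(J.T @ J, J.T @ r)
            return X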

  • 143.
    Hedlund, Gunnar
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Närmaskbestämning från stereoseende2005Independent thesis Basic level (professional degree), 20 credits / 30 HE creditsStudent thesis
    Abstract [sv]

    This thesis investigates distance estimation using image processing and stereo vision for a known camera setup.

    Today there exists a large number of computational methods for obtaining the distance to objects, but the performance of these methods has barely been measured. This work mainly examines different block-based methods for distance estimation and looks at the possibilities and limitations when using established knowledge in image processing and stereo vision for distance estimation. The work was carried out at Bofors Defence AB in Karlskoga, Sweden, with the aim of eventual use in an optical sensor system. The work investigates proven

    The results indicate that it is difficult to determine a proximity mask, i.e. the distance to all visible objects, but the tested methods should still be usable point-wise to compute distances. The best method is based on computing the minimum absolute error and keeping only the most reliable values.

  • 144.
    Heinemann, Christian
    et al.
    Forschungszentrum Jülich, Germany.
    Åström, Freddie
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Baravdish, George
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Krajsek, Kai
    Forschungszentrum Jülich, Germany.
    Felsberg, Michael
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Scharr, Hanno
    Forschungszentrum Jülich, Germany.
    Using Channel Representations in Regularization Terms: A Case Study on Image Diffusion2014In: Proceedings of the 9th International Conference on Computer Vision Theory and Applications, SciTePress, 2014, Vol. 1, p. 48-55Conference paper (Refereed)
    Abstract [en]

    In this work we propose a novel non-linear diffusion filtering approach for images based on their channel representation. To derive the diffusion update scheme we formulate a novel energy functional using a soft-histogram representation of image pixel neighborhoods obtained from the channel encoding. The resulting Euler-Lagrange equation yields a non-linear robust diffusion scheme with additional weighting terms stemming from the channel representation which steer the diffusion process. We apply this novel energy formulation to image reconstruction problems, showing good performance in the presence of mixtures of Gaussian and impulse-like noise, e.g. missing data. In denoising experiments of common scalar-valued images our approach performs competitive compared to other diffusion schemes as well as state-of-the-art denoising methods for the considered noise types.

  • 145.
    Heintz, Fredrik
    et al.
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Löfgren, Fredrik
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Linköping Humanoids: Application RoboCup 2016 Standard Platform League2016Conference paper (Other academic)
    Abstract [en]

    This is the application for the RoboCup 2016 Standard Platform League from the Linköping Humanoids team.

    Linköping Humanoids participated in RoboCup 2015. We didn't do very well, but we learned a lot. When we arrived, nothing worked. However, we fixed more and more of the open issues and managed to play a draw in our final game. We also participated in some of the technical challenges and scored some points. At the end of the competition we had a working team. This was both frustrating and rewarding. Analyzing the competition, we have identified both what we did well and the main issues that we need to fix. One important lesson is that it takes time to develop a competitive RoboCup SPL team. We are dedicated to improving our performance over time in order to be competitive in 2017.

  • 146.
    Heintz, Fredrik
    et al.
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Löfgren, Fredrik
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Linköping Humanoids: Application RoboCup 2017 Standard Platform League (2017). Conference paper (Other academic)
    Abstract [en]

    This is the application for the RoboCup 2017 Standard Platform League from the Linköping Humanoids team.

    Linköping Humanoids participated in both RoboCup 2015 and 2016 with the intention of incrementally developing a good team by learning as much as possible. We improved significantly from 2015 to 2016, even though we still did not perform very well. Our main challenge is that we are building our software from the ground up using the Robot Operating System (ROS) as the integration and development infrastructure. When the system became overloaded, the ROS infrastructure became very unpredictable. This made it very hard to debug during the contest, so we basically had to remove things until the load stayed consistently low. Our top priority has since been to make the system stable and more resource efficient. This will take us to the next level.

    From the start we have been clear that our goal is to have a competitive team by 2017. Since we are developing our own software from scratch, we are well aware that we need time to build up the competence and the software infrastructure. We believe we are making good progress towards this goal. The team of about 10 students has been working very actively during the fall, with weekly workshops and bi-weekly one-day hackathons.

  • 147.
    Hellsten, Jonas
    Linköping University, Department of Science and Technology.
    Evaluation of tone mapping operators for use in real time environments (2007). Independent thesis Advanced level (degree of Magister), 20 points / 30 hp. Student thesis
    Abstract [en]

    As real-time visualizations become more realistic, it also becomes more important to simulate the perceptual effects of the human visual system. Such effects include the response to varying illumination, glare and the differences between photopic and scotopic vision. This thesis evaluates several tone mapping methods that allow a greater dynamic range to be used in real-time visualizations. Several tone mapping methods have been implemented in the Avalanche Game Engine and evaluated using a small test group. To increase immersion in the visualization, several filters aimed at simulating perceptual effects have also been implemented. The primary goal of these filters is to simulate scotopic vision. The tests showed that two tone mapping methods would be suitable for the environment used in the tests: the S-curve method gave the best results, while the Mean Value method also gave good results and was the simplest and cheapest to implement. The test subjects agreed that the simulation of scotopic vision enhanced immersion in a visualization. The primary difficulties in this work have been the lack of dynamic range in the input images and the challenges of coding real-time graphics on a graphics processing unit.
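    A minimal sketch of a global S-curve style operator in the spirit of the methods compared above, assuming a Reinhard-like key-value scaling by the log-average luminance; this illustrates the general technique, not the thesis's exact formulation.

    ```python
    import numpy as np

    def s_curve_tonemap(luminance, key=0.18):
        """Global tone mapping: scale HDR luminance by the image's
        log-average (a robust 'mean value' of scene brightness), then
        compress with an S-shaped L/(1+L) curve that rolls off highlights."""
        log_avg = np.exp(np.mean(np.log(1e-6 + luminance)))
        scaled = key * luminance / log_avg
        return scaled / (1.0 + scaled)   # output in [0, 1), ready for display
    ```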

  • 148.
    Hemstrom, Jennifer
    et al.
    Linköping University, Faculty of Medicine and Health Sciences. Univ British Columbia, Canada.
    Albonico, Andrea
    Univ British Columbia, Canada.
    Djouab, Sarra
    Univ British Columbia, Canada; Univ Auvergne, France.
    Barton, Jason J. S.
    Univ British Columbia, Canada.
    Visual search for complex objects: Set-size effects for faces, words and cars (2019). In: Vision Research, ISSN 0042-6989, E-ISSN 1878-5646, Vol. 162. Article in journal (Refereed)
    Abstract [en]

    To compare visual processing for different object types, we developed visual search tests that generated accuracy and response time parameters, including an object set-size effect that indexes perceptual processing load. Our goal was to compare visual search for two expert object types, faces and visual words, as well as a less expert type, cars. We first asked if faces and words showed greater inversion effects in search. Second, we determined whether search with upright stimuli correlated with other perceptual indices. Last we assessed for correlations between tests within a single orientation, and between orientations for a single object type. Object set-size effects were smaller for faces and words than cars. All accuracy and temporal measures showed an inversion effect for faces and words, but not cars. Face-search accuracy measures correlated with accuracy on the Cambridge Face Memory Test and word-search temporal measures correlated with single-word reading times, but car search did not correlate with semantic car knowledge. There were cross-orientation correlations for all object types, as well as cross-object correlations in the inverted orientation, while in the upright orientation face search did not correlate with word or car search. We conclude that object search shows effects of expertise. Compared to cars, words and faces showed smaller object set-size effects, greater inversion effects, and their search results correlated with other indices of perceptual expertise. The correlation analyses provide preliminary evidence supporting contributions from common processes in the case of inverted stimuli, object-specific processes that operate in both orientations, and distinct processing for upright faces.
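    The object set-size effect mentioned above is typically indexed as the slope of response time against the number of items in the display; a small worked example with made-up numbers (not data from the study):

    ```python
    import numpy as np

    # Hypothetical mean response times (ms) at four display set sizes.
    set_sizes = np.array([4, 8, 12, 16])
    mean_rt_ms = np.array([620.0, 700.0, 775.0, 860.0])

    # Set-size effect = slope of the RT-vs-set-size regression line.
    slope, intercept = np.polyfit(set_sizes, mean_rt_ms, 1)
    print(f"set-size effect: {slope:.1f} ms per item")   # roughly 20 ms/item
    ```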

  • 149.
    Henriksson, Markus
    et al.
    FOI.
    Olofsson, Tomas
    FOI.
    Grönwall, Christina
    FOI.
    Brännlund, Carl
    FOI.
    Sjöqvist, Lars
    FOI.
    Optical reflectance tomography using TCSPC laser radar (2012). In: Proc. SPIE, 2012, Vol. 8542. Conference paper (Refereed)
    Abstract [en]

    Tomographic signal processing is used to transform multiple one-dimensional range profiles of a target, recorded from different angles, into a two-dimensional image of the object. The range profiles are measured by a time-correlated single-photon counting (TCSPC) laser radar system with approximately 50 ps range resolution and a field of view that is wide compared to the measured objects. Measurements were performed in a lab environment with the targets mounted on a rotation stage. We show successful reconstruction of 2D projections along the rotation axis of a boat model and removal of artefacts using a mask based on the convex hull. The independence of the spatial resolution and the high sensitivity make this, at first glance, an interesting technology for very long range identification of passing objects such as high-altitude UAVs and orbiting satellites, but also for the opposite problem of ship identification from high-altitude platforms. To obtain an image with useful information, measurements from a large angular sector around the object are needed, which is hard to achieve in practice. Examples of reconstructions using 90° and 150° sectors are given. In addition, the final image is a projection along the rotation axis of the measurement, and if this axis is not aligned with a major axis of the target, the image information is limited. There are also practical problems to solve, for example that the distance from the sensor to the rotation centre needs to be known with an accuracy corresponding to the measurement resolution. The conclusion is that laser radar tomography is useful only when the sensor is fixed and the target rotates around its own axis. © (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
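    A minimal sketch of the tomographic step described above, using plain (unfiltered) backprojection of the 1D range profiles onto a 2D grid under an assumed geometry; the TCSPC-specific processing and the convex-hull artefact masking from the paper are not reproduced. `profiles` is assumed to be a 2D array with one range histogram per measurement angle.

    ```python
    import numpy as np

    def backproject_range_profiles(profiles, angles_deg, grid_size=256, extent_m=2.0):
        """Unfiltered backprojection: smear each 1D range profile back across
        the image plane along its viewing direction and accumulate.
        profiles[i] is the histogram measured at angles_deg[i]; range bin 0
        is assumed to lie extent_m/2 in front of the rotation centre."""
        n_bins = profiles.shape[1]
        xs = np.linspace(-extent_m / 2, extent_m / 2, grid_size)
        X, Y = np.meshgrid(xs, xs)
        image = np.zeros((grid_size, grid_size))
        for prof, ang in zip(profiles, np.deg2rad(angles_deg)):
            # signed range of every grid point along this viewing direction
            r = X * np.cos(ang) + Y * np.sin(ang)
            bins = np.clip(((r + extent_m / 2) / extent_m * (n_bins - 1)).astype(int),
                           0, n_bins - 1)
            image += prof[bins]
        return image
    ```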

  • 150.
    Henrysson, Anders
    Linköping University, Department of Science and Technology, Visual Information Technology and Applications (VITA). Linköping University, The Institute of Technology.
    Bringing Augmented Reality to Mobile Phones (2007). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    With its mixing of real and virtual, Augmented Reality (AR) is a technology that has attracted a great deal of attention from the science community and is seen as a perfect way to visualize context-related information. Computer-generated graphics are presented to the user overlaid on, and registered with, the real world, hence augmenting it. Promising intelligence amplification and higher productivity, AR has been intensively researched over several decades but has yet to reach a broad audience.

    This thesis presents efforts in bringing Augmented Reality to mobile phones and thus to the general public. Implementing such technologies on limited devices like mobile phones poses a number of challenges that differ from traditional research directions, including limited computational resources with little or no possibility to upgrade or add hardware, and limited input and output capabilities for interactive 3D graphics. The research presented in this thesis addresses these challenges and makes contributions in the following areas:

    Mobile Phone Computer Vision-Based Tracking

    The first contribution of this thesis has been to migrate computer vision algorithms for tracking the mobile phone camera in a real-world reference frame - a key enabling technology for AR. To tackle performance issues, low-level optimized code using fixed-point algorithms has been developed.
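    To illustrate the fixed-point approach mentioned above, here is a minimal Python sketch of Q16.16 arithmetic of the kind such low-level code typically relies on; the Q-format, helper names and the small filter example are illustrative assumptions, not the thesis's actual routines.

    ```python
    Q = 16  # fractional bits in Q16.16 fixed point

    def to_fixed(x: float) -> int:
        """Convert a float to Q16.16 (integer with 16 fractional bits)."""
        return int(round(x * (1 << Q)))

    def fixed_mul(a: int, b: int) -> int:
        """Multiply two Q16.16 numbers; shift back by Q to keep the format."""
        return (a * b) >> Q

    def to_float(x: int) -> float:
        return x / (1 << Q)

    # Example: a small convolution step done entirely in integer arithmetic,
    # since a floating-point unit may be slow or absent on the target phone.
    kernel = [to_fixed(v) for v in (0.25, 0.5, 0.25)]
    pixels = [to_fixed(v) for v in (10.0, 12.0, 9.0)]
    acc = sum(fixed_mul(k, p) for k, p in zip(kernel, pixels))
    print(to_float(acc))  # 10.75
    ```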

    Mobile Phone 3D Interaction Techniques

    Another contribution of this thesis has been to research interaction techniques for manipulating virtual content. This is in part realized by exploiting camera tracking for position-controlled interaction where motion of the device is used as input. Gesture input, made possible by a separate front camera, is another approach that is investigated. The obtained results are not unique to AR and could also be applicable to general mobile 3D graphics.

    Novel Single User AR Applications

    With short-range communication technologies, mobile phones can exchange data not only with other phones but also with an intelligent environment. Data can be obtained for tracking or visualization; displays can be used to render graphics, with the tracked mobile phone acting as an interaction device. Work is presented where a mobile phone harvests data from a sensor network and uses AR to visualize the live data in context.

    Novel Collaboration AR Applications

    One of the most promising areas for mobile phone based AR is enhancing face-to-face computer supported cooperative work. This is because the AR display permits non-verbal cues to be used to a larger extent. In this thesis, face-to-face collaboration has been researched to examine whether AR increases awareness of collaboration partners even on small devices such as mobile phones. User feedback indicates that this is the case, confirming the hypothesis that mobile phones are increasingly able to deliver an AR experience to a large audience.

    List of papers
    1. Face to Face Collaborative AR on Mobile Phones
    2005 (English). In: Proceedings of the Fourth IEEE and ACM international Symposium on Mixed and Augmented Reality, 2005, p. 80-89. Conference paper, Published paper (Other academic)
    Abstract [en]

    Mobile phones are an ideal platform for augmented reality. In this paper we describe how they also can be used to support face to face collaborative AR applications. We have created a custom port of the ARToolKit library to the Symbian mobile phone operating system and then developed a sample collaborative AR game based on this. We describe the game in detail and user feedback from people who have played it. We also provide general design guidelines that could be useful for others who are developing mobile phone collaborative AR applications.

    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12743 (URN)10.1109/ISMAR.2005.32 (DOI)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2011-01-04
    2. Virtual Object Manipulation using a Mobile Phone
    2005 (English). In: Proceedings of the 2005 international Conference on Augmented Tele-Existence, 2005, p. 164-171. Conference paper, Published paper (Other academic)
    Abstract [en]

    Augmented Reality (AR) on mobile phones has reached a level of maturity where it can be used as a tool for 3D object manipulation. In this paper we look at user interface issues where an AR-enabled mobile phone acts as an interaction device. We discuss how traditional 3D manipulation techniques apply to this new platform. The high tangibility of the device and its button interface make it interesting to compare manipulation techniques. We describe AR manipulation techniques we have implemented on a mobile phone and present a small pilot study evaluating these methods.

    Keywords
    augmented reality, manipulation, mobile phone
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12744 (URN)10.1145/1152399.1152430 (DOI)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2011-01-04
    3. Experiments in 3D Interaction for Mobile Phone AR
    2007 (English). In: Proceedings of the 5th international conference on Computer graphics and interactive techniques in Australia and Southeast Asia, Perth, Australia. New York: The Association for Computing Machinery, Inc., 2007, p. 187-194. Chapter in book (Other academic)
    Abstract [en]

    In this paper we present an evaluation of several different techniques for virtual object positioning and rotation on a mobile phone. We compare gesture input captured by the phone's front camera, to tangible input, keypad interaction and phone tilting in increasingly complex positioning and rotation tasks in an AR context. Usability experiments found that tangible input techniques are best for translation tasks, while keypad input is best for rotation tasks. Implications for the design of mobile phone 3D interfaces are presented as well as directions for future research.

    Place, publisher, year, edition, pages
    New York: The Association for Computing Machinery, Inc., 2007
    Keywords
    3D interaction, augmented reality, mobile graphics
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-12745 (URN)10.1145/1321261.1321295 (DOI)978-1-59593-912-8 (ISBN)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2018-01-13Bibliographically approved
    4. Mobile Phone Based AR Scene Assembly
    2005 (English). In: Proceedings of the 4th international Conference on Mobile and Ubiquitous Multimedia, 2005, p. 95-102. Conference paper, Published paper (Other academic)
    Abstract [en]

    In this paper we describe a mobile phone based Augmented Reality application for 3D scene assembly. Augmented Reality on mobile phones extends the interaction capabilities of such handheld devices. It adds a 6 DOF isomorphic interaction technique for manipulating 3D content. We give details of an application that we believe to be the first where 3D content can be manipulated using both the movement of a camera-tracked mobile phone and a traditional button interface as input for transformations. By centering the scene in a tangible marker space in front of the phone, we provide a means for bimanual interaction. We describe the implementation, the interaction techniques we have developed and the initial user response to trying the application.

    Keywords
    CAD, augmented reality, mobile phone
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12746 (URN)10.1145/1149488.1149504 (DOI)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2011-01-04
    5. Using a Mobile Phone for 6DOF Mesh Editing
    2007 (English). In: Proceedings of the 7th ACM SIGCHI New Zealand Chapter's international Conference on Computer-Human interaction: Design Centered HCI, 2007, p. 9-16. Chapter in book (Other academic)
    Abstract [en]

    This paper describes how a mobile phone can be used as a six-degree-of-freedom interaction device for 3D mesh editing. Using a video see-through Augmented Reality approach, the mobile phone meets several design guidelines for a natural, easy to learn, 3D human-computer interaction device. We have developed a system that allows a user to select one or more vertices in an arbitrarily sized polygon mesh and freely translate and rotate them by translating and rotating the device itself. The mesh is registered in 3D and viewed through the device, and hence the system provides a unified perception-action space. We present the implementation details and discuss the possible advantages and disadvantages of this approach.
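    A minimal sketch of the editing operation described above: applying a 6-DOF rigid transform, e.g. derived from the change in the tracked phone pose, to a selected subset of mesh vertices. Rotating about the selection centroid and all function and parameter names are illustrative assumptions, not the paper's implementation.

    ```python
    import numpy as np

    def transform_selection(vertices, selected, R, t):
        """Translate and rotate the selected vertices with the rigid motion
        (R, t) of the tracked device, pivoting about the selection centroid.
        vertices: (N, 3) array, selected: index array, R: (3, 3), t: (3,)."""
        v = vertices.copy()
        sel = v[selected]
        centroid = sel.mean(axis=0)
        v[selected] = (sel - centroid) @ R.T + centroid + t
        return v
    ```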

    Keywords
    3D interfaces, content creation, mobile computer graphics, mobile phone augmented reality
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-12747 (URN)10.1145/1278960.1278962 (DOI)1-59593-473-1 (ISBN)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2018-01-13Bibliographically approved
    6. Interactive Collaborative Scene Assembly Using AR on Mobile Phones
    2006 (English). In: Artificial Reality and Telexistence, ICAT, Springer, 2006, p. 1008-1017. Conference paper, Published paper (Refereed)
    Abstract [en]

    In this paper we present and evaluate a platform for interactive collaborative face-to-face Augmented Reality using a distributed scene graph on mobile phones. The results of individual actions are viewed on the screen in real-time on every connected phone. We show how multiple collaborators can use consumer mobile camera phones to furnish a room together in an Augmented Reality environment. We have also presented a user case study to investigate how untrained users adopt this novel technology and to study the collaboration between multiple users. The platform is totally independent of a PC server though it is possible to connect a PC client to be used for high quality visualization on a big screen device such as a projector or a plasma display.

    Place, publisher, year, edition, pages
    Springer, 2006
    Series
    Lecture Notes in Computer Science, ISSN 1611-3349 ; 4282
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12748 (URN)10.1007/11941354_104 (DOI)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2009-04-22
    7. A Novel Interface to Sensor Networks using Handheld Augmented Reality
    2006 (English). In: Proceedings of the 8th Conference on Human-Computer interaction with Mobile Devices and Services, Espoo, Finland, 2006, p. 145-148. Conference paper, Published paper (Other academic)
    Abstract [en]

    Augmented Reality technology enables a mobile phone to be used as an x-ray tool, visualizing structures and states not visible to the naked eye. In this paper we evaluate a set of techniques used to augment the world with a visualization of data from a sensor network. Combining virtual and real information introduces challenges, as information from the two domains might interfere. We have applied our system to humidity data and present a user study together with feedback from domain experts. The prototype system can be seen as the first step towards a novel tool for inspection of building elements.

    Keywords
    Algorithms, Design, Human Factors, Measurement, intelligent environments, mobile phone augmented reality, sensor networks, visualization
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-12749 (URN)10.1145/1152215.1152245 (DOI)
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2015-09-22
    8. LUMAR: A Hybrid Spatial Display System for 2D and 3D Handheld Augmented Reality
    2007 (English). In: 17th International Conference on Artificial Reality and Telexistence (ICAT 2007), Esbjerg, Denmark, 2007. Los Alamitos, CA, USA: IEEE Computer Society Press, 2007, p. 63-70. Conference paper, Published paper (Other academic)
    Abstract [en]

    LUMAR is a hybrid system for spatial displays, allowing cell phones to be tracked in 2D and 3D through combined egocentric and exocentric techniques based on the Light-Sense and UMAR frameworks. LUMAR differs from most other spatial display systems based on mobile phones with its three-layered information space. The hybrid spatial display system consists of printed matter that is augmented with context-sensitive, dynamic 2D media when the device is on the surface, and with overlaid 3D visualizations when it is held in mid-air.

    Place, publisher, year, edition, pages
    Los Alamitos, CA, USA: IEEE Computer Society Press, 2007
    Keywords
    spatially aware, portable, mobile, handheld, cell, phone, augmented reality, mixed reality, ubiquitous
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-12750 (URN)10.1109/ICAT.2007.13 (DOI)
    Conference
    17th International Conference on Artificial Reality and Telexistence (ICAT 2007), Esbjerg, Denmark, 2007
    Available from: 2007-11-20 Created: 2007-11-20 Last updated: 2018-03-05