Pose Estimation and Structure Analysis of Image Sequences
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
2009 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomous navigation for ground vehicles has many challenges. Autonomous systems must be able to self-localise, avoid obstacles and determine navigable surfaces. This thesis studies several aspects of autonomous navigation with a particular emphasis on vision, motivated by vision being a primary component for navigation in many high-level biological organisms. The key problem of self-localisation, or pose estimation, can be solved through analysis of the changes in appearance of rigid objects observed from different view points. We therefore describe a system for structure and motion estimation for real-time navigation and obstacle avoidance. With the explicit assumption of a calibrated camera, we have studied several schemes for increasing the accuracy and speed of the estimation.

The basis of most structure and motion pose estimation algorithms is a good point tracker. However, point tracking is computationally expensive and can occupy a large portion of the CPU resources. In this thesis we show how a point tracker can be implemented efficiently on the graphics processor, which results in faster tracking of points and leaves the CPU available to carry out additional processing tasks.

In addition, we propose a novel view interpolation approach that can be used effectively for pose estimation given previously seen views. In this way, a vehicle is able to estimate its location by interpolating previously seen data.

Navigation and obstacle avoidance may be carried out efficiently using structure and motion, but only within a limited range from the camera. In order to increase this effective range, additional information needs to be incorporated, more specifically the location of objects in the image. For this, we propose a real-time object recognition method based on P-channel matching, which may be used to improve navigation accuracy at distances where structure estimation is unreliable.
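The view-interpolation idea above — estimating a pose between previously seen views — can be illustrated generically as blending stored camera poses: linear interpolation for translation and spherical linear interpolation (slerp) for rotation. This is a minimal sketch of pose blending, not the thesis's actual method (which operates on P-channel view representations); all names are illustrative.

```python
import numpy as np

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z)."""
    dot = float(np.dot(q0, q1))
    if dot < 0.0:                # take the shorter arc
        q1, dot = -q1, -dot
    theta = np.arccos(min(dot, 1.0))
    if theta < 1e-8:             # nearly identical rotations
        return q0
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

def interpolate_pose(t0, q0, t1, q1, alpha):
    """Blend two camera poses: linear in translation, slerp in rotation."""
    return (1 - alpha) * t0 + alpha * t1, slerp(q0, q1, alpha)

# Halfway between the origin pose and one translated 2 units along x
# and rotated 90 degrees about z.
q_id = np.array([1.0, 0.0, 0.0, 0.0])
q_z90 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
t_mid, q_mid = interpolate_pose(np.zeros(3), q_id,
                                np.array([2.0, 0.0, 0.0]), q_z90, 0.5)
# t_mid = [1, 0, 0]; q_mid is a 45-degree rotation about z.
```

Slerp keeps the interpolated rotation on the unit-quaternion sphere, which is why it is the standard choice over naive component-wise averaging.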

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2009. 28 p.
Series
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 1418
Keyword [en]
KLT, GPU, structure from motion, stereo, pose estimation
National Category
Engineering and Technology
Computer Vision and Robotics (Autonomous Systems)
Signal Processing
Identifiers
URN: urn:nbn:se:liu:diva-58706
Local ID: LiU-TEK-LIC-2009:26
ISBN: 978-91-7393-516-6 (print)
OAI: oai:DiVA.org:liu-58706
DiVA: diva2:345040
Projects
DIPLECS
Available from: 2011-01-25 Created: 2010-08-23 Last updated: 2016-05-04 Bibliographically approved
List of papers
1. Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization
2007 (English) In: Journal of Real-Time Image Processing, ISSN 1861-8200, E-ISSN 1861-8219, Vol. 2, no 2-3, 103-115 p. Article in journal (Refereed) Published
Abstract [en]

In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases video real-time. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted in each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between frame-by-frame tracking and pose recognition.
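As a rough illustration of steps (2) and (3) — channel encoding and least-squares prototype matching — the sketch below uses a simplified soft histogram in place of real P-channels (which additionally store local linear models per channel); function names and parameters are illustrative, not from the paper.

```python
import numpy as np

def encode_channels(features, n_channels=8):
    """Encode scalar features in [0, 1) into a soft channel histogram.

    Simplified stand-in for P-channel encoding: each feature votes into
    nearby channels with triangular weights, keeping the histogram-like
    robustness to clutter mentioned in the abstract.
    """
    features = np.asarray(features, dtype=float)
    centers = (np.arange(n_channels) + 0.5) / n_channels
    width = 1.0 / n_channels
    w = np.maximum(0.0, 1.0 - np.abs(features[:, None] - centers[None, :]) / width)
    return w.sum(axis=0) / max(len(features), 1)

def nearest_prototype(query, prototypes):
    """Step (3): least-squares comparison against database prototypes."""
    dists = [float(np.sum((query - p) ** 2)) for p in prototypes]
    return int(np.argmin(dists))

# Toy database of two "views" with different feature statistics.
rng = np.random.default_rng(0)
proto_a = encode_channels(rng.uniform(0.0, 0.4, 200))  # low-valued features
proto_b = encode_channels(rng.uniform(0.6, 1.0, 200))  # high-valued features
query = encode_channels(rng.uniform(0.0, 0.4, 50))     # resembles view A
best = nearest_prototype(query, [proto_a, proto_b])    # 0, i.e. view A
```

Extending the database online then amounts to appending a new prototype channel vector, which is why the approach supports fast incremental updates.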

Keyword
computer vision
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-39505 (URN) 10.1007/s11554-007-0044-y (DOI) 49062 (Local ID) 49062 (Archive number) 49062 (OAI)
Note
Original Publication: Michael Felsberg and Johan Hedborg, Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization, 2007, Journal of Real-Time Image Processing, (2), 2-3, 103-115. http://dx.doi.org/10.1007/s11554-007-0044-y Copyright: Springer Science+Business Media
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2017-12-13 Bibliographically approved
2. Real-Time Visual Recognition of Objects and Scenes Using P-Channel Matching
2007 (English) In: Proceedings 15th Scandinavian Conference on Image Analysis / [ed] Bjarne K. Ersboll and Kim S. Pedersen, Berlin, Heidelberg: Springer, 2007, Vol. 4522, 908-917 p. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we propose a new approach to real-time view-based object recognition and scene registration. Object recognition is an important sub-task in many applications, e.g., robotics, retrieval, and surveillance. Scene registration is particularly useful for identifying camera views in databases or video sequences. All of these applications require a fast recognition process and the possibility to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm extracts a number of basic, intensity invariant image features, encodes them into P-channels, and compares the query P-channels to a set of prototype P-channels in a database. The algorithm is applied in a cross-validation experiment on the COIL database, resulting in nearly ideal ROC curves. Furthermore, results from scene registration with a fish-eye camera are presented.

Place, publisher, year, edition, pages
Berlin, Heidelberg: Springer, 2007
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 4522
Keyword
Object recognition - scene registration - P-channels - real-time processing - view-based computer vision
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-21618 (URN) 10.1007/978-3-540-73040-8 (DOI) 978-3-540-73039-2 (ISBN)
Conference
15th Scandinavian Conference, SCIA 2007, June 10-14, Aalborg, Denmark
Note

Original Publication: Michael Felsberg and Johan Hedborg, Real-Time Visual Recognition of Objects and Scenes Using P-Channel Matching, 2007, Proc. 15th Scandinavian Conference on Image Analysis, 908-917. http://dx.doi.org/10.1007/978-3-540-73040-8 Copyright: Springer

Available from: 2009-10-05 Created: 2009-10-05 Last updated: 2017-03-23 Bibliographically approved
3. Fast and Accurate Structure and Motion Estimation
2009 (English) In: International Symposium on Visual Computing / [ed] George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Yoshinori Kuno, Junxian Wang, Renato Pajarola and Peter Lindstrom et al., Berlin Heidelberg: Springer-Verlag, 2009, 211-222 p. Conference paper, Oral presentation only (Refereed)
Abstract [en]

This paper describes a system for structure-and-motion estimation for real-time navigation and obstacle avoidance. We demonstrate a technique to increase the efficiency of the 5-point solution to the relative pose problem. This is achieved by a novel sampling scheme, where we add a distance constraint on the sampled points inside the RANSAC loop, before calculating the 5-point solution. Our setup uses the KLT tracker to establish point correspondences across time in live video. We also demonstrate how an early outlier rejection in the tracker improves performance in scenes with plenty of occlusions. This outlier rejection scheme is well suited for implementation on graphics hardware. We evaluate the proposed algorithms using real camera sequences with fine-tuned bundle-adjusted data as ground truth. To strengthen our results we also evaluate using sequences generated by state-of-the-art rendering software. On average we are able to reduce the number of RANSAC iterations by half and thereby double the speed.
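The distance-constrained sampling scheme can be illustrated on a toy problem. The sketch below applies it to 2-point line fitting rather than the paper's 5-point relative-pose solver; the sampler is the transferable part, and all names and thresholds are illustrative.

```python
import numpy as np

def sample_min_distance(rng, points, k, min_dist, max_tries=100):
    """Draw k indices whose pairwise distances all exceed min_dist.

    The distance constraint inside the RANSAC loop rejects clustered,
    badly conditioned minimal samples before the solver is run.
    """
    for _ in range(max_tries):
        idx = rng.choice(len(points), size=k, replace=False)
        d = np.linalg.norm(points[idx][:, None] - points[idx][None, :], axis=-1)
        if np.all(d[np.triu_indices(k, 1)] > min_dist):
            return idx
    return idx  # give up and use the last sample

def ransac_line(points, n_iter=200, min_dist=1.0, tol=0.1, seed=0):
    """Toy RANSAC line fit using the distance-constrained sampler."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, 0
    for _ in range(n_iter):
        i, j = sample_min_distance(rng, points, 2, min_dist)
        p, q = points[i], points[j]
        t = (q - p) / np.linalg.norm(q - p)
        normal = np.array([-t[1], t[0]])            # unit normal of the line
        residuals = np.abs((points - p) @ normal)   # point-to-line distances
        n_in = int(np.sum(residuals < tol))
        if n_in > best_inliers:
            best_model, best_inliers = (p, normal), n_in
    return best_model, best_inliers

# 50 points on y = 0.5x + 1 plus two gross outliers.
xs = np.linspace(0.0, 10.0, 50)
pts = np.vstack([np.stack([xs, 0.5 * xs + 1.0], axis=1),
                 [[2.0, 9.0], [7.0, -3.0]]])
model, n_inliers = ransac_line(pts)   # the 50 line points come out as inliers
```

Well-separated samples give better-conditioned models, so good hypotheses are found in fewer iterations, which is the mechanism behind the reported halving of RANSAC iterations.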

Place, publisher, year, edition, pages
Berlin Heidelberg: Springer-Verlag, 2009
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; Volume 5875
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-50624 (URN) 10.1007/978-3-642-10331-5_20 (DOI) 000278937300020 ()
Conference
5th International Symposium, ISVC 2009, November 30 - December 2, Las Vegas, NV, USA
Projects
DIPLECS
Available from: 2009-10-13 Created: 2009-10-13 Last updated: 2016-05-04 Bibliographically approved
4. Real time camera ego-motion compensation and lens undistortion on GPU
2007 (English) Manuscript (preprint) (Other academic)
Abstract [en]

This paper describes a GPU implementation for simultaneous camera ego-motion compensation and lens undistortion. The main idea is to transform the image under an ego-motion constraint so that tracked points in the image, which are assumed to come from the ego-motion, map as close as possible to their average position in time. The lens undistortion is computed simultaneously. We compare the performance with and without compensation using two measures: mean time difference and mean statistical background subtraction.
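The core least-squares step — mapping tracked points onto their average positions in time — can be sketched with a global 2-D affine motion model. This is a simplified illustration; the paper's model additionally folds in the lens undistortion parameters, and all names here are illustrative.

```python
import numpy as np

def stabilizing_affine(points, mean_points):
    """Least-squares affine transform mapping currently tracked points
    onto their average positions in time (global 2-D affine model)."""
    n = len(points)
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = points   # rows for x' = a*x + b*y + c
    A[0::2, 2] = 1.0
    A[1::2, 3:5] = points   # rows for y' = d*x + e*y + f
    A[1::2, 5] = 1.0
    params, *_ = np.linalg.lstsq(A, mean_points.reshape(-1), rcond=None)
    return params.reshape(2, 3)  # 2x3 affine matrix [a b c; d e f]

# Tracked corner points displaced by a global camera shake of (3, -2).
pts_mean = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
pts_curr = pts_mean + np.array([3.0, -2.0])
M = stabilizing_affine(pts_curr, pts_mean)
stabilized = pts_curr @ M[:, :2].T + M[:, 2]  # maps back onto pts_mean
```

On the GPU, the fitted transform would then be applied as a single per-pixel warp of the whole frame, which is where the speedup comes from.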

Publisher
8 p.
Keyword
GPU, camera ego-motion compensation, lens undistortion
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-58547 (URN)
Available from: 2010-08-18 Created: 2010-08-13 Last updated: 2011-01-25 Bibliographically approved
5. KLT Tracking Implementation on the GPU
2007 (English) In: Proceedings SSBA 2007 / [ed] Magnus Borga, Anders Brun and Michael Felsberg, 2007. Conference paper, Oral presentation only (Other academic)
Abstract [en]

The GPU is the main processing unit on a graphics card. A modern GPU typically provides more than ten times the computational power of an ordinary PC processor. This is a result of the high demands for speed and image quality in computer games. This paper investigates the possibility of exploiting this computational power for tracking points in image sequences. Tracking points is used in many computer vision tasks, such as tracking moving objects, structure from motion, face tracking etc. The algorithm was successfully implemented on the GPU and a large speed up was achieved.
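For reference, the per-point computation a KLT tracker performs — and which a GPU implementation runs in parallel for thousands of points — looks roughly like the pure-translation Lucas-Kanade step below. This is a CPU sketch with nearest-neighbour sampling for brevity (real trackers use bilinear interpolation and coarse-to-fine pyramids); it is not the paper's implementation.

```python
import numpy as np

def klt_translation(prev, curr, x, y, win=7, iters=10):
    """One pure-translation Lucas-Kanade estimate for the point (x, y)."""
    h = win // 2
    ys, xs = np.mgrid[y - h:y + h + 1, x - h:x + h + 1]
    T = prev[ys, xs].astype(float)          # template patch
    gy, gx = np.gradient(T)                 # template gradients (y first)
    A = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    dx = dy = 0.0
    for _ in range(iters):
        xi = np.clip(np.round(xs + dx).astype(int), 0, curr.shape[1] - 1)
        yi = np.clip(np.round(ys + dy).astype(int), 0, curr.shape[0] - 1)
        r = T - curr[yi, xi].astype(float)  # residual at current estimate
        step = np.linalg.solve(A, [np.sum(gx * r), np.sum(gy * r)])
        dx, dy = dx + step[0], dy + step[1]
    return dx, dy

# Synthetic test: a Gaussian blob that moves 2 px right and 1 px down.
yy, xx = np.mgrid[0:40, 0:40]
prev = np.exp(-((xx - 20) ** 2 + (yy - 20) ** 2) / 20.0)
curr = np.exp(-((xx - 22) ** 2 + (yy - 21) ** 2) / 20.0)
dx, dy = klt_translation(prev, curr, 20, 20)   # roughly (2, 1)
```

Each tracked point solves its own small 2x2 system independently, which is exactly the kind of data-parallel workload that maps well onto graphics hardware.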

National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-21602 (URN)
Conference
SSBA, Swedish Symposium in Image Analysis 2007, 14-15 March, Linköping, Sweden
Available from: 2009-10-05 Created: 2009-10-05 Last updated: 2016-05-04
6. Synthetic Ground Truth for Feature Trackers
2008 (English) In: Swedish Symposium on Image Analysis 2008, 2008. Conference paper, Published paper (Other academic)
Abstract [en]

Good data sets for evaluation of computer vision algorithms are important for the continued progress of the field. There exist good evaluation sets for many applications, but there are others for which good evaluation sets are harder to come by. One such example is feature tracking, where there is an obvious difficulty in the collection of data. Good evaluation data is important both for comparisons of different algorithms, and to detect weaknesses in a specific method.

All image data is a result of light interacting with its environment. These interactions are so well modelled in rendering software that sometimes not even the sharpest human eye can tell the difference between reality and simulation. In this paper we thus propose to use a high quality rendering system to create evaluation data for sparse point correspondence trackers.

National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-58548 (URN)
Conference
Swedish Symposium on Image Analysis 2008, 13-14 March, Lund, Sweden
Available from: 2010-08-18 Created: 2010-08-13 Last updated: 2015-12-10 Bibliographically approved

Open Access in DiVA

Pose Estimation and Structure Analysis of Image Sequences (954 kB), 1108 downloads
File information
File name: FULLTEXT02.pdf, File size: 954 kB, Checksum: SHA-512
0b0b1b436361e5afbe9ac1944433e464c50abc438e97d2b54bac512586a988d61b4a821f5566ec72b926dbd03a0d88e966d474c878de27d7660570945dcad7e2
Type: fulltext, Mimetype: application/pdf
cover (202 kB), 33 downloads
File information
File name: COVER01.pdf, File size: 202 kB, Checksum: SHA-512
904c4a636210f2a0f90bcc5e2436f5e3b103c9b409c3bb90a4b2cb3d2820714c35520e54d85c74d00743ee419c81890ad13adebeb1fb8d2d1c6812a7435affdc
Type: cover, Mimetype: application/pdf

Authority records

Hedborg, Johan
