Visual Odometry in Principle and Practice
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Vision is the primary means by which we know where we are, what is nearby, and how we are moving. The corresponding computer-vision task is the simultaneous mapping of the surroundings and the localization of the camera. This task goes by many names, of which this thesis uses Visual Odometry. This name implies that the images are sequential and emphasizes the accuracy of the pose and the real-time requirements. The field has seen substantial improvements over the past decade, and visual odometry is used extensively in robotics for localization, navigation and obstacle detection.

The main purpose of this thesis is the study and advancement of visual odometry systems, and it makes several contributions. The first is a high-performance stereo visual odometry system, which through geometrically supported tracking achieved top rank on the KITTI odometry benchmark.

The second is a state-of-the-art perspective three point (P3P) solver. Such solvers find the pose of a camera given the projections of three known 3D points and are a core part of many visual odometry systems. By reformulating the underlying problem we avoided a problematic quartic polynomial. As a result we achieved substantially higher computational performance and numerical accuracy.

The third is a system which generalizes stereo visual odometry to the simultaneous estimation of multiple independently moving objects. The main contribution is a real-time system which allows the identification of generic moving rigid objects and the prediction of their trajectories, with applications to robotic navigation in dynamic environments.

The fourth is an improved spline-type continuous pose trajectory estimation framework, which simplifies the integration of general dynamic models. The framework is used to show that visual odometry systems based on continuous pose trajectories are practical and can operate in real time.

The visual odometry pipeline is considered from both a theoretical and a practical perspective. The systems described have been tested both on benchmarks and real vehicles. This thesis places the published work into context, highlighting key insights and practical observations.  

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2022, p. 133
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2201
Keywords [en]
Visual Odometry, Continuous Pose Trajectory, P3P, PNP, VO, Tracking, Calibration
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-182731
DOI: 10.3384/9789179291693
ISBN: 9789179291686 (print)
ISBN: 9789179291693 (electronic)
OAI: oai:DiVA.org:liu-182731
DiVA, id: diva2:1635583
Public defence
2022-03-04, Ada Lovelace, B-building and Zoom: https://liuse. zoom.us/j/66219624757, Campus Valla, Linköping, 09:00 (English)
Opponent
Supervisors
Note

ISBN has been added for the PDF-version.

URL has been corrected in the PDF-version.

Available from: 2022-02-07 Created: 2022-02-07 Last updated: 2025-02-07 Bibliographically approved
List of papers
1. Robust Stereo Visual Odometry from Monocular Techniques
2015 (English). In: 2015 IEEE Intelligent Vehicles Symposium (IV), Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 686-691. Conference paper, Published paper (Refereed)
Abstract [en]

Visual odometry is one of the most active topics in computer vision. The automotive industry is particularly interested in this field due to the appeal of achieving a high degree of accuracy with inexpensive sensors such as cameras. The best results on this task are currently achieved by systems based on a calibrated stereo camera rig, whereas monocular systems generally lag behind in terms of performance. We hypothesise that this is due to stereo visual odometry being an inherently easier problem, rather than due to a higher quality of the state-of-the-art stereo-based algorithms. Under this hypothesis, techniques developed for monocular visual odometry systems would be, in general, more refined and robust since they have to deal with an intrinsically more difficult problem. In this work we present a novel stereo visual odometry system for automotive applications based on advanced monocular techniques. We show that the generalization of these techniques to the stereo case results in a significant improvement of the robustness and accuracy of stereo-based visual odometry. We support our claims with the system's results on the well-known KITTI benchmark, achieving the top rank for visual-only systems.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2015
Series
Intelligent Vehicle, IEEE Symposium, ISSN 1931-0587
Keywords
Visual odometry, VSLAM, structure from motion
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-121829 (URN), 10.1109/IVS.2015.7225764 (DOI), 000380565800112 (), 978-1-4673-7266-4 (ISBN)
Conference
2015 IEEE Intelligent Vehicles Symposium (IV), Seoul, South Korea, June 28 2015-July 1 2015
Available from: 2015-10-08 Created: 2015-10-08 Last updated: 2025-02-07 Bibliographically approved
2. Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver
2018 (English). In: European Conference on Computer Vision ECCV 2018: Computer Vision – ECCV 2018 / [ed] Ferrari V., Hebert M., Sminchisescu C., Weiss Y., Cham: Springer, 2018, p. 334-349. Conference paper, Published paper (Refereed)
Abstract [en]

We present Lambda Twist, a novel P3P solver which is accurate, fast and robust. Current state-of-the-art P3P solvers find all roots of a quartic and discard geometrically invalid and duplicate solutions in a post-processing step. Instead of solving a quartic, the proposed P3P solver exploits the underlying elliptic equations, which can be solved by a fast and numerically accurate diagonalization. This diagonalization requires a single real root of a cubic, which is then used to find the up to four P3P solutions. Unlike the direct quartic solvers, our method never computes geometrically invalid or duplicate solutions.

Extensive evaluation on synthetic data shows that the new solver has better numerical accuracy and is faster than state-of-the-art P3P implementations. The implementation and benchmark are available on GitHub.
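
As a minimal sketch of the geometry that any P3P solver works with (this is not the Lambda Twist algorithm itself), the Python snippet below checks the three distance constraints that tie the unknown point depths to the known 3D points: the scaled bearing vectors must be as far apart as the corresponding world points. The helper names (`rot_z`, `transform`, `p3p_residuals`) and the synthetic pose are illustrative assumptions, not taken from the paper.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def sub(a, b):
    return [p - q for p, q in zip(a, b)]

def rot_z(angle):
    """Rotation matrix about the z-axis (hypothetical test pose)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, x):
    """Map a world point x into the camera frame: R x + t."""
    return [dot(row, x) + ti for row, ti in zip(R, t)]

def p3p_residuals(world_points, bearings, depths):
    """Residuals |d_i y_i - d_j y_j|^2 - |x_i - x_j|^2 for the three point pairs."""
    cam = [[d * c for c in y] for y, d in zip(bearings, depths)]
    residuals = []
    for i, j in ((0, 1), (0, 2), (1, 2)):
        d_cam = sub(cam[i], cam[j])
        d_world = sub(world_points[i], world_points[j])
        residuals.append(dot(d_cam, d_cam) - dot(d_world, d_world))
    return residuals

# Synthetic setup: a known pose, three world points and their unit bearings.
R, t = rot_z(0.3), [0.1, -0.2, 4.0]
world_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, -1.0, 0.5]]
camera_points = [transform(R, t, x) for x in world_points]
true_depths = [math.sqrt(dot(p, p)) for p in camera_points]
bearings = [[c / d for c in p] for p, d in zip(camera_points, true_depths)]

# The true depths satisfy all three constraints; perturbed depths do not.
good = p3p_residuals(world_points, bearings, true_depths)
bad = p3p_residuals(world_points, bearings, [1.1 * d for d in true_depths])
```

A solver has to find the depths that drive these three residuals to zero; the abstract's point is that this can be done without extracting all roots of a quartic.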

Place, publisher, year, edition, pages
Cham: Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11208
National Category
Signal Processing
Identifiers
urn:nbn:se:liu:diva-161264 (URN), 10.1007/978-3-030-01225-0_20 (DOI), 000594212900020 (), 2-s2.0-85055452588 (Scopus ID), 978-3-030-01225-0 (ISBN), 978-3-030-01224-3 (ISBN)
Conference
European Conference on Computer Vision ECCV 2018
Available from: 2019-10-25 Created: 2019-10-25 Last updated: 2026-02-12
3. Independently Moving Object Trajectories from Sequential Hierarchical Ransac
2021 (English). In: VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, SCITEPRESS, 2021, p. 722-731. Conference paper, Published paper (Refereed)
Abstract [en]

Safe robot navigation in a dynamic environment requires the trajectories of each independently moving object (IMO). We present the novel and effective system Sequential Hierarchical Ransac Estimation (Shire), designed for this purpose. The system uses a stereo camera stream to find the objects and trajectories in real time. Shire detects moving objects using geometric consistency and finds their trajectories using bundle adjustment. Relying on geometric consistency allows the system to handle objects regardless of semantic class, unlike approaches based on semantic segmentation. Most Visual Odometry (VO) systems are inherently limited to a single motion by the choice of tracker. This limitation allows for efficient and robust ego-motion estimation in real time, but precludes tracking the multiple motions sought. Shire instead uses a generic tracker and achieves accurate VO and IMO estimates using track analysis. This removes the restriction to a single motion while retaining the real-time performance required for live navigation. We evaluate the system by bounding box intersection over union and ID persistence on a public dataset, collected from an autonomous test vehicle driving in real traffic. We also show the velocities of estimated IMOs. We investigate variations of the system that provide trade-offs between accuracy, performance and limitations.
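
The idea of separating several motions by geometric consistency can be illustrated with a generic sequential RANSAC loop: fit the dominant motion, remove its inliers, and repeat on what remains. The sketch below is a deliberately simplified stand-in, not the Shire system: it groups 2D flow vectors by a shared translation, whereas the paper works with stereo tracks, rigid motions and bundle adjustment. The function name, thresholds and synthetic data are all assumptions made for illustration.

```python
import random

def sequential_ransac(flows, tol=0.1, min_inliers=3, iterations=50, seed=0):
    """Greedily extract groups of 2D flow vectors that share one translation."""
    rng = random.Random(seed)
    remaining = list(range(len(flows)))
    groups = []
    while len(remaining) >= min_inliers:
        best = []
        for _ in range(iterations):
            # Minimal sample: one flow vector defines a translation hypothesis.
            hx, hy = flows[rng.choice(remaining)]
            inliers = [i for i in remaining
                       if (flows[i][0] - hx) ** 2 + (flows[i][1] - hy) ** 2 < tol ** 2]
            if len(inliers) > len(best):
                best = inliers
        if len(best) < min_inliers:
            break
        groups.append(best)
        # Remove the explained tracks and look for the next motion.
        remaining = [i for i in remaining if i not in best]
    return groups

flows = ([(1.0 + 0.001 * k, 0.0) for k in range(12)]      # dominant motion
         + [(-2.0, 3.0 + 0.001 * k) for k in range(6)]    # second motion
         + [(5.0, 5.0), (-4.0, -4.0)])                    # outliers
groups = sequential_ransac(flows)
group_sizes = sorted(len(g) for g in groups)
```

Because the grouping depends only on motion consistency, nothing in the loop needs to know what kind of object produced a track, which mirrors the class-agnostic property described in the abstract.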

Place, publisher, year, edition, pages
SCITEPRESS, 2021
Keywords
Robot Navigation; Moving Object Trajectory Estimation; Visual Odometry; SLAM
National Category
Robotics and automation
Identifiers
urn:nbn:se:liu:diva-180066 (URN), 10.5220/0010253407220731 (DOI), 000661288200077 (), 9789897584886 (ISBN)
Conference
16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP) / 16th International Conference on Computer Vision Theory and Applications (VISAPP), ELECTR NETWORK, feb 08-10, 2021
Available from: 2021-10-12 Created: 2021-10-12 Last updated: 2025-02-09
4. Practical Pose Trajectory Splines With Explicit Regularization
2021 (English). In: 2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 156-165. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate spline-based continuous-time pose trajectory estimation using non-linear explicit motion priors. Current regularization priors either linearize the orientation, rely on the implicit regularization obtained from the used spline basis function, or use sampling based regularization schemes. The latter is a special case of a Riemann sum approximation, and we demonstrate when and why this can fail, and propose a way to avoid these issues. In addition we provide a number of novel practically useful theoretical contributions, including requirements on knot spacing for orientation splines, new basis functions for constant velocity extrapolation, and a generalization of the popular P-Spline penalty to orientation. We analyze the properties of the proposed approach using synthetic data. We validate our system using the standard task of visual-inertial calibration, and apply it to stereo visual odometry where we demonstrate real-time performance on KITTI.
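
One way in which a sampling-based regularizer can fail is easy to show numerically. The sketch below approximates the integral of a squared acceleration over [0, 1] by a left Riemann sum; when the samples happen to fall on the zeros of an oscillating acceleration, the penalty vanishes although the true integral is 0.5. The acceleration profile and the function names are hypothetical, and this toy example is not the analysis carried out in the paper.

```python
import math

def riemann_penalty(accel, num_samples):
    """Left Riemann sum of accel(t)^2 over [0, 1] with uniform samples."""
    dt = 1.0 / num_samples
    return sum(accel(i * dt) ** 2 for i in range(num_samples)) * dt

def accel(t):
    # Hypothetical oscillating acceleration; the exact integral of its square is 0.5.
    return math.sin(2.0 * math.pi * 4.0 * t)

coarse = riemann_penalty(accel, 8)     # samples land on the zeros of accel
fine = riemann_penalty(accel, 1000)    # dense sampling recovers the integral
```

With eight samples the optimizer would see almost no penalty for this motion, so the regularizer would not suppress it; a denser or analytically integrated penalty does not have this blind spot.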

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
International Conference on 3D Vision, ISSN 2378-3826, E-ISSN 2475-7888
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-182729 (URN), 10.1109/3DV53792.2021.00026 (DOI), 000786496000016 (), 2-s2.0-85125011120 (Scopus ID), 9781665426886 (ISBN), 9781665426893 (ISBN)
Conference
9th International Conference on 3D Vision (3DV), ELECTR NETWORK, dec 01-03, 2021
Funder
Vinnova
Note

Funding: Vinnova through the Visual Sweden network; Vinnova [Dnr 2019-02261]

Available from: 2022-02-07 Created: 2022-02-07 Last updated: 2026-03-16 Bibliographically approved
5. Good Edgels to Track: Beating the Aperture Problem with Epipolar Geometry
2015 (English). In: COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II / [ed] Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten, Elsevier, 2015, p. 652-664. Conference paper, Published paper (Refereed)
Abstract [en]

An open issue in multiple view geometry and structure from motion, applied to real-life scenarios, is the sparsity of the matched key-points and of the reconstructed point cloud. We present an approach that can significantly improve the density of measured displacement vectors in a sparse matching or tracking setting, exploiting the partial information of the motion field provided by linear oriented image patches (edgels). Our approach assumes that the epipolar geometry of an image pair has already been computed, either in an earlier feature-based matching step or by a robustified differential tracker. We exploit key-points of a lower order, edgels, which cannot provide a unique 2D matching, but can be employed if a constraint on the motion is already given. We present a method to extract edgels which can be effectively tracked given a known camera motion scenario, and show how a constrained version of the Lucas-Kanade tracking procedure can efficiently exploit epipolar geometry to reduce the classical KLT optimization to a 1D search problem. The potential of the proposed methods is shown by experiments performed on real driving sequences.
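
The reduction from a 2D to a 1D search can be sketched as follows: once a fundamental matrix is known, the match of a point must lie on its epipolar line, so candidate positions are described by a single scalar along that line. The snippet is a hedged illustration only. The fundamental matrix is the textbook one for a pure sideways camera translation, the cost function is a stand-in for a photometric cost, and none of the function names come from the paper, which uses a constrained Lucas-Kanade update rather than the grid search shown here.

```python
import math

def epipolar_line(F, x):
    """Line l = F x in the second image for a homogeneous point x in the first."""
    return [sum(F[r][c] * x[c] for c in range(3)) for r in range(3)]

def point_on_line(line, s):
    """Point at signed distance s along the line a*x + b*y + c = 0."""
    a, b, c = line
    n2 = a * a + b * b
    n = math.sqrt(n2)
    x0, y0 = -a * c / n2, -b * c / n2   # point on the line closest to the origin
    return (x0 - b / n * s, y0 + a / n * s)

def search_along_line(line, cost, s_values):
    """1D search: pick the line parameter with the lowest matching cost."""
    return min(s_values, key=lambda s: cost(point_on_line(line, s)))

# Pure sideways translation: epipolar lines are image rows.
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
line = epipolar_line(F, [120.0, 80.0, 1.0])

true_match = (95.0, 80.0)  # hypothetical ground-truth position in image two

def cost(p):
    # Stand-in for a photometric cost: squared distance to the true match.
    return (p[0] - true_match[0]) ** 2 + (p[1] - true_match[1]) ** 2

best_s = search_along_line(line, cost, range(200))
best_point = point_on_line(line, best_s)
```

An edgel only constrains motion across its orientation, so on its own it cannot be matched in 2D; intersecting that constraint with the epipolar line is what makes the single remaining parameter well determined.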

Place, publisher, year, edition, pages
Elsevier, 2015
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 8926
Keywords
Densification; Tracking; Epipolar geometry; Lucas-Kanade; Feature extraction; Edgels; Edges
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-121565 (URN), 10.1007/978-3-319-16181-5_50 (DOI), 000362495500050 (), 978-3-319-16180-8 (ISBN)
Conference
13th European Conference on Computer Vision (ECCV)
Available from: 2015-09-25 Created: 2015-09-25 Last updated: 2022-02-07 Bibliographically approved
6. Robust Accurate Extrinsic Calibration of Static Non-overlapping Cameras
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 342-353. Conference paper, Published paper (Refereed)
Abstract [en]

An increasing number of robots and autonomous vehicles are equipped with multiple cameras to achieve surround-view sensing. The estimation of their relative poses, also known as extrinsic parameter calibration, is a challenging problem, particularly in the non-overlapping case. We present a simple and novel extrinsic calibration method based on standard components that compares favorably with existing approaches. We further propose a framework for predicting the performance of different calibration configurations, together with intuitive error metrics. This makes selecting a good camera configuration straightforward. We evaluate on rendered synthetic images and show good results as measured by angular and absolute pose differences, as well as the reprojection error distributions.
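
For readers unfamiliar with pose error metrics, the sketch below shows one common way to measure the difference between an estimated and a reference extrinsic pose: the geodesic angle of the relative rotation and the Euclidean distance between the translations. The abstract does not spell out the exact definitions used in the paper, so these formulas, the function names and the example poses are assumptions.

```python
import math

def rot_z(angle):
    """Rotation matrix about the z-axis (hypothetical example pose)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_angle_between(Ra, Rb):
    """Geodesic angle in radians of the relative rotation Ra^T Rb."""
    # trace(Ra^T Rb) equals the sum of elementwise products of Ra and Rb.
    trace = sum(Ra[k][i] * Rb[k][i] for i in range(3) for k in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    return math.acos(cos_angle)

def translation_difference(ta, tb):
    """Euclidean distance between two translation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ta, tb)))

# Reference pose versus a slightly perturbed estimate.
angle_error = rotation_angle_between(rot_z(0.3), rot_z(0.5))
position_error = translation_difference([1.0, 0.0, 0.0], [1.0, 0.03, 0.04])
```

Here `angle_error` comes out at about 0.2 rad and `position_error` at about 0.05 in the units of the translation, which is the kind of pair a calibration evaluation would typically report alongside reprojection errors.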

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10425
National Category
Computer graphics and computer vision; Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145371 (URN), 10.1007/978-3-319-64698-5_29 (DOI), 000432084600029 (), 9783319646978 (ISBN), 9783319646985 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II
Note

Funding agencies: Vinnova, Sweden's innovation agency; Daimler AG; EC; Swedish Research Council [2014-6227]

Available from: 2018-02-26 Created: 2018-02-26 Last updated: 2025-02-01 Bibliographically approved

Open Access in DiVA

fulltext (14397 kB), 4781 downloads
File information
File name: FULLTEXT02.pdf
File size: 14397 kB
Checksum SHA-512: 89b1a119cf0f7e99f8f3f7383a67abb110d7e59cdbc9718f1ed16e4ba9dab1399ec9afe75eb464f1e44fda4b6af6097efad48f41a1162710cb3807df7fbff820
Type: fulltext
Mimetype: application/pdf
Authority records

Persson, Mikael
