Persson, Mikael
Publications (8 of 8)
Persson, M. (2022). Visual Odometry in Principle and Practice. (Doctoral dissertation). Linköping: Linköping University Electronic Press
Visual Odometry in Principle and Practice
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Vision is the primary means by which we know where we are, what is nearby, and how we are moving. The corresponding computer-vision task is the simultaneous mapping of the surroundings and the localization of the camera. This task goes by many names; this thesis uses Visual Odometry, a name which implies that the images are sequential and which emphasizes pose accuracy and real-time requirements. The field has seen substantial improvements over the past decade, and visual odometry is used extensively in robotics for localization, navigation, and obstacle detection.

The main purpose of this thesis is the study and advancement of visual odometry systems, and it makes several contributions. The first is a high-performance stereo visual odometry system, which through geometrically supported tracking achieved the top rank on the KITTI odometry benchmark.

The second is a state-of-the-art perspective three point (P3P) solver. Such solvers find the pose of a camera given the projections of three known 3D points and are a core part of many visual odometry systems. By reformulating the underlying problem, we avoided a problematic quartic polynomial and, as a result, achieved substantially higher computational performance and numerical accuracy.

The third is a system which generalizes stereo visual odometry to the simultaneous estimation of multiple independently moving objects. The main contribution is a system which identifies generic moving rigid objects and predicts their trajectories in real time, with applications to robotic navigation in dynamic environments.

The fourth is an improved spline-type continuous pose trajectory estimation framework, which simplifies the integration of general dynamic models. The framework is used to show that visual odometry systems based on continuous pose trajectories are practical and can operate in real time.

The visual odometry pipeline is considered from both a theoretical and a practical perspective. The systems described have been tested both on benchmarks and real vehicles. This thesis places the published work into context, highlighting key insights and practical observations.  

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2022. p. 133
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2201
Keywords
Visual Odometry, Continuous Pose Trajectory, P3P, PNP, VO, Tracking, Calibration
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-182731 (URN); 10.3384/9789179291693 (DOI); 9789179291686 (ISBN); 9789179291693 (ISBN)
Public defence
2022-03-04, Ada Lovelace, B-building and Zoom: https://liu-se.zoom.us/j/66219624757, Campus Valla, Linköping, 09:00 (English)
Note

ISBN has been added for the PDF version.

URL has been corrected in the PDF version.

Available from: 2022-02-07. Created: 2022-02-07. Last updated: 2025-02-07. Bibliographically approved
Persson, M., Häger, G., Ovrén, H. & Forssén, P.-E. (2021). Practical Pose Trajectory Splines With Explicit Regularization. In: 2021 International Conference on 3D Vision (3DV 2021). Paper presented at the 9th International Conference on 3D Vision (3DV), online, December 1-3, 2021 (pp. 156-165). Institute of Electrical and Electronics Engineers (IEEE)
Practical Pose Trajectory Splines With Explicit Regularization
2021 (English). In: 2021 International Conference on 3D Vision (3DV 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 156-165. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate spline-based continuous-time pose trajectory estimation using non-linear explicit motion priors. Current regularization priors either linearize the orientation, rely on the implicit regularization obtained from the chosen spline basis functions, or use sampling-based regularization schemes. The latter is a special case of a Riemann sum approximation; we demonstrate when and why this can fail, and propose a way to avoid these issues. In addition, we provide a number of novel, practically useful theoretical contributions, including requirements on knot spacing for orientation splines, new basis functions for constant-velocity extrapolation, and a generalization of the popular P-spline penalty to orientation. We analyze the properties of the proposed approach using synthetic data. We validate our system on the standard task of visual-inertial calibration, and apply it to stereo visual odometry, where we demonstrate real-time performance on KITTI.
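As a toy illustration of the sampling-based regularization discussed above, here is a minimal scalar sketch: a uniform cubic B-spline segment with unit knot spacing, and a midpoint Riemann sum over squared acceleration as the motion-prior cost. This is a stand-in for exposition only; the paper's formulation operates on orientation splines, which this sketch does not attempt, and all names below are illustrative.

```python
# Standard blending matrix of the uniform cubic B-spline:
# s(u) = [1, u, u^2, u^3] @ M @ [c0, c1, c2, c3]^T, u in [0, 1).
M = [[1/6, 4/6, 1/6, 0.0],
     [-1/2, 0.0, 1/2, 0.0],
     [1/2, -1.0, 1/2, 0.0],
     [-1/6, 1/2, -1/2, 1/6]]

def spline_eval(c, u, order=0):
    """Value (order=0), velocity (1), or acceleration (2) of one spline
    segment with scalar control points c[0..3], at local parameter u."""
    if order == 0:
        pw = [1.0, u, u * u, u ** 3]
    elif order == 1:
        pw = [0.0, 1.0, 2 * u, 3 * u * u]
    else:
        pw = [0.0, 0.0, 2.0, 6 * u]
    return sum(pw[r] * sum(M[r][j] * c[j] for j in range(4)) for r in range(4))

def riemann_accel_cost(c, n=100):
    """Sampled (midpoint Riemann sum) motion-prior cost: the integral of the
    squared acceleration over the segment, approximated with n samples."""
    du = 1.0 / n
    return sum(spline_eval(c, (k + 0.5) * du, order=2) ** 2
               for k in range(n)) * du
```

With collinear, evenly spaced control points the segment is linear, so the acceleration cost vanishes; a bent control polygon yields a positive cost, which is exactly the quantity a sampled motion prior penalizes.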

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
International Conference on 3D Vision, ISSN 2378-3826, E-ISSN 2475-7888
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-182729 (URN); 10.1109/3DV53792.2021.00026 (DOI); 000786496000016 (ISI); 9781665426886 (ISBN); 9781665426893 (ISBN)
Conference
9th International Conference on 3D Vision (3DV), online, December 1-3, 2021
Funder
Vinnova
Note

Funding: Vinnova through the Visual Sweden network [Dnr 2019-02261]

Available from: 2022-02-07. Created: 2022-02-07. Last updated: 2025-02-01. Bibliographically approved
Häger, G., Persson, M. & Felsberg, M. (2021). Predicting Disparity Distributions. In: 2021 IEEE International Conference on Robotics and Automation (ICRA). Paper presented at the IEEE International Conference on Robotics and Automation (ICRA) 2021, Xi'an, China, May 30-June 5, 2021. IEEE
Predicting Disparity Distributions
2021 (English). In: 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate a novel deep-learning-based approach to estimating uncertainty in stereo disparity prediction networks. Current state-of-the-art methods often formulate disparity prediction as a regression problem with a single scalar output per pixel. This can be problematic in practical applications, as in many cases there may not exist a single well-defined disparity, for example at occlusions or depth boundaries. While current neural-network-based disparity estimation approaches obtain good performance on benchmarks, the disparity prediction is treated as a black box at inference time. In this paper we show that by formulating the learning problem as regression with a distribution target, we obtain a robust estimate of the uncertainty in each pixel, while maintaining the performance of the original method. The proposed method is evaluated both on a large-scale standard benchmark and on our own data. We also show that the uncertainty estimate improves significantly when, during learning, the uncertainty is maximized in those pixels that have no well-defined disparity.
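The distribution-target idea can be sketched in a few lines (a hypothetical bin layout, not the paper's network or training loss): predict logits over discrete disparity bins, take the expectation as the disparity estimate, and use the entropy of the distribution as the per-pixel uncertainty.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def disparity_and_uncertainty(logits):
    """Expected disparity (here: the bin index) and the entropy of the
    predicted distribution, used as a per-pixel uncertainty measure."""
    p = softmax(logits)
    disparity = sum(i * pi for i, pi in enumerate(p))
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return disparity, entropy
```

A sharply peaked distribution yields low entropy (a confident pixel); a flat one, e.g. at an occlusion, yields the maximal entropy log(number of bins).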

Place, publisher, year, edition, pages
IEEE, 2021
Series
IEEE International Conference on Robotics and Automation (ICRA), ISSN 1050-4729, E-ISSN 2577-087X
Keywords
Uncertainty, Automation, Conferences, Estimation, Benchmark testing, Standards
National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-179770 (URN); 10.1109/ICRA48506.2021.9561617 (DOI); 000765738803062 (ISI); 2-s2.0-85125504242 (Scopus ID); 978-1-7281-9077-8 (ISBN); 978-1-7281-9078-5 (ISBN)
Conference
IEEE International Conference on Robotics and Automation (ICRA) 2021, Xi'an, China, May 30-June 5, 2021
Available from: 2021-10-01. Created: 2021-10-01. Last updated: 2025-02-05
Eldesokey, A., Felsberg, M., Holmquist, K. & Persson, M. (2020). Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): . Paper presented at 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 12011-12020). IEEE
Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End
2020 (English). In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2020, p. 12011-12020. Conference paper, Published paper (Refereed)
Abstract [en]

The focus in deep learning research has mostly been on pushing the limits of prediction accuracy. However, this was often achieved at the cost of increased complexity, raising concerns about the interpretability and reliability of deep networks. Recently, increasing attention has been given to untangling the complexity of deep networks and quantifying their uncertainty for different computer vision tasks. In contrast, the task of depth completion has not received enough attention despite the inherently noisy nature of depth sensors. In this work, we therefore focus on modeling the uncertainty of depth data in depth completion, from the sparse noisy input all the way to the final prediction. We propose a novel approach to identify disturbed measurements in the input by learning an input confidence estimator in a self-supervised manner based on normalized convolutional neural networks (NCNNs). Further, we propose a probabilistic version of NCNNs that produces a statistically meaningful uncertainty measure for the final prediction. When we evaluate our approach on the KITTI dataset for depth completion, we outperform all existing Bayesian deep learning approaches in terms of prediction accuracy, quality of the uncertainty measure, and computational efficiency. Moreover, our small network with 670k parameters performs on par with conventional approaches with millions of parameters. These results give strong evidence that separating the network into parallel uncertainty and prediction streams leads to state-of-the-art performance with accurate uncertainty estimates.
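The normalized convolution that NCNNs build on can be illustrated in 1D (a plain, non-learned sketch; the kernel and data below are illustrative): each input sample carries a confidence, and the filter response is the confidence-weighted average under the kernel, so low-confidence (disturbed) samples are downweighted instead of being treated as valid zeros.

```python
def normalized_conv1d(data, conf, kernel):
    """Normalized convolution: conv(conf * data, k) / conv(conf, k).
    Samples with zero confidence are ignored rather than treated as zeros."""
    n, k = len(data), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(k):
            idx = i + j - half
            if 0 <= idx < n:  # clip at the borders
                num += kernel[j] * conf[idx] * data[idx]
                den += kernel[j] * conf[idx]
        out.append(num / den if den > 0 else 0.0)
    return out
```

With a zero-confidence outlier in the center of a patch, the filtered value at that position comes entirely from its trusted neighbours.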

Place, publisher, year, edition, pages
IEEE, 2020
Series
Conference on Computer Vision and Pattern Recognition (CVPR), ISSN 1063-6919, E-ISSN 2575-7075
Keywords
Uncertainty, Task analysis, Probabilistic logic, Measurement uncertainty, Noise measurement, Convolution, Computer vision
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-169106 (URN); 10.1109/CVPR42600.2020.01203 (DOI); 001309199904086 (ISI); 978-1-7281-7168-5 (ISBN); 978-1-7281-7169-2 (ISBN)
Conference
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Available from: 2020-09-09. Created: 2020-09-09. Last updated: 2025-02-07
Persson, M. & Nordberg, K. (2018). Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver. In: Ferrari V., Hebert M., Sminchisescu C., Weiss Y. (Ed.), European Conference on Computer Vision ECCV 2018: Computer Vision – ECCV 2018. Paper presented at the European Conference on Computer Vision ECCV 2018 (pp. 334-349). Cham: Springer
Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver
2018 (English). In: European Conference on Computer Vision ECCV 2018: Computer Vision – ECCV 2018 / [ed] Ferrari V., Hebert M., Sminchisescu C., Weiss Y., Cham: Springer, 2018, p. 334-349. Conference paper, Published paper (Refereed)
Abstract [en]

We present Lambda Twist, a novel P3P solver which is accurate, fast, and robust. Current state-of-the-art P3P solvers find all roots of a quartic and discard geometrically invalid and duplicate solutions in a post-processing step. Instead of solving a quartic, the proposed P3P solver exploits the underlying elliptic equations, which can be solved by a fast and numerically accurate diagonalization. This diagonalization requires a single real root of a cubic, which is then used to find the up to four P3P solutions. Unlike the direct quartic solvers, our method never computes geometrically invalid or duplicate solutions.

Extensive evaluation on synthetic data shows that the new solver has better numerical accuracy and is faster than state-of-the-art P3P implementations. The implementation and benchmark are available on GitHub.
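For context, the constraint system that every P3P solver starts from can be written down directly (a sketch of the classical law-of-cosines formulation, not of the Lambda Twist diagonalization itself): with unit bearing vectors y_i, known pairwise 3D point distances a_ij, and unknown depths d_i along each bearing, every pair must satisfy d_i^2 + d_j^2 - 2 d_i d_j cos(theta_ij) = a_ij^2.

```python
import math

def sub(a, b): return [x - y for x, y in zip(a, b)]
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def scale(a, s): return [x * s for x in a]

def p3p_residuals(bearings, dists, depths):
    """Residuals of the three P3P law-of-cosines constraints
    d_i^2 + d_j^2 - 2 d_i d_j cos(theta_ij) - a_ij^2 for each pair (i, j).
    A correct set of depths drives all three residuals to zero."""
    res = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        c = dot(bearings[i], bearings[j])  # unit bearings -> cos(theta_ij)
        res.append(depths[i] ** 2 + depths[j] ** 2
                   - 2 * depths[i] * depths[j] * c - dists[(i, j)] ** 2)
    return res
```

Eliminating the depths from these three equations is what produces the quartic in direct solvers; Lambda Twist instead works with the underlying elliptic equations.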

Place, publisher, year, edition, pages
Cham: Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11208
National Category
Signal Processing
Identifiers
urn:nbn:se:liu:diva-161264 (URN); 10.1007/978-3-030-01225-0_20 (DOI); 000594212900020 (ISI); 2-s2.0-85055452588 (Scopus ID); 978-3-030-01225-0 (ISBN); 978-3-030-01224-3 (ISBN)
Conference
European Conference on Computer Vision ECCV 2018
Available from: 2019-10-25. Created: 2019-10-25. Last updated: 2026-02-12
Robinson, A., Persson, M. & Felsberg, M. (2017). Robust Accurate Extrinsic Calibration of Static Non-overlapping Cameras. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Ed.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II (pp. 342-353). Springer, 10425
Robust Accurate Extrinsic Calibration of Static Non-overlapping Cameras
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 342-353. Conference paper, Published paper (Refereed)
Abstract [en]

An increasing number of robots and autonomous vehicles are equipped with multiple cameras to achieve surround-view sensing. The estimation of their relative poses, also known as extrinsic parameter calibration, is a challenging problem, particularly in the non-overlapping case. We present a simple and novel extrinsic calibration method based on standard components that compares favorably to existing approaches. We further propose a framework for predicting the performance of different calibration configurations, together with intuitive error metrics. This makes selecting a good camera configuration straightforward. We evaluate on rendered synthetic images and show good results as measured by angular and absolute pose differences, as well as by the reprojection error distributions.
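The angular pose difference used in the evaluation is a standard metric and can be sketched as follows (illustrative code, not from the paper): take the relative rotation between the estimated and ground-truth orientation, and recover its angle from the trace.

```python
import math

def rotation_angle_deg(R):
    """Rotation angle of a 3x3 rotation matrix, via trace(R) = 1 + 2 cos(angle)."""
    t = R[0][0] + R[1][1] + R[2][2]
    c = max(-1.0, min(1.0, (t - 1.0) / 2.0))  # clamp against rounding error
    return math.degrees(math.acos(c))

def matmul_t(A, B):
    """A @ B^T for 3x3 matrices: the relative rotation between estimates A, B."""
    return [[sum(A[i][k] * B[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

rotation_angle_deg(matmul_t(R_est, R_gt)) then gives the angular error in degrees; the absolute pose difference is simply the norm of the difference between the estimated and ground-truth translations.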

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10425
National Category
Computer graphics and computer vision; Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145371 (URN); 10.1007/978-3-319-64698-5_29 (DOI); 000432084600029 (ISI); 9783319646978 (ISBN); 9783319646985 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II
Note

Funding agencies: Vinnova, Sweden's innovation agency; Daimler AG; EC; Swedish Research Council [2014-6227]

Available from: 2018-02-26. Created: 2018-02-26. Last updated: 2025-02-01. Bibliographically approved
Piccini, T., Persson, M., Nordberg, K., Felsberg, M. & Mester, R. (2015). Good Edgels to Track: Beating the Aperture Problem with Epipolar Geometry. In: Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten (Ed.), Computer Vision - ECCV 2014 Workshops, Part II. Paper presented at the 13th European Conference on Computer Vision (ECCV) (pp. 652-664). Springer
Good Edgels to Track: Beating the Aperture Problem with Epipolar Geometry
2015 (English). In: Computer Vision - ECCV 2014 Workshops, Part II / [ed] Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten, Springer, 2015, p. 652-664. Conference paper, Published paper (Refereed)
Abstract [en]

An open issue in multiple-view geometry and structure from motion, applied to real-life scenarios, is the sparsity of the matched key-points and of the reconstructed point cloud. We present an approach that can significantly improve the density of measured displacement vectors in a sparse matching or tracking setting, exploiting the partial information about the motion field provided by linear oriented image patches (edgels). Our approach assumes that the epipolar geometry of an image pair has already been computed, either in an earlier feature-based matching step or by a robustified differential tracker. We exploit key-points of a lower order, edgels, which cannot provide a unique 2D match, but which can be employed if a constraint on the motion is already given. We present a method to extract edgels which can be effectively tracked given a known camera motion scenario, and show how a constrained version of the Lucas-Kanade tracking procedure can efficiently exploit epipolar geometry to reduce the classical KLT optimization to a 1D search problem. The potential of the proposed methods is shown by experiments performed on real driving sequences.
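The reduction of tracking to a 1D problem can be illustrated with a brute-force SSD stand-in (the paper uses a constrained Lucas-Kanade iteration instead; the rectified setting, where the epipolar line coincides with an image row, is an assumption made here purely for simplicity):

```python
def epipolar_1d_match(patch, row, search_range):
    """Brute-force 1D SSD search of a template patch along one (rectified)
    epipolar row: a simple stand-in for the 1D-constrained KLT refinement."""
    best_x, best_cost = None, float("inf")
    w = len(patch)
    for x in search_range:
        if x < 0 or x + w > len(row):
            continue  # patch would fall outside the row
        cost = sum((row[x + k] - patch[k]) ** 2 for k in range(w))
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost
```

Because the search is restricted to one dimension, an edgel whose gradient is not parallel to the epipolar line becomes trackable even though its 2D match is ambiguous.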

Place, publisher, year, edition, pages
Springer, 2015
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 8926
Keywords
Densification; Tracking; Epipolar geometry; Lucas-Kanade; Feature extraction; Edgels; Edges
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-121565 (URN); 10.1007/978-3-319-16181-5_50 (DOI); 000362495500050 (ISI); 978-3-319-16180-8 (ISBN)
Conference
13th European Conference on Computer Vision (ECCV)
Available from: 2015-09-25. Created: 2015-09-25. Last updated: 2022-02-07. Bibliographically approved
Persson, M., Piccini, T., Felsberg, M. & Mester, R. (2015). Robust Stereo Visual Odometry from Monocular Techniques. In: 2015 IEEE Intelligent Vehicles Symposium (IV). Paper presented at the 2015 IEEE Intelligent Vehicles Symposium (IV), Seoul, South Korea, June 28-July 1, 2015 (pp. 686-691). Institute of Electrical and Electronics Engineers (IEEE)
Robust Stereo Visual Odometry from Monocular Techniques
2015 (English). In: 2015 IEEE Intelligent Vehicles Symposium (IV), Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 686-691. Conference paper, Published paper (Refereed)
Abstract [en]

Visual odometry is one of the most active topics in computer vision. The automotive industry is particularly interested in this field due to the appeal of achieving a high degree of accuracy with inexpensive sensors such as cameras. The best results on this task are currently achieved by systems based on a calibrated stereo camera rig, whereas monocular systems are generally lagging behind in terms of performance. We hypothesise that this is due to stereo visual odometry being an inherently easier problem, rather than due to a higher quality of the state-of-the-art stereo-based algorithms. Under this hypothesis, techniques developed for monocular visual odometry systems would be, in general, more refined and robust, since they have to deal with an intrinsically more difficult problem. In this work we present a novel stereo visual odometry system for automotive applications based on advanced monocular techniques. We show that the generalization of these techniques to the stereo case results in a significant improvement of the robustness and accuracy of stereo-based visual odometry. We support our claims with the system's results on the well-known KITTI benchmark, achieving the top rank among vision-only systems.
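One concrete reason stereo is the easier problem: the known baseline fixes the metric scale directly, via the rectified-stereo relation Z = f * B / d, whereas a monocular system must infer scale indirectly. A minimal sketch (the focal length and baseline below are illustrative, roughly KITTI-like numbers, not values from the paper):

```python
def stereo_depth(f_px, baseline_m, disparity_px):
    """Depth from rectified stereo: Z = f * B / d. The known baseline B is
    what gives a stereo rig metric scale for free."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f_px * baseline_m / disparity_px
```

For example, with a focal length of 700 px and a 0.54 m baseline, a disparity of 37.8 px corresponds to a point 10 m away; halving the disparity doubles the depth.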

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2015
Series
Intelligent Vehicle, IEEE Symposium, ISSN 1931-0587
Keywords
Visual odometry, VSLAM, structure from motion
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-121829 (URN); 10.1109/IVS.2015.7225764 (DOI); 000380565800112 (ISI); 978-1-4673-7266-4 (ISBN)
Conference
2015 IEEE Intelligent Vehicles Symposium (IV), Seoul, South Korea, June 28-July 1, 2015
Available from: 2015-10-08. Created: 2015-10-08. Last updated: 2025-02-07. Bibliographically approved