liu.se
Search for publications in DiVA
251 - 300 of 367
  • 251. Nyberg, Adam
    et al.
    Eldesokey, Abdelrahman
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bergström, David
    Gustafsson, David
    Unpaired Thermal to Visible Spectrum Transfer using Adversarial Training, 2018. Conference paper (Refereed)
  • 252.
    Nyström, Axel
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Evaluation of Multiple Object Tracking in Surveillance Video, 2019. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Multiple object tracking is the process of assigning unique and consistent identities to objects throughout a video sequence. A popular approach to multiple object tracking, and object tracking in general, is to use a method called tracking-by-detection. Tracking-by-detection is a two-stage procedure: an object detection algorithm first detects objects in a frame, and these objects are then associated with already tracked objects by a tracking algorithm. One of the main concerns of this thesis is to investigate how different object detection algorithms perform on surveillance video supplied by the National Forensic Centre. The thesis then goes on to explore how the stand-alone performance of the object detection algorithm correlates with the overall performance of a tracking-by-detection system. Finally, the thesis investigates how the use of visual descriptors in the tracking stage of a tracking-by-detection system affects performance.

    Results presented in this thesis suggest that the capacity of the object detection algorithm is highly indicative of the overall performance of the tracking-by-detection system. Further, this thesis also shows how the use of visual descriptors in the tracking stage can reduce the number of identity switches and thereby increase performance of the whole system.

  • 253.
    Ochs, Matthias
    et al.
    Goethe University of Frankfurt, Germany.
    Bradler, Henry
    Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Enhanced Phase Correlation for Reliable and Robust Estimation of Multiple Motion Distributions, 2016. In: Image and Video Technology, PSIVT 2015, Springer Publishing Company, 2016, Vol. 9431, p. 368-379. Conference paper (Refereed)
    Abstract [en]

    Phase correlation is one of the classic methods for sparse motion or displacement estimation. It is renowned in the literature for high precision and insensitivity against illumination variations. We propose several important enhancements to the phase correlation (PhC) method which render it more robust against those situations where a motion measurement is not possible (low structure, too much noise, too different image content in the corresponding measurement windows). This allows the method to perform self-diagnosis in adverse situations. Furthermore, we extend the PhC method by a robust scheme for detecting and classifying the presence of multiple motions and estimating their uncertainties. Experimental results on the Middlebury Stereo Dataset and on the KITTI Optical Flow Dataset show the potential offered by the enhanced method in contrast to the PhC implementation of OpenCV.
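    The record includes no code; as a rough point of reference for what the paper enhances, here is a minimal NumPy sketch of the classic phase-correlation baseline (all names are mine, and the peak height is only a crude stand-in for the paper's self-diagnosis idea):

        import numpy as np

        def phase_correlation(a, b, eps=1e-9):
            # Normalized cross-power spectrum of two equally sized patches.
            cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
            r = np.fft.ifft2(cross / (np.abs(cross) + eps)).real
            peak = np.unravel_index(np.argmax(r), r.shape)
            # Wrap peak coordinates to signed shifts.
            dy = peak[0] - a.shape[0] if peak[0] > a.shape[0] // 2 else peak[0]
            dx = peak[1] - a.shape[1] if peak[1] > a.shape[1] // 2 else peak[1]
            return (dy, dx), r[peak]  # shift estimate and peak strength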

  • 254.
    Ochs, Matthias
    et al.
    Goethe Univ, Germany.
    Bradler, Henry
    Goethe Univ, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe Univ, Germany.
    Learning Rank Reduced Interpolation with Principal Component Analysis, 2017. In: 2017 28th IEEE Intelligent Vehicles Symposium (IV 2017), IEEE, 2017, p. 1126-1133. Conference paper (Refereed)
    Abstract [en]

    Most iterative optimization algorithms for motion, depth estimation or scene reconstruction, both sparse and dense, rely on a coarse and reliable dense initialization to bootstrap their optimization procedure. Techniques that can obtain a dense, but still approximate, representation of a desired 2D structure (e.g., depth maps, optical flow, disparity maps) from a very sparse measurement of this structure are therefore important. The method presented here exploits the complete information given by the principal component analysis (PCA): the principal basis and its prior distribution. The method is able to determine a dense reconstruction even if only a very sparse measurement is available. When facing such situations, typically the number of principal components is reduced further, which results in a loss of expressiveness of the basis. We overcome this problem and inject prior knowledge in a maximum a posteriori (MAP) approach. We test our approach on the KITTI and the Virtual KITTI datasets and focus on the interpolation of depth maps for driving scenes. The evaluation of the results shows good agreement with the ground truth and is clearly superior to the results of an interpolation by the nearest neighbor method, which disregards statistical information.
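    A minimal sketch of the MAP idea described above, under my own assumptions about shapes and names (Gaussian measurement noise, PCA eigenvalues as the coefficient prior); an illustration, not the authors' implementation:

        import numpy as np

        def map_reconstruct(U, mu, prior_var, idx, y, noise_var=1.0):
            # U: (D, K) principal basis, mu: (D,) mean, prior_var: (K,)
            # idx/y: indices and values of the sparse measurement.
            Us = U[idx]
            A = Us.T @ Us / noise_var + np.diag(1.0 / prior_var)
            b = Us.T @ (y - mu[idx]) / noise_var
            c = np.linalg.solve(A, b)  # MAP coefficients; the prior keeps A well-posed
            return mu + U @ c          # dense reconstruction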

  • 255.
    Ogniewski, Jens
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Pushing the Limits for View Prediction in Video Coding, 2017. In: Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), Vol. 4, SCITEPRESS, 2017, p. 68-76. Conference paper (Refereed)
    Abstract [en]

    More and more devices have depth sensors, making RGB+D video (colour+depth video) increasingly common. RGB+D video allows the use of depth image based rendering (DIBR) to render a given scene from different viewpoints, thus making it a useful asset in view prediction for 3D and free-viewpoint video coding. In this paper we evaluate a multitude of algorithms for scattered data interpolation, in order to optimize the performance of DIBR for video coding. This also includes novel contributions like a Kriging refinement step, an edge suppression step to suppress artifacts, and a scale-adaptive kernel. Our evaluation uses the depth extension of the Sintel datasets. Using ground-truth sequences is crucial for such an optimization, as it ensures that all errors and artifacts are caused by the prediction itself rather than noisy or erroneous data. We also present a comparison with the commonly used mesh-based projection.
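    For readers unfamiliar with scattered data interpolation, a toy normalized-splatting baseline with a fixed Gaussian kernel (my own simplification; the paper's contributions, such as the Kriging refinement and the scale-adaptive kernel, go well beyond this):

        import numpy as np

        def splat_interpolate(points, values, shape, sigma=1.5):
            # Dense map from scattered (row, col) samples by normalized
            # Gaussian splatting; O(N*H*W), fine for small toy inputs.
            acc, wsum = np.zeros(shape), np.zeros(shape)
            ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
            for (py, px), v in zip(points, values):
                w = np.exp(-((ys - py) ** 2 + (xs - px) ** 2) / (2 * sigma ** 2))
                acc += w * v
                wsum += w
            return acc / np.maximum(wsum, 1e-12)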

  • 256.
    Ogniewski, Jens
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    What is the best depth-map compression for Depth Image Based Rendering?, 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 403-415. Conference paper (Refereed)
    Abstract [en]

    Many of the latest smart phones and tablets come with integrated depth sensors that make depth-maps freely available, thus enabling new forms of applications like rendering from different viewpoints. However, efficient compression exploiting the characteristics of depth-maps as well as the requirements of these new applications is still an open issue. In this paper, we evaluate different depth-map compression algorithms, with a focus on tree-based methods and view projection as application.

    The contributions of this paper are the following: 1. extensions of existing geometric compression trees, 2. a comparison of a number of different trees, 3. a comparison of them to a state-of-the-art video coder, 4. an evaluation using ground-truth data that considers both depth-maps and predicted frames with arbitrary camera translation and rotation.

    Despite our best efforts, and contrary to earlier results, current video depth-map compression outperforms tree-based methods in most cases. The reason for this is likely that previous evaluations focused on low-quality, low-resolution depth maps, while high-resolution depth (as needed in the DIBR setting) has been ignored up until now. We also demonstrate that PSNR on depth-maps is not always a good measure of their utility.
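    The PSNR measure questioned in the last sentence is, for reference, computed as below (a standard definition, not code from the paper):

        import numpy as np

        def psnr(reference, test, peak=None):
            # Peak signal-to-noise ratio in dB; for depth maps the choice
            # of `peak` is itself a modelling decision.
            peak = reference.max() if peak is None else peak
            mse = np.mean((np.asarray(reference, float) - test) ** 2)
            return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)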

  • 257.
    Ollesson, Niklas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Automatic Configuration of Vision Sensor, 2013. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In factory automation, cameras and image processing algorithms can be used to inspect objects. This can decrease the number of faulty objects that leave the factory and reduce the manual labour needed. A vision sensor is a system where camera and image processing are delivered together and that only needs to be configured for the application it is to be used for; thus no programming knowledge is required of the customer. In this Master's thesis, a way to make the configuration of a vision sensor even easier is developed and evaluated.

    The idea is that the customer knows his or her product much better than he or she knows image processing. The customer could take images of positive and negative samples of the object that is to be inspected. The algorithm should then, given these images, configure the vision sensor automatically.

    The algorithm that is developed to solve this problem is described step by step with examples to illustrate the problems that needed to be solved. Much of the focus is on how to compare two configurations to each other, in order to find the best one. The resulting configuration from the algorithm is then evaluated with respect to types of applications, computation time and representativeness of the input images.

  • 258.
    Olsson, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Feature Based Learning for Point Cloud Labeling and Grasp Point Detection, 2018. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Robotic bin picking is the problem of emptying a bin of randomly distributed objects through a robotic interface. This thesis examines an SVM approach to extract grasping points for a vacuum-type gripper. The SVM is trained on synthetic data and used to classify the points of a non-synthetic 3D-scanned point cloud as either graspable or non-graspable. The classified points are then clustered into graspable regions from which the grasping points are extracted.

    The SVM models and the algorithm as a whole are trained and evaluated against cubic and cylindrical objects. Separate SVM models are trained for each type of object, in addition to one model trained on a dataset containing both types of objects. It is shown that the performance of the SVM in terms of accuracy depends on the objects and their geometrical properties. Further, it is shown that the algorithm is reasonably robust in terms of successfully picking objects, regardless of the scale of the objects.
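    A compact sketch of the described pipeline using scikit-learn (the classifier kernel and the clustering step are my own choices; the thesis does not specify its clustering method here):

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.cluster import DBSCAN

        def grasp_points(train_feats, train_labels, cloud_feats, cloud_xyz):
            # 1) train on synthetic data, 2) label the scanned cloud,
            # 3) cluster graspable points, 4) one candidate per region.
            clf = SVC(kernel="rbf").fit(train_feats, train_labels)
            pts = cloud_xyz[clf.predict(cloud_feats) == 1]
            labels = DBSCAN(eps=0.01, min_samples=10).fit_predict(pts)
            return [pts[labels == k].mean(axis=0) for k in set(labels) if k != -1]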

  • 259.
    Ovrén, Hannes
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Continuous Models for Cameras and Inertial Sensors, 2018. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Using images to reconstruct the world in three dimensions is a classical computer vision task. Some examples of applications where this is useful are autonomous mapping and navigation, urban planning, and special effects in movies. One common approach to 3D reconstruction is ”structure from motion” where a scene is imaged multiple times from different positions, e.g. by moving the camera. However, in a twist of irony, many structure from motion methods work best when the camera is stationary while the image is captured. This is because the motion of the camera can cause distortions in the image that lead to worse image measurements, and thus a worse reconstruction. One such distortion common to all cameras is motion blur, while another is connected to the use of an electronic rolling shutter. Instead of capturing all pixels of the image at once, a camera with a rolling shutter captures the image row by row. If the camera is moving while the image is captured the rolling shutter causes non-rigid distortions in the image that, unless handled, can severely impact the reconstruction quality.

    This thesis studies methods to robustly perform 3D reconstruction in the case of a moving camera. To do so, the proposed methods make use of an inertial measurement unit (IMU). The IMU measures the angular velocities and linear accelerations of the camera, and these can be used to estimate the trajectory of the camera over time. Knowledge of the camera motion can then be used to correct for the distortions caused by the rolling shutter. Another benefit of an IMU is that it can provide measurements also in situations when a camera can not, e.g. because of excessive motion blur, or absence of scene structure.

    To use a camera together with an IMU, the camera-IMU system must be jointly calibrated. The relationship between their respective coordinate frames needs to be established, and their timings need to be synchronized. This thesis shows how to perform this calibration and synchronization automatically, without requiring e.g. calibration objects or special motion patterns.

    In standard structure from motion, the camera trajectory is modeled as discrete poses, with one pose per image. Switching instead to a formulation with a continuous-time camera trajectory provides a natural way to handle rolling shutter distortions, and also to incorporate inertial measurements. To model the continuous-time trajectory, many authors have used splines. The ability of a spline-based trajectory to model the real motion depends on the density of its spline knots. Choosing too smooth a spline results in approximation errors. This thesis proposes a method to estimate the spline approximation error, and to use it to better balance camera and IMU measurements in a sensor fusion framework. Also proposed is a way to automatically decide how dense the spline needs to be to achieve a good reconstruction.

    Another approach to reconstruct a 3D scene is to use a camera that directly measures depth. Some depth cameras, like the well-known Microsoft Kinect, are susceptible to the same rolling shutter effects as normal cameras. This thesis quantifies the effect of the rolling shutter distortion on 3D reconstruction, depending on the amount of motion. It is also shown that a better 3D model is obtained if the depth images are corrected using inertial measurements.
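    The rolling-shutter timing model that underlies much of the thesis can be stated in a single line; a sketch with my own parameter names:

        def row_timestamp(t_frame, row, readout_time, image_rows):
            # Rows are exposed sequentially over the readout interval, so a
            # moving camera gives every row its own pose along the trajectory.
            return t_frame + readout_time * row / image_rows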

    List of papers
    1. Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification
    2015 (English). In: New Development in Robot Vision / [ed] Yu Sun, Aman Behal & Chi-Kit Ronald Chung, Springer Berlin/Heidelberg, 2015, p. 55-71. Chapter in book (Refereed)
    Abstract [en]

    Scene reconstruction, i.e. the process of creating a 3D representation (mesh) of some real world scene, has recently become easier with the advent of cheap RGB-D sensors (e.g. the Microsoft Kinect).

    Many such sensors use rolling shutter cameras, which produce geometrically distorted images when they are moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor.

    For scene reconstruction we use the Kinect Fusion algorithm to produce meshes. We create meshes from both raw and rectified depth scans, and these are then compared to a ground truth mesh. The types of motion we investigate are: pan, tilt and wobble (shaking) motions.

    As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.

    This chapter is an extension of a paper at the IEEE Workshop on Robot Vision [10]. Compared to that paper, we have improved the rectification to also correct for lens distortion, and use a coarse-to-fine search to find the time shift more quickly. We have extended our experiments to also investigate the effects of lens distortion, and to use more accurate ground truth. The experiments demonstrate that correction of rolling shutter effects yields a larger improvement of the 3D model than correction for lens distortion.

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2015
    Series
    Cognitive Systems Monographs, ISSN 1867-4925 ; 23
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-114344 (URN); 10.1007/978-3-662-43859-6_4 (DOI); 978-3-662-43858-9 (ISBN); 978-3-662-43859-6 (ISBN)
    Projects
    Learnable Camera Motion Models
    Available from: 2015-02-19 Created: 2015-02-19 Last updated: 2018-06-19. Bibliographically approved
    2. Gyroscope-based video stabilisation with auto-calibration
    2015 (English). In: 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, p. 2090-2097. Conference paper, Published paper (Refereed)
    Abstract [en]

    We propose a technique for joint calibration of a wide-angle rolling shutter camera (e.g. a GoPro) and an externally mounted gyroscope. The calibrated parameters are time scaling and offset, relative pose between gyroscope and camera, and gyroscope bias. The parameters are found by non-linear least squares minimisation with the symmetric transfer error as cost function. The primary contribution is methods for robust initialisation of the relative pose and time offset, which are essential for convergence. We also introduce a robust error norm to handle outliers. This results in a technique that works with general video content and does not require any specific setup or calibration patterns. We apply our method to stabilisation of videos recorded by a rolling shutter camera with a rigidly attached gyroscope. After recording, the gyroscope and camera are jointly calibrated using the recorded video itself. The recorded video can then be stabilised using the calibrated parameters. We evaluate the technique on video sequences with varying difficulty and motion frequency content. The experiments demonstrate that our method can be used to produce high quality stabilised videos even under difficult conditions, and that the proposed initialisation ends up within the basin of attraction. We also show that a residual based on the symmetric transfer error is more accurate than residuals based on the recently proposed epipolar plane normal coplanarity constraint.

    Series
    IEEE International Conference on Robotics and Automation ICRA, ISSN 1050-4729
    Keywords
    Calibration, Cameras, Cost function, Gyroscopes, Robustness, Synchronization
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering; Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-120182 (URN); 10.1109/ICRA.2015.7139474 (DOI); 000370974902014 (); 978-1-4799-6922-7, 978-1-4799-6923-4 (ISBN)
    Conference
    2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26-30 May, 2015
    Projects
    LCMM; VPS
    Funder
    Swedish Research Council, 2014-5928; Swedish Foundation for Strategic Research, IIS11-0081
    Available from: 2015-07-13 Created: 2015-07-13 Last updated: 2018-06-19. Bibliographically approved
    3. Spline Error Weighting for Robust Visual-Inertial Fusion
    2018 (English). In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, p. 321-329. Conference paper, Published paper (Refereed)
    Abstract [en]

    In this paper we derive and test a probability-based weighting that can balance residuals of different types in spline fitting. In contrast to previous formulations, the proposed spline error weighting scheme also incorporates a prediction of the approximation error of the spline fit. We demonstrate the effectiveness of the prediction in a synthetic experiment, and apply it to visual-inertial fusion on rolling shutter cameras. This results in a method that can estimate 3D structure with metric scale on generic first-person videos. We also propose a quality measure for spline fitting, that can be used to automatically select the knot spacing. Experiments verify that the obtained trajectory quality corresponds well with the requested quality. Finally, by linearly scaling the weights, we show that the proposed spline error weighting minimizes the estimation errors on real sequences, in terms of scale and end-point errors.

    Series
    Computer Vision and Pattern Recognition, ISSN 1063-6919
    National Category
    Electrical Engineering, Electronic Engineering, Information Engineering
    Identifiers
    urn:nbn:se:liu:diva-149495 (URN); 10.1109/CVPR.2018.00041 (DOI); 000457843600034 (); 978-1-5386-6420-9 (ISBN); 978-1-5386-6421-6 (ISBN)
    Conference
    The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2018, Salt Lake City, USA
    Funder
    Swedish Research Council, 2014-5928; Swedish Research Council, 2014-6227
    Available from: 2018-07-03 Created: 2018-07-03 Last updated: 2019-02-26. Bibliographically approved
  • 260.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Gyroscope-based video stabilisation with auto-calibration, 2015. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, p. 2090-2097. Conference paper (Refereed)
    Abstract [en]

    We propose a technique for joint calibration of a wide-angle rolling shutter camera (e.g. a GoPro) and an externally mounted gyroscope. The calibrated parameters are time scaling and offset, relative pose between gyroscope and camera, and gyroscope bias. The parameters are found by non-linear least squares minimisation with the symmetric transfer error as cost function. The primary contribution is methods for robust initialisation of the relative pose and time offset, which are essential for convergence. We also introduce a robust error norm to handle outliers. This results in a technique that works with general video content and does not require any specific setup or calibration patterns. We apply our method to stabilisation of videos recorded by a rolling shutter camera with a rigidly attached gyroscope. After recording, the gyroscope and camera are jointly calibrated using the recorded video itself. The recorded video can then be stabilised using the calibrated parameters. We evaluate the technique on video sequences with varying difficulty and motion frequency content. The experiments demonstrate that our method can be used to produce high quality stabilised videos even under difficult conditions, and that the proposed initialisation ends up within the basin of attraction. We also show that a residual based on the symmetric transfer error is more accurate than residuals based on the recently proposed epipolar plane normal coplanarity constraint.
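    A minimal sketch of the gyroscope integration such a pipeline rests on, stripped of the paper's actual contributions (calibration of time offset, relative pose and bias); names and the use of scipy's matrix exponential are my own choices:

        import numpy as np
        from scipy.linalg import expm

        def integrate_gyro(omega, dt):
            # Integrate body-frame angular rates (rad/s) into one rotation
            # per sample; bias and time offset are assumed already calibrated.
            R, rotations = np.eye(3), []
            for wx, wy, wz in omega:
                skew = np.array([[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]])
                R = R @ expm(skew * dt)
                rotations.append(R.copy())
            return rotations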

  • 261.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Spline Error Weighting for Robust Visual-Inertial Fusion, 2018. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, p. 321-329. Conference paper (Refereed)
    Abstract [en]

    In this paper we derive and test a probability-based weighting that can balance residuals of different types in spline fitting. In contrast to previous formulations, the proposed spline error weighting scheme also incorporates a prediction of the approximation error of the spline fit. We demonstrate the effectiveness of the prediction in a synthetic experiment, and apply it to visual-inertial fusion on rolling shutter cameras. This results in a method that can estimate 3D structure with metric scale on generic first-person videos. We also propose a quality measure for spline fitting, that can be used to automatically select the knot spacing. Experiments verify that the obtained trajectory quality corresponds well with the requested quality. Finally, by linearly scaling the weights, we show that the proposed spline error weighting minimizes the estimation errors on real sequences, in terms of scale and end-point errors.
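    To make the knot-spacing trade-off concrete, a sketch that fits a least-squares B-spline at a given spacing and reports the RMS approximation error (which the paper turns into a residual weight); the knot construction and names are mine:

        import numpy as np
        from scipy.interpolate import make_lsq_spline

        def spline_fit_error(t, x, knot_spacing, degree=3):
            # t: sorted sample times, x: signal values. Boundary knots are
            # repeated degree+1 times to obtain a valid clamped spline.
            interior = np.arange(t[0], t[-1], knot_spacing)[1:]
            knots = np.r_[[t[0]] * (degree + 1), interior, [t[-1]] * (degree + 1)]
            spl = make_lsq_spline(t, x, knots, k=degree)
            return np.sqrt(np.mean((spl(t) - x) ** 2))  # coarser knots -> larger error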

  • 262.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Swedish Defence Research Agency, Sweden.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Trajectory representation and landmark projection for continuous-time structure from motion, 2019. In: The International Journal of Robotics Research, ISSN 0278-3649, E-ISSN 1741-3176, Vol. 38, no 6, p. 686-701. Article in journal (Refereed)
    Abstract [en]

    This paper revisits the problem of continuous-time structure from motion, and introduces a number of extensions that improve convergence and efficiency. The formulation with a C²-continuous spline for the trajectory naturally incorporates inertial measurements, as derivatives of the sought trajectory. We analyze the behavior of split spline interpolation on SO(3) and on R^3, and a joint spline on SE(3), and show that the latter implicitly couples the direction of translation and rotation. Such an assumption can make good sense for a camera mounted on a robot arm, but not for hand-held or body-mounted cameras. Our experiments in the Spline Fusion framework show that a split spline on R^3 and SO(3) is preferable over an SE(3) spline in all tested cases. Finally, we investigate the problem of landmark reprojection on rolling shutter cameras, and show that the tested reprojection methods give similar quality, whereas their computational load varies by a factor of two.

  • 263.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Törnqvist, David
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification, 2015. In: New Development in Robot Vision / [ed] Yu Sun, Aman Behal & Chi-Kit Ronald Chung, Springer Berlin/Heidelberg, 2015, p. 55-71. Chapter in book (Refereed)
    Abstract [en]

    Scene reconstruction, i.e. the process of creating a 3D representation (mesh) of some real world scene, has recently become easier with the advent of cheap RGB-D sensors (e.g. the Microsoft Kinect).

    Many such sensors use rolling shutter cameras, which produce geometrically distorted images when they are moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor.

    For scene reconstruction we use the Kinect Fusion algorithm to produce meshes. We create meshes from both raw and rectified depth scans, and these are then compared to a ground truth mesh. The types of motion we investigate are: pan, tilt and wobble (shaking) motions.

    As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.

    This chapter is an extension of a paper at the IEEE Workshop on Robot Vision [10]. Compared to that paper, we have improved the rectification to also correct for lens distortion, and use a coarse-to-fine search to find the time shift more quickly. We have extended our experiments to also investigate the effects of lens distortion, and to use more accurate ground truth. The experiments demonstrate that correction of rolling shutter effects yields a larger improvement of the 3D model than correction for lens distortion.

  • 264.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Törnqvist, David
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Why Would I Want a Gyroscope on my RGB-D Sensor?, 2013. Conference paper (Refereed)
    Abstract [en]

    Many RGB-D sensors, e.g. the Microsoft Kinect, use rolling shutter cameras. Such cameras produce geometrically distorted images when the sensor is moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor. We examine the effectiveness of our rectification scheme by coupling it with the Kinect Fusion algorithm. By comparing Kinect Fusion models obtained from raw sensor scans and from rectified scans, we demonstrate improvement for three classes of sensor motion: panning motions cause slant distortions, tilt motions cause vertically elongated or compressed objects, and for wobble we additionally observe a loss of detail compared to the reconstruction using rectified depth scans. As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.

  • 265.
    Persson, Mikael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Online Monocular SLAM: Rittums, 2014. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    A classic computer vision task is the estimation of a 3D map from a collection of images. This thesis explores the online simultaneous estimation of camera poses and map points, often called visual simultaneous localisation and mapping (VSLAM). In the near future the use of visual information by autonomous cars is likely, since driving is a vision-dominated process. For example, VSLAM could be used to estimate the position of the car in relation to objects of interest, such as the road, other cars and pedestrians. Aimed at the creation of a real-time, robust, loop-closing, single-camera SLAM system, the properties of several state-of-the-art VSLAM systems and related techniques are studied. The system goals cover several important, if difficult, problems, which makes a solution widely applicable. This thesis makes two contributions: a rigorous qualitative analysis of VSLAM methods, and a system designed accordingly. A novel tracking-by-matching scheme is proposed, which, unlike the trackers used by many similar systems, deals better with forward camera motion. The system estimates general motion with loop closure in real time. The system is compared to a state-of-the-art monocular VSLAM algorithm and found to be similar in speed and performance.

  • 266.
    Persson, Mikael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Nordberg, Klas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver, 2018. In: Computer Vision -- ECCV 2018 (Lecture Notes in Computer Science, Vol. 11208), Cham, 2018, p. 334-349. Conference paper (Refereed)
    Abstract [en]

    We present Lambda Twist; a novel P3P solver which is accurate, fast and robust. Current state-of-the-art P3P solvers find all roots to a quartic and discard geometrically invalid and duplicate solutions in a post-processing step. Instead of solving a quartic, the proposed P3P solver exploits the underlying elliptic equations which can be solved by a fast and numerically accurate diagonalization. This diagonalization requires a single real root of a cubic which is then used to find the, up to four, P3P solutions. Unlike the direct quartic solvers our method never computes geometrically invalid or duplicate solutions.

    Extensive evaluation on synthetic data shows that the new solver has better numerical accuracy and is faster compared to the state-of-the-art P3P implementations. Implementation and benchmark are available on github.
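    The cubic-root step that replaces the quartic can be illustrated in a few lines (numpy's companion-matrix root finder here; the paper's solver uses its own, faster root polishing):

        import numpy as np

        def one_real_root(c3, c2, c1, c0):
            # A cubic with real coefficients always has a real root; picking
            # the largest in magnitude is one heuristic for a stable
            # subsequent diagonalization.
            roots = np.roots([c3, c2, c1, c0])
            real = roots[np.abs(roots.imag) < 1e-9].real
            return real[np.argmax(np.abs(real))]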

  • 267.
    Persson, Mikael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Piccini, Tommaso
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Frankfurt University, Germany.
    Robust Stereo Visual Odometry from Monocular Techniques, 2015. In: 2015 IEEE Intelligent Vehicles Symposium (IV), Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 686-691. Conference paper (Refereed)
    Abstract [en]

    Visual odometry is one of the most active topics in computer vision. The automotive industry is particularly interested in this field due to the appeal of achieving a high degree of accuracy with inexpensive sensors such as cameras. The best results on this task are currently achieved by systems based on a calibrated stereo camera rig, whereas monocular systems are generally lagging behind in terms of performance. We hypothesise that this is due to stereo visual odometry being an inherently easier problem, rather than due to higher quality of the state-of-the-art stereo-based algorithms. Under this hypothesis, techniques developed for monocular visual odometry systems would be, in general, more refined and robust since they have to deal with an intrinsically more difficult problem. In this work we present a novel stereo visual odometry system for automotive applications based on advanced monocular techniques. We show that the generalization of these techniques to the stereo case results in a significant improvement of the robustness and accuracy of stereo-based visual odometry. We support our claims by the system's results on the well-known KITTI benchmark, achieving the top rank for vision-only systems.

  • 268.
    Pettersson, Niklas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    GPU-Accelerated Real-Time Surveillance De-Weathering, 2013. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    A fully automatic de-weathering system to increase the visibility/stability in surveillance applications during bad weather has been developed. Rain, snow and haze during daylight are handled in real time, with acceleration from CUDA-implemented algorithms. Video from fixed cameras is processed on a PC with no need for special hardware except an NVidia GPU. The system does not use any background model and does not require any precalibration. An increase in contrast is obtained in all haze/rain/snow cases, while the system lags by at most one frame during rain or snow removal. De-hazing can be obtained for any distance to simplify tracking or other algorithms operating on a surveillance system.
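    The abstract does not spell out the de-hazing algorithm; as a generic illustration of single-image de-hazing, here is a crude dark-channel-prior sketch (He et al.), which is not necessarily the method used in the thesis:

        import numpy as np
        from scipy.ndimage import minimum_filter

        def dehaze(img, patch=15, omega=0.95, t0=0.1):
            # img: float RGB in [0, 1]. Dark channel = local minimum over
            # channels and a patch; invert the haze model I = J*t + A*(1-t).
            dark = minimum_filter(img.min(axis=2), size=patch)
            bright = dark.ravel().argsort()[-max(1, dark.size // 1000):]
            A = img.reshape(-1, 3)[bright].max(axis=0)  # atmospheric light
            t = 1 - omega * minimum_filter((img / A).min(axis=2), size=patch)
            t = np.clip(t, t0, 1.0)
            return np.clip((img - A) / t[..., None] + A, 0, 1)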

  • 269.
    Piccini, Tommaso
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Persson, Mikael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Nordberg, Klas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. VSI, Frankfurt University.
    Good Edgels to Track: Beating the Aperture Problem with Epipolar Geometry, 2015. In: Computer Vision - ECCV 2014 Workshops, Part II / [ed] Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten, Elsevier, 2015, p. 652-664. Conference paper (Refereed)
    Abstract [en]

    An open issue in multiple view geometry and structure from motion, applied to real life scenarios, is the sparsity of the matched key-points and of the reconstructed point cloud. We present an approach that can significantly improve the density of measured displacement vectors in a sparse matching or tracking setting, exploiting the partial information of the motion field provided by linear oriented image patches (edgels). Our approach assumes that the epipolar geometry of an image pair already has been computed, either in an earlier feature-based matching step, or by a robustified differential tracker. We exploit key-points of a lower order, edgels, which cannot provide a unique 2D matching, but can be employed if a constraint on the motion is already given. We present a method to extract edgels, which can be effectively tracked given a known camera motion scenario, and show how a constrained version of the Lucas-Kanade tracking procedure can efficiently exploit epipolar geometry to reduce the classical KLT optimization to a 1D search problem. The potential of the proposed methods is shown by experiments performed on real driving sequences.
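    The constraint that turns 2D tracking into a 1D search is the epipolar line; a short sketch given a fundamental matrix F (the normalization convention is my own):

        import numpy as np

        def epipolar_line(F, x):
            # Line l' = F @ x in the second image for pixel x in the first;
            # after normalization, l' @ (u, v, 1) is the point-line distance,
            # so a constrained tracker only searches along this 1D locus.
            l = F @ np.array([x[0], x[1], 1.0])
            return l / np.linalg.norm(l[:2])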

  • 270.
    Pinggera, Peter
    et al.
    Daimler R&D, Germany; Goethe University of Frankfurt, Germany.
    Franke, Uwe
    Daimler R&D, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    High-Performance Long Range Obstacle Detection Using Stereo Vision, 2015. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2015, p. 1308-1313. Conference paper (Refereed)
    Abstract [en]

    Reliable detection of obstacles at long range is crucial for the timely response to hazards by fast-moving safety-critical platforms like autonomous cars. We present a novel method for the joint detection and localization of distant obstacles using a stereo vision system on a moving platform. The approach is applicable to both static and moving obstacles and pushes the limits of detection performance as well as localization accuracy. The proposed detection algorithm is based on sound statistical tests using local geometric criteria which implicitly consider non-flat ground surfaces. To achieve maximum performance, it operates directly on image data instead of precomputed stereo disparity maps. A careful experimental evaluation on several datasets shows excellent detection performance and localization accuracy up to very large distances, even for small obstacles. We demonstrate a parallel implementation of the proposed system on a GPU that executes at real-time speeds.

  • 271.
    Poole, Alexander
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Real-Time Image Segmentation for Augmented Reality by Combining Multi-Channel Thresholds, 2017. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Extracting foreground objects from an image is a hot research topic. Doing this for high-quality real-world images in real time on limited hardware, such as a smart phone, is a demanding task. This master thesis shows how the problem can be addressed using Otsu's method together with Gaussian probability distributions to create classifiers in different colour channels. We also show how classifiers can be combined, resulting in higher accuracy than using only the individual classifiers. We further propose using inter-class variance together with image variance to estimate classifier quality. A data set was produced to evaluate performance. The data set features real-world images captured by a smart phone, with objects of varying complexity against plain backgrounds that can be found in a typical office or urban space.
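    A minimal single-channel Otsu sketch that also returns the inter-class variance, which the thesis proposes as (part of) a classifier quality score; the binning and names are my own:

        import numpy as np

        def otsu_threshold(channel, bins=256):
            # Maximize inter-class variance over all candidate thresholds.
            hist, edges = np.histogram(channel, bins=bins)
            p = hist / hist.sum()
            w0 = np.cumsum(p)                    # class-0 probability
            mu = np.cumsum(p * np.arange(bins))  # cumulative (bin-index) mean
            with np.errstate(divide="ignore", invalid="ignore"):
                between = (mu[-1] * w0 - mu) ** 2 / (w0 * (1 - w0))
            k = np.nanargmax(between)
            return edges[k + 1], between[k]      # threshold and its quality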

  • 272.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Geometric Computer Vision for Rolling-shutter and Push-broom Sensors, 2012. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Almost all cell-phones and camcorders sold today are equipped with a CMOS (Complementary Metal Oxide Semiconductor) image sensor and there is also a general trend to incorporate CMOS sensors in other types of cameras. The sensor has many advantages over the more conventional CCD (Charge-Coupled Device) sensor such as lower power consumption, cheaper manufacturing and the potential for on-chip processing. Almost all CMOS sensors make use of what is called a rolling shutter. Compared to a global shutter, which images all the pixels at the same time, a rolling-shutter camera exposes the image row-by-row. This leads to geometric distortions in the image when either the camera or the objects in the scene are moving. The recorded videos and images will look wobbly (jello effect), skewed or otherwise strange and this is often not desirable. In addition, many computer vision algorithms assume that the camera used has a global shutter, and will break down if the distortions are too severe.

    In airborne remote sensing it is common to use push-broom sensors. These sensors exhibit a similar kind of distortion as a rolling-shutter camera, due to the motion of the aircraft. If the acquired images are to be matched with maps or other images, then the distortions need to be suppressed.

    The main contributions in this thesis are the development of the three-dimensional models for rolling-shutter distortion correction. Previous attempts modelled the distortions as taking place in the image plane, and we have shown that our techniques give better results for hand-held camera motions.

    The basic idea is to estimate the camera motion, not only between frames, but also the motion during frame capture. The motion can be estimated using inter-frame image correspondences and with these a non-linear optimisation problem can be formulated and solved. All rows in the rolling-shutter image are imaged at different times, and when the motion is known, each row can be transformed to the rectified position.

    In addition to rolling-shutter distortions, hand-held footage often has shaky camera motion. It has been shown how to do efficient video stabilisation, in combination with the rectification, using rotation smoothing.

    In the thesis it has been explored how to use similar techniques as for the rolling-shutter case in order to correct push-broom images, and also how to rectify 3D point clouds from e.g. the Kinect depth sensor.
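    Under the pure-rotation model used for hand-held video, the per-row rectification reduces to one homography per row; a sketch with my own names:

        import numpy as np

        def rectify_point(K, R_row, x):
            # Map pixel x, imaged while the camera had rotation R_row, back
            # to the reference orientation: x' ~ K @ R_row.T @ inv(K) @ x.
            xh = np.array([x[0], x[1], 1.0])
            xr = K @ R_row.T @ np.linalg.solve(K, xh)
            return xr[:2] / xr[2]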

    List of papers
    1. Rectifying rolling shutter video from hand-held devices
    2010 (English). In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, Los Alamitos, CA, USA: IEEE Computer Society, 2010, p. 507-514. Conference paper, Published paper (Other academic)
    Abstract [en]

    This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.

    Place, publisher, year, edition, pages
    Los Alamitos, CA, USA: IEEE Computer Society, 2010
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-70572 (URN); 10.1109/CVPR.2010.5540173 (DOI); 978-1-4244-6984-0 (ISBN)
    Conference
    CVPR10, San Francisco, USA, June 13-18, 2010
    Available from: 2011-09-13 Created: 2011-09-13 Last updated: 2015-12-10
    2. Efficient Video Rectification and Stabilisation for Cell-Phones
    2012 (English). In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 96, no 3, p. 335-352. Article in journal (Refereed), Published
    Abstract [en]

    This article presents a method for rectifying and stabilising video from cell-phones with rolling shutter (RS) cameras. Due to size constraints, cell-phone cameras have constant, or near constant focal length, making them an ideal application for calibrated projective geometry. In contrast to previous RS rectification attempts that model distortions in the image plane, we model the 3D rotation of the camera. We parameterise the camera rotation as a continuous curve, with knots distributed across a short frame interval. Curve parameters are found using non-linear least squares over inter-frame correspondences from a KLT tracker. By smoothing a sequence of reference rotations from the estimated curve, we can at a small extra cost, obtain a high-quality image stabilisation. Using synthetic RS sequences with associated ground-truth, we demonstrate that our rectification improves over two other methods. We also compare our video stabilisation with the methods in iMovie and Deshaker.

    Place, publisher, year, edition, pages
    Springer Verlag (Germany), 2012
    Keywords
    Cell-phone, Rolling shutter, CMOS, Video stabilisation
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-75277 (URN); 10.1007/s11263-011-0465-8 (DOI); 000299769400005 ()
    Note
    Funding agencies: CENIIT organisation at Linköping Institute of Technology; Swedish Research Council. Available from: 2012-02-27 Created: 2012-02-24 Last updated: 2017-12-07
    3. Scan Rectification for Structured Light Range Sensors with Rolling Shutters
    2011 (English). In: IEEE International Conference on Computer Vision, Barcelona, Spain, 2011, p. 1575-1582. Conference paper, Published paper (Other academic)
    Abstract [en]

    Structured light range sensors, such as the Microsoft Kinect, have recently become popular as perception devices for computer vision and robotic systems. These sensors use CMOS imaging chips with electronic rolling shutters (ERS). When using such a sensor on a moving platform, both the image, and the depth map, will exhibit geometric distortions. We introduce an algorithm that can suppress such distortions, by rectifying the 3D point clouds from the range sensor. This is done by first estimating the time continuous 3D camera trajectory, and then transforming the 3D points to where they would have been, if the camera had been stationary. To ensure that image and range data are synchronous, the camera trajectory is computed from KLT tracks on the structured-light frames, after suppressing the structured-light pattern. We evaluate our rectification, by measuring angles between the visible sides of a cube, before and after rectification. We also measure how much better the 3D point clouds can be aligned after rectification. The obtained improvement is also related to the actual rotational velocity, measured using a MEMS gyroscope.

    Place, publisher, year, edition, pages
    Barcelona, Spain, 2011
    Series
    International Conference on Computer Vision (ICCV), ISSN 1550-5499
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-77059 (URN); 10.1109/ICCV.2011.6126417 (DOI); 978-1-4577-1101-5 (ISBN)
    Conference
    IEEE International Conference on Computer Vision (ICCV11), 8-11 November 2011, Barcelona, Spain
    Available from: 2012-05-07 Created: 2012-05-03 Last updated: 2015-12-10. Bibliographically approved
    4. Co-alignment of Aerial Push-broom Strips using Trajectory Smoothness Constraints
    2010 (English). Conference paper, Published paper (Other academic)
    Abstract [en]

    We study the problem of registering a sequence of scan lines (a strip) from an airborne push-broom imager to another sequence partly covering the same area. Such a registration has to compensate for deformations caused by attitude and speed changes in the aircraft. The registration is challenging, as both strips contain such deformations. Our algorithm estimates the 3D rotation of the camera for each scan line, by parametrising it as a linear spline with a number of knots evenly distributed in one of the strips. The rotations are estimated from correspondences between strips of the same area. Once the rotations are known, they can be compensated for, and each line of pixels can be transformed such that the ground traces of the two strips are registered with respect to each other.

    Place, publisher, year, edition, pages
    Swedish Society for automated image analysis, 2010
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-70706 (URN)
    Conference
    SSBA10, Symposium on Image Analysis, 11-12 March, Uppsala
    Available from: 2011-09-15 Created: 2011-09-15 Last updated: 2018-01-12. Bibliographically approved
    5. Co-aligning Aerial Hyperspectral Push-broom Strips for Change Detection
    2010 (English). In: Proc. SPIE 7835, Electro-Optical Remote Sensing, Photonic Technologies, and Applications IV / [ed] Gary W. Kamerman; Ove Steinvall; Keith L. Lewis; Richard C. Hollins; Thomas J. Merlet; Gary J. Bishop; John D. Gonglewski, SPIE - International Society for Optical Engineering, 2010, Art. nr. 7835B-36. Conference paper, Published paper (Refereed)
    Abstract [en]

    We have performed a field trial with an airborne push-broom hyperspectral sensor, making several flights over the same area and with known changes (e.g., moved vehicles) between the flights. Each flight results in a sequence of scan lines forming an image strip, and in order to detect changes between two flights, the two resulting image strips must be geometrically aligned and radiometrically corrected. The focus of this paper is the geometrical alignment, and we propose an image- and gyro-based method for geometric co-alignment (registration) of two image strips. The method is particularly useful when the sensor is not stabilized, thus reducing the need for expensive mechanical stabilization. The method works in several steps, including gyro-based rectification, global alignment using SIFT matching, and a local alignment using KLT tracking. Experimental results are shown but not quantified, as ground truth is, by the nature of the trial, lacking.

    Place, publisher, year, edition, pages
    SPIE - International Society for Optical Engineering, 2010
    Series
    Proceedings SPIE, ISSN 0277-786X; 7835
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-70464 (URN); 10.1117/12.865034 (DOI); 978-0-8194-8353-9 (ISBN)
    Conference
    Electro-Optical Remote Sensing, Photonic Technologies, and Applications IV, 20-23 September, Toulouse, France
    Available from: 2011-09-13 Created: 2011-09-09 Last updated: 2018-01-12. Bibliographically approved
  • 273.
    Ringaby, Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Geometric Models for Rolling-shutter and Push-broom Sensors, 2014. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Almost all cell-phones and camcorders sold today are equipped with a CMOS (Complementary Metal Oxide Semiconductor) image sensor and there is also a general trend to incorporate CMOS sensors in other types of cameras. The CMOS sensor has many advantages over the more conventional CCD (Charge-Coupled Device) sensor such as lower power consumption, cheaper manufacturing and the potential for on-chip processing. Nearly all CMOS sensors make use of what is called a rolling shutter readout. Unlike a global shutter readout, which images all the pixels at the same time, a rolling shutter exposes the image row-by-row. If a mechanical shutter is not used this will lead to geometric distortions in the image when either the camera or the objects in the scene are moving. Smaller cameras, like those in cell-phones, do not have mechanical shutters, and systems that do have them will not use them when recording video. The result will look wobbly (jello effect), skewed or otherwise strange and this is often not desirable. In addition, many computer vision algorithms assume that the camera used has a global shutter and will break down if the distortions are too severe.

    In airborne remote sensing it is common to use push-broom sensors. These sensors exhibit a similar kind of distortion as that of a rolling-shutter camera, due to the motion of the aircraft. If the acquired images are to be registered to maps or other images, the distortions need to be suppressed.

    The main contributions in this thesis are the development of three-dimensional models for rolling-shutter distortion correction. Previous attempts modelled the distortions as taking place in the image plane, and we have shown that our techniques give better results for hand-held camera motions. The basic idea is to estimate the camera motion, not only between frames, but also the motion during frame capture. The motion is estimated using image correspondences, and with these a non-linear optimisation problem is formulated and solved. All rows in the rolling-shutter image are imaged at different times, and when the motion is known, each row can be transformed to its rectified position (a minimal sketch of this follows the abstract). The same is true when using depth sensors such as the Microsoft Kinect, and the thesis describes how to estimate its 3D motion and how to rectify 3D point clouds.

    The thesis also explores how techniques similar to those for the rolling-shutter case can be used to correct push-broom images. When a transformation has been found, the images need to be resampled to a regular grid in order to be visualised. This can be done in many ways and different methods have been tested and adapted to the push-broom setup.

    In addition to rolling-shutter distortions, hand-held footage often has shaky camera motion. It is possible to do efficient video stabilisation in combination with the rectification, using rotation smoothing. Apart from these distortions, motion blur is a big problem for hand-held photography. The images will be blurry due to the camera motion and also noisy if taken in low light conditions. One of the contributions in the thesis is a method which uses gyroscope measurements and feature tracking to combine several images, taken with a smartphone, into one resulting image with less blur and noise. This enables the user to take photos which would otherwise have required a tripod.
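    As a concrete illustration of the row-wise rectification idea, the following minimal sketch (an assumption-laden illustration, not the thesis code) applies a per-row homography K R_row^T K^-1 that maps each row back to a common reference orientation. The intrinsic matrix K and the per-row rotations are taken as already known; in the thesis they are estimated from image correspondences.

        import numpy as np

        def rectify_points(points, row_rotations, K):
            """points: (N, 2) pixel coordinates; row_rotations: row index -> 3x3 R."""
            K_inv = np.linalg.inv(K)
            out = np.empty_like(points, dtype=float)
            for i, (x, y) in enumerate(points):
                R = row_rotations[int(round(y))]  # camera rotation when row y was read
                H = K @ R.T @ K_inv               # undo that rotation for this row
                p = H @ np.array([x, y, 1.0])
                out[i] = p[:2] / p[2]
            return out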

    List of papers
    1. Rectifying rolling shutter video from hand-held devices
    Open this publication in new window or tab >>Rectifying rolling shutter video from hand-held devices
    2010 (English)In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, Los Alamitos, CA, USA: IEEE Computer Society, 2010, p. 507-514Conference paper, Published paper (Other academic)
    Abstract [en]

    This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground-truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.

    Place, publisher, year, edition, pages
    Los Alamitos, CA, USA: IEEE Computer Society, 2010
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-70572 (URN)10.1109/CVPR.2010.5540173 (DOI)978-1-4244-6984-0 (ISBN)
    Conference
    CVPR10, San Francisco, USA, June 13-18, 2010
    Available from: 2011-09-13 Created: 2011-09-13 Last updated: 2015-12-10
    2. Efficient Video Rectification and Stabilisation for Cell-Phones
    Open this publication in new window or tab >>Efficient Video Rectification and Stabilisation for Cell-Phones
    2012 (English)In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 96, no 3, p. 335-352Article in journal (Refereed) Published
    Abstract [en]

    This article presents a method for rectifying and stabilising video from cell-phones with rolling shutter (RS) cameras. Due to size constraints, cell-phone cameras have constant, or near constant focal length, making them an ideal application for calibrated projective geometry. In contrast to previous RS rectification attempts that model distortions in the image plane, we model the 3D rotation of the camera. We parameterise the camera rotation as a continuous curve, with knots distributed across a short frame interval. Curve parameters are found using non-linear least squares over inter-frame correspondences from a KLT tracker. By smoothing a sequence of reference rotations from the estimated curve, we can, at a small extra cost, obtain a high-quality image stabilisation. Using synthetic RS sequences with associated ground-truth, we demonstrate that our rectification improves over two other methods. We also compare our video stabilisation with the methods in iMovie and Deshaker.
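    The stabilisation step, smoothing a sequence of camera rotations to obtain a reference trajectory, can be sketched as follows. This is a hedged illustration using SciPy quaternions with a simple moving average; the window size is an assumed value, and the paper's actual smoothing may differ.

        import numpy as np
        from scipy.spatial.transform import Rotation

        def smooth_rotations(rotations, window=15):
            """rotations: list of scipy Rotation objects, one per frame."""
            quats = np.array([r.as_quat() for r in rotations])
            for i in range(1, len(quats)):        # align signs so averaging is valid
                if np.dot(quats[i], quats[i - 1]) < 0:
                    quats[i] = -quats[i]
            half = window // 2
            smoothed = []
            for i in range(len(quats)):
                q = quats[max(0, i - half):i + half + 1].mean(axis=0)
                smoothed.append(Rotation.from_quat(q / np.linalg.norm(q)))
            # frame i can then be warped by K (R_smooth_i R_i^T) K^-1
            return smoothed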

    Place, publisher, year, edition, pages
    Springer Verlag (Germany), 2012
    Keywords
    Cell-phone, Rolling shutter, CMOS, Video stabilisation
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-75277 (URN)10.1007/s11263-011-0465-8 (DOI)000299769400005 ()
    Note
    Funding Agencies: CENIIT organisation at Linköping Institute of Technology; Swedish Research Council. Available from: 2012-02-27 Created: 2012-02-24 Last updated: 2017-12-07
    3. Scan Rectification for Structured Light Range Sensors with Rolling Shutters
    Open this publication in new window or tab >>Scan Rectification for Structured Light Range Sensors with Rolling Shutters
    2011 (English)In: IEEE International Conference on Computer Vision, Barcelona Spain, 2011, p. 1575-1582Conference paper, Published paper (Other academic)
    Abstract [en]

    Structured light range sensors, such as the Microsoft Kinect, have recently become popular as perception devices for computer vision and robotic systems. These sensors use CMOS imaging chips with electronic rolling shutters (ERS). When using such a sensor on a moving platform, both the image, and the depth map, will exhibit geometric distortions. We introduce an algorithm that can suppress such distortions, by rectifying the 3D point clouds from the range sensor. This is done by first estimating the time continuous 3D camera trajectory, and then transforming the 3D points to where they would have been, if the camera had been stationary. To ensure that image and range data are synchronous, the camera trajectory is computed from KLT tracks on the structured-light frames, after suppressing the structured-light pattern. We evaluate our rectification, by measuring angles between the visible sides of a cube, before and after rectification. We also measure how much better the 3D point clouds can be aligned after rectification. The obtained improvement is also related to the actual rotational velocity, measured using a MEMS gyroscope.

    Place, publisher, year, edition, pages
    Barcelona, Spain, 2011
    Series
    International Conference on Computer Vision (ICCV), ISSN 1550-5499
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-77059 (URN)10.1109/ICCV.2011.6126417 (DOI)978-1-4577-1101-5 (ISBN)
    Conference
    IEEE International Conference on Computer Vision(ICCV11), 8-11 November 2011, Barcelona, Spain
    Available from: 2012-05-07 Created: 2012-05-03 Last updated: 2015-12-10Bibliographically approved
    4. Co-alignment of Aerial Push-broom Strips using Trajectory Smoothness Constraints
    Open this publication in new window or tab >>Co-alignment of Aerial Push-broom Strips using Trajectory Smoothness Constraints
    2010 (English)Conference paper, Published paper (Other academic)
    Abstract [en]

    We study the problem of registering a sequence of scan lines (a strip) from an airborne push-broom imager to another sequence partly covering the same area. Such a registration has to compensate for deformations caused by attitude and speed changes in the aircraft. The registration is challenging, as both strips contain such deformations. Our algorithm estimates the 3D rotation of the camera for each scan line, by parametrising it as a linear spline with a number of knots evenly distributed in one of the strips. The rotations are estimated from correspondences between strips of the same area. Once the rotations are known, they can be compensated for, and each line of pixels can be transformed such that the ground traces of the two strips are registered with respect to each other.

    Place, publisher, year, edition, pages
    Swedish Society for automated image analysis, 2010
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-70706 (URN)
    Conference
    SSBA10, Symposium on Image Analysis 11-12 March, Uppsala
    Available from: 2011-09-15 Created: 2011-09-15 Last updated: 2018-01-12Bibliographically approved
    5. Anisotropic Scattered Data Interpolation for Pushbroom Image Rectification
    Open this publication in new window or tab >>Anisotropic Scattered Data Interpolation for Pushbroom Image Rectification
    Show others...
    2014 (English)In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no 5, p. 2302-2314Article in journal (Refereed) Published
    Abstract [en]

    This article deals with fast and accurate visualization of pushbroom image data from airborne and spaceborne platforms. A pushbroom sensor acquires images in a line-scanning fashion, and this results in scattered input data that needs to be resampled onto a uniform grid for geometrically correct visualization. To this end, we model the anisotropic spatial dependence structure caused by the acquisition process. Several methods for scattered data interpolation are then adapted to handle the induced anisotropic metric and compared for the pushbroom image rectification problem. A trick that exploits the semi-ordered line structure of pushbroom data to improve the computational complexity by several orders of magnitude is also presented.
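    As an illustration of the anisotropic idea (a simplification under assumptions, not the paper's exact formulation), one simple way to realise an anisotropic metric in scattered data interpolation is to rescale the coordinate axes before interpolating, so that along-track and across-track distances are weighted differently:

        import numpy as np
        from scipy.interpolate import griddata

        def anisotropic_resample(xy, values, grid_x, grid_y, across_weight=4.0):
            """xy: (N, 2) scattered sample positions; values: (N,) intensities."""
            scale = np.array([1.0, across_weight])   # stretch the across-track axis
            gx, gy = np.meshgrid(grid_x, grid_y)
            query = np.stack([gx.ravel(), gy.ravel()], axis=1)
            out = griddata(xy * scale, values, query * scale, method='linear')
            return out.reshape(gx.shape)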

    Place, publisher, year, edition, pages
    IEEE, 2014
    Keywords
    pushbroom, rectification, hyperspectral, interpolation, anisotropic, scattered data
    National Category
    Engineering and Technology Electrical Engineering, Electronic Engineering, Information Engineering Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-108105 (URN)10.1109/TIP.2014.2316377 (DOI)000350284400001 ()
    Available from: 2014-06-25 Created: 2014-06-25 Last updated: 2018-09-25Bibliographically approved
    6. A Virtual Tripod for Hand-held Video Stacking on Smartphones
    Open this publication in new window or tab >>A Virtual Tripod for Hand-held Video Stacking on Smartphones
    2014 (English)In: 2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP), IEEE , 2014Conference paper, Published paper (Refereed)
    Abstract [en]

    We propose an algorithm that can capture sharp, low-noise images in low-light conditions on a hand-held smartphone. We make use of the recent ability to acquire bursts of high resolution images on high-end models such as the iPhone5s. Frames are aligned, or stacked, using rolling shutter correction, based on motion estimated from the built-in gyro sensors and image feature tracking. After stacking, the images may be combined, using e.g. averaging to produce a sharp, low-noise photo. We have tested the algorithm on a variety of different scenes, using several different smartphones. We compare our method to denoising, direct stacking, as well as a global-shutter based stacking, with favourable results.
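    The stacking step itself is simple once frames have been aligned. A minimal sketch (assuming 8-bit grayscale frames and precomputed 3x3 warps to the reference frame, which in the paper come from gyro-based rolling shutter correction and feature tracking):

        import cv2
        import numpy as np

        def stack_frames(frames, homographies):
            """frames: list of grayscale images; homographies: warps to frame 0."""
            h, w = frames[0].shape[:2]
            acc = np.zeros((h, w), dtype=np.float64)
            for img, H in zip(frames, homographies):
                acc += cv2.warpPerspective(img, H, (w, h)).astype(np.float64)
            return (acc / len(frames)).astype(np.uint8)   # averaged, low-noise result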

    Place, publisher, year, edition, pages
    IEEE, 2014
    Series
    IEEE International Conference on Computational Photography, ISSN 2164-9774
    National Category
    Engineering and Technology Electrical Engineering, Electronic Engineering, Information Engineering Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-108109 (URN)10.1109/ICCPHOT.2014.6831799 (DOI)000356494100001 ()978-1-4799-5188-8 (ISBN)
    Conference
    IEEE International Conference on Computational Photography (ICCP 2014), May 2-4, 2014, Intel, Santa Clara, USA
    Projects
    VPS
    Available from: 2014-06-25 Created: 2014-06-25 Last updated: 2015-12-10Bibliographically approved
  • 274.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    A Virtual Tripod for Hand-held Video Stacking on Smartphones2014In: 2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP), IEEE , 2014Conference paper (Refereed)
    Abstract [en]

    We propose an algorithm that can capture sharp, low-noise images in low-light conditions on a hand-held smartphone. We make use of the recent ability to acquire bursts of high resolution images on high-end models such as the iPhone5s. Frames are aligned, or stacked, using rolling shutter correction, based on motion estimated from the built-in gyro sensors and image feature tracking. After stacking, the images may be combined, using e.g. averaging to produce a sharp, low-noise photo. We have tested the algorithm on a variety of different scenes, using several different smartphones. We compare our method to denoising, direct stacking, as well as a global-shutter based stacking, with favourable results.

  • 275.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Efficient Video Rectification and Stabilisation for Cell-Phones2012In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 96, no 3, p. 335-352Article in journal (Refereed)
    Abstract [en]

    This article presents a method for rectifying and stabilising video from cell-phones with rolling shutter (RS) cameras. Due to size constraints, cell-phone cameras have constant, or near constant focal length, making them an ideal application for calibrated projective geometry. In contrast to previous RS rectification attempts that model distortions in the image plane, we model the 3D rotation of the camera. We parameterise the camera rotation as a continuous curve, with knots distributed across a short frame interval. Curve parameters are found using non-linear least squares over inter-frame correspondences from a KLT tracker. By smoothing a sequence of reference rotations from the estimated curve, we can, at a small extra cost, obtain a high-quality image stabilisation. Using synthetic RS sequences with associated ground-truth, we demonstrate that our rectification improves over two other methods. We also compare our video stabilisation with the methods in iMovie and Deshaker.

  • 276.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Rectifying rolling shutter video from hand-held devices2011In: Proceedings SSBA´11 Symposium on Image Analysis, 2011Conference paper (Other academic)
    Abstract [en]

    This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground-truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.

  • 277.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Scan Rectification for Structured Light Range Sensors with Rolling Shutters2011In: IEEE International Conference on Computer Vision, Barcelona Spain, 2011, p. 1575-1582Conference paper (Other academic)
    Abstract [en]

    Structured light range sensors, such as the Microsoft Kinect, have recently become popular as perception devices for computer vision and robotic systems. These sensors use CMOS imaging chips with electronic rolling shutters (ERS). When using such a sensor on a moving platform, both the image, and the depth map, will exhibit geometric distortions. We introduce an algorithm that can suppress such distortions, by rectifying the 3D point clouds from the range sensor. This is done by first estimating the time continuous 3D camera trajectory, and then transforming the 3D points to where they would have been, if the camera had been stationary. To ensure that image and range data are synchronous, the camera trajectory is computed from KLT tracks on the structured-light frames, after suppressing the structured-light pattern. We evaluate our rectification, by measuring angles between the visible sides of a cube, before and after rectification. We also measure how much better the 3D point clouds can be aligned after rectification. The obtained improvement is also related to the actual rotational velocity, measured using a MEMS gyroscope.

  • 278.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Friman, Ola
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology. Sick IVP AB, Linköping, Sweden.
    Olsvik Opsahl, Thomas
    Norwegian Defence Research Establishment.
    Vegard Haavardsholm, Trym
    Norwegian Defence Research Establishment.
    Kåsen, Ingebjørg
    Norwegian Defence Research Establishment.
    Anisotropic Scattered Data Interpolation for Pushbroom Image Rectification2014In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no 5, p. 2302-2314Article in journal (Refereed)
    Abstract [en]

    This article deals with fast and accurate visualization of pushbroom image data from airborne and spaceborne platforms. A pushbroom sensor acquires images in a line-scanning fashion, and this results in scattered input data that needs to be resampled onto a uniform grid for geometrically correct visualization. To this end, we model the anisotropic spatial dependence structure caused by the acquisition process. Several methods for scattered data interpolation are then adapted to handle the induced anisotropic metric and compared for the pushbroom image rectification problem. A trick that exploits the semi-ordered line structure of pushbroom data to improve the computational complexity by several orders of magnitude is also presented.

  • 279.
    Ringdahl, Viktor
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Stereo Camera Pose Estimation to Enable Loop Detection2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Visual Simultaneous Localization And Mapping (SLAM) allows for three-dimensional reconstruction from a camera’s output and simultaneous positioning of the camera within the reconstruction. With use cases ranging from autonomous vehicles to augmented reality, the SLAM field has garnered interest both commercially and academically.

    A SLAM system performs odometry as it estimates the camera’s movement through the scene. The incremental estimation of odometry is not error free and exhibits drift over time, with map inconsistencies as a result. Detecting the return to a previously seen place, a loop, means that this new information regarding our position can be incorporated to correct the trajectory retroactively. Loop detection can also facilitate relocalization if the system loses tracking due to e.g. heavy motion blur.

    This thesis proposes an odometric system making use of bundle adjustment within a keyframe-based stereo SLAM application. This system is capable of detecting loops by utilizing the algorithm FAB-MAP. Two aspects of this system are evaluated: the odometry and the capability to relocate. Both of these are evaluated using the EuRoC MAV dataset, with an absolute trajectory RMS error ranging from 0.80 m to 1.70 m for the machine hall sequences.

    The capability to relocate is evaluated using a novel methodology that can be intuitively interpreted. Results are given for different levels of strictness to encompass different use cases. The method makes use of reprojection of points seen in keyframes to define whether a relocalization is possible or not (see the sketch below). The system shows a capability to relocate in up to 85% of all cases when a keyframe exists that can project 90% of its points into the current view. Errors in estimated poses were found to be correlated with the relative distance, with errors less than 10 cm in 23% to 73% of all cases.

    The evaluation of the whole system is augmented with an evaluation of local image descriptors and pose estimation algorithms. The descriptor SIFT was found to perform best overall, but is demanding to compute. BRISK was deemed the best alternative for a fast yet accurate descriptor.

    A conclusion that can be drawn from this thesis is that FAB-MAP works well for detecting loops as long as the addition of keyframes is handled appropriately.
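    A hedged sketch of the reprojection criterion used in the relocalization evaluation (function and variable names are illustrative; only the 90% threshold comes from the description above):

        import numpy as np

        def visible_fraction(points_3d, R, t, K, width, height):
            """points_3d: (N, 3) keyframe points in world coordinates; R, t: pose."""
            cam = points_3d @ R.T + t                  # world -> camera coordinates
            cam = cam[cam[:, 2] > 0]                   # keep points in front of camera
            proj = cam @ K.T
            uv = proj[:, :2] / proj[:, 2:3]
            inside = ((uv[:, 0] >= 0) & (uv[:, 0] < width) &
                      (uv[:, 1] >= 0) & (uv[:, 1] < height))
            return inside.sum() / max(len(points_3d), 1)

        # a keyframe qualifies for relocalization when visible_fraction(...) >= 0.9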

  • 280.
    Ringqvist, Sanna
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Classification of terrain using superpixel segmentation and supervised learning2014Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The usage of 3D-modeling is expanding rapidly. Modeling from aerial imagery has become very popular due to its increasing number of both civilian and military applications like urban planning, navigation and target acquisition.

    This master thesis project was carried out at Vricon Systems at SAAB. The Vricon system produces high resolution geospatial 3D data based on aerial imagery from manned aircraft, unmanned aerial vehicles (UAV) and satellites.

    The aim of this work was to investigate to what degree superpixel segmentation and supervised learning can be applied to a terrain classification problem using imagery and digital surface models (DSM). The aim was also to investigate how the height information from the digital surface model may contribute compared to the information from the grayscale values. The goal was to identify buildings, trees and ground. Another task was to evaluate existing methods, and compare results.

    The approach for solving the stated goal was divided into several parts. The first part was to segment the image using superpixel segmentation; after that, features were extracted. Then the classifiers were created and trained, and finally the classifiers were evaluated (see the sketch below).

    The classification method that obtained the best results in this thesis had approximately 90 % correctly labeled superpixels. The result was equal to, if not better than, other solutions available on the market.
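    The pipeline above can be illustrated with a minimal sketch (an assumption-laden illustration, not the thesis implementation): SLIC superpixels, a few simple per-superpixel features from the grayscale image and the DSM, and a random forest classifier.

        import numpy as np
        from skimage.segmentation import slic
        from sklearn.ensemble import RandomForestClassifier

        def superpixel_features(gray, dsm, n_segments=2000):
            """gray: grayscale ortho image; dsm: height model of the same shape."""
            labels = slic(gray, n_segments=n_segments, compactness=0.1,
                          channel_axis=None)
            feats = [[gray[labels == s].mean(),      # appearance feature
                      dsm[labels == s].mean(),       # absolute height
                      dsm[labels == s].std()]        # height variation
                     for s in np.unique(labels)]
            return labels, np.array(feats)

        # train: clf = RandomForestClassifier().fit(train_feats, train_classes)
        # test:  predictions = clf.predict(test_feats)   # one class per superpixel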

  • 281.
    Robinson, Andreas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Implementation and evaluation of a 3D tracker2014Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Many methods have been developed for visual tracking of generic objects. The vast majority of these assume the world is two-dimensional, either ignoring the third dimension or only dealing with it indirectly. This causes difficulties for the tracker when the target approaches or moves away from the camera, is occluded or moves out of the camera frame.

    Unmanned aerial vehicles (UAVs) are increasingly used in civilian applications and some of these will undoubtedly carry tracking systems in the future. As they move around, these trackers will encounter both scale changes and occlusions. To improve the tracking performance in these cases, the third dimension should be taken into account.

    This thesis extends the capabilities of a 2D tracker to three dimensions, with the assumption that the target moves on a ground plane.

    The position of the tracker camera is established by matching the video it produces to a sparse point-cloud map built with off-the-shelf structure-from-motion software. A target is tracked with a generic 2D tracker and subsequently positioned on the ground. Should the target disappear from view, its motion on the ground is predicted. In combination, these simple techniques are shown to improve the robustness of a tracking system on a moving platform under target scale changes and occlusions.

  • 282.
    Robinson, Andreas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Persson, Mikael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Robust Accurate Extrinsic Calibration of Static Non-overlapping Cameras2017In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 342-353Conference paper (Refereed)
    Abstract [en]

    An increasing number of robots and autonomous vehicles are equipped with multiple cameras to achieve surround-view sensing. The estimation of their relative poses, also known as extrinsic parameter calibration, is a challenging problem, particularly in the non-overlapping case. We present a simple and novel extrinsic calibration method based on standard components that performs favorably compared to existing approaches. We further propose a framework for predicting the performance of different calibration configurations, together with intuitive error metrics. This makes selecting a good camera configuration straightforward. We evaluate on rendered synthetic images and show good results as measured by angular and absolute pose differences, as well as the reprojection error distributions.

  • 283.
    Rundgren, Emil
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Automatic Volume Estimation of Timber from Multi-View Stereo 3D Reconstruction2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The ability to automatically estimate the volume of timber is becoming increasingly important within the timber industry. The large number of timber trucks arriving each day at Swedish timber terminals fortifies the need for a volume estimation performed in real-time and on-the-go as the trucks arrive.

    This thesis investigates if a volumetric integration of disparity maps acquired from a Multi-View Stereo (MVS) system is a suitable approach for automatic volume estimation of timber loads. As real-time execution is preferred, efforts were made to provide a scalable method. The proposed method was quantitatively evaluated on datasets containing two geometric objects of known volume. A qualitative comparison to manual volume estimates of timber loads was also made on datasets recorded at a Swedish timber terminal.

    The proposed method is shown to be both accurate and precise under specific circumstances. However, robustness is poor to varying weather conditions, although a more thorough evaluation of this aspect needs to be performed. The method is also parallelizable, which means that future efforts can be made to significantly decrease execution time.

  • 284.
    Rydholm, Niklas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Panoramic Video Stitching2015Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In this thesis a system for creating panoramic video has been developed. The panoramic video is formed by stitching several camera streams together. The system is designed as a vehicle mounted system, but can be applied to several other areas, such as surveillance. The system creates the video by finding features that correspond in the overlapping frames. By using cylinder projection the problem is reduced to finding a translation between the images, and using algorithms such as ORB, matching features can be detected and described. The camera frames are stitched together by calculating the average translation of the matching features. To reduce artifacts such as ghosting, a simple but effective alpha blending technique has been used. The system has been implemented using C++ and the OpenCV library and the algorithm is capable of processing about 15 frames per second, making it close to real-time. With future improvements, such as parallel processing of the cameras, the system may be sped up even further and possibly include other types of image processing, e.g. object recognition and tracking.
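    The core steps (ORB matching, an average-translation estimate, and linear alpha blending over the overlap) can be sketched as follows. The thesis implementation is in C++; this is a hedged Python illustration of the same idea, assuming grayscale frames of equal height and a fixed overlap width.

        import cv2
        import numpy as np

        def estimate_translation(left, right):
            """Average displacement of ORB matches between two adjacent frames."""
            orb = cv2.ORB_create(1000)
            kp1, des1 = orb.detectAndCompute(left, None)
            kp2, des2 = orb.detectAndCompute(right, None)
            matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
            d = [np.subtract(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]
            return np.mean(d, axis=0)                     # (dx, dy)

        def alpha_blend(left, right, overlap):
            """Linearly blend the last `overlap` columns of left into right."""
            alpha = np.linspace(1.0, 0.0, overlap)[None, :]
            blend = alpha * left[:, -overlap:] + (1 - alpha) * right[:, :overlap]
            return np.hstack([left[:, :-overlap], blend.astype(left.dtype),
                              right[:, overlap:]])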

  • 285.
    Rydström, Daniel
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Calibration of Laser Triangulating Cameras in Small Fields of View2013Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    A laser triangulating camera system projects a laser line onto an object to create height curves on the object surface. By moving the object, height curves from different parts of the object can be observed and combined to produce a three-dimensional representation of the object. The calibration of such a camera system involves transforming received data to get real-world measurements instead of pixel-based measurements.

    The calibration method presented in this thesis focuses specifically on small fields of view. The goal is to provide an easy-to-use and robust calibration method that can complement already existing calibration methods. The tool should get as good measurements in metric units as possible, while still keeping the complexity and production costs of the calibration object low. The implementation uses only data from the laser plane itself, making it usable also in environments where no external light exists.

    The proposed implementation utilises a complete scan of a three-dimensional calibration object and returns a calibration for three dimensions. The results of the calibration have been evaluated against synthetic and real data.

  • 286.
    Sandberg, David
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Model-Based Video Coding Using a Colour and Depth Camera2011Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In this master thesis, a model-based video coding algorithm has been developed that uses input from a colour and depth camera, such as the Microsoft Kinect. Using a model-based representation of a video has several advantages over the commonly used block-based approach, used by the H.264 standard. For example, videos can be rendered in 3D, be viewed from alternative views, and have objects inserted into them for augmented reality and user interaction.

    This master thesis demonstrates a very efficient way of encoding the geometry of a scene. The results of the proposed algorithm show that it can reach very low bitrates with comparable results to the H.264 standard.

  • 287.
    Sandberg, David
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ogniewski, Jens
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology.
    Model-Based Video Coding using Colour and Depth Cameras2011In: Digital Image Computing: Techniques and Applications (DICTA11), IEEE , 2011, p. 158-163Conference paper (Other academic)
    Abstract [en]

    In this paper, we present a model-based video coding method that uses input from colour and depth cameras, such as the Microsoft Kinect. The model-based approach uses a 3D representation of the scene, enabling several other applications besides video playback. Some of these applications are stereoscopic viewing, object insertion for augmented reality and free viewpoint viewing. The video encoding step uses computer vision to estimate the camera motion. The scene geometry is represented by keyframes, which are encoded as 3D quads using a quadtree, allowing good compression rates. Camera motion in-between keyframes is approximated to be linear. The relative camera positions at keyframes and the scene geometry are then compressed and transmitted to the decoder. Our experiments demonstrate that the model-based approach delivers a high level of detail at competitively low bitrates.

  • 288.
    Sandsveden, Daniel
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Evaluation of Random Forests for Detection and Localization of Cattle Eyes2015Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In a time when cattle herds grow continually larger the need for automatic methods to detect diseases is ever increasing. One possible method to discover diseases is to use thermal images and automatic head and eye detectors. In this thesis an eye detector and a head detector are implemented using the Random Forests classifier. During the implementation the classifier is evaluated using three different descriptors: Histogram of Oriented Gradients, Local Binary Patterns, and a descriptor based on pixel differences. An alternative classifier, the Support Vector Machine, is also evaluated for comparison against Random Forests.

    The thesis results show that Histogram of Oriented Gradients performs well as a description of cattle heads, while Local Binary Patterns performs well as a description of cattle eyes. The provided descriptor performs almost equally well in both cases. The results also show that Random Forests performs approximately as good as the Support Vector Machine, when the Support Vector Machine is paired with Local Binary Patterns for both heads and eyes.

    Finally the thesis results indicate that it is easier to detect and locate cattle heads than it is to detect and locate cattle eyes. For eyes, combining a head detector and an eye detector is shown to give a better result than only using an eye detector. In this combination heads are first detected in images, followed by using the eye detector in areas classified as heads.

  • 289.
    Schmiterlöw, Maria
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Autonomous Path Following Using Convolutional Networks2012Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Autonomous vehicles have many application possibilities within many different fields like rescue missions, exploring foreign environments or unmanned vehicles etc. For such a system to navigate in a safe manner, high requirements of reliability and security must be fulfilled.

    This master's thesis explores the possibility to use the machine learning algorithm convolutional network on a robotic platform for autonomous path following. The only input to predict the steering signal is a monochromatic image taken by a camera mounted on the robotic car pointing in the steering direction. The convolutional network will learn from demonstrations in a supervised manner.

    In this thesis three different preprocessing options are evaluated. The evaluation is based on the quadratic error and the number of correctly predicted classes. The results show that the convolutional network has no problem learning a correct behaviour and scores good results when evaluated on data similar to what it has been trained on. The results also show that the preprocessing options are not enough to ensure that the system is environment independent.

  • 290.
    Sjöholm, Alexander
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Closing the Loop: Mobile Visual Location Recognition2014Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Visual simultaneous localization and mapping (SLAM) as a field has been researched for ten years, but with recent advances in mobile performance visual SLAM is entering the consumer market in a completely new way. A visual SLAM system will however be sensitive to incautious use that may result in severe motion, occlusion or poor surroundings in terms of visual features, which will cause the system to temporarily fail. The procedure of recovering from such a failure is called relocalization. Together with two similar problems, localization (finding your position in an existing SLAM session) and loop closing (the online reparation and perfection of the map in an active SLAM session), these can be grouped as visual location recognition (VLR).

    This thesis presents novel results by combining the scalability of FabMap and the precision of 13th Lab's tracking, yielding high-precision VLR, +/- 10 cm, while maintaining above 99 % precision and 60 % recall for sessions containing thousands of images. Everything runs purely on a normal mobile phone.

    The applications of VLR are many. Indoors, where GPS is not functioning, VLR can still provide positional information and navigate you through big complexes like airports and museums. Outdoors, VLR can improve the precision of GPS tenfold yielding a new level of navigational experience. Virtual and augmented reality applications are other areas that benefit from improved positioning and localization.

  • 291.
    Sjölund, Jonathan
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Detection of Frozen Video Subtitles Using Machine Learning2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    When subtitles are burned into a video, an error can sometimes occur in the encoder that results in the same subtitle being burned into several frames, resulting in subtitles becoming frozen. This thesis provides a way to detect frozen video subtitles with the help of an implemented text detector and classifier.

    Two types of classifiers, naïve classifiers and machine learning classifiers, are tested and compared on a variety of different videos to see how much a machine learning approach can improve the performance. The naïve classifiers are evaluated using ground truth data to gain an understanding of the importance of good text detection. To understand the difficulty of the problem, two different machine learning classifiers are tested, logistic regression and random forests.

    The result shows that machine learning improves the performance over using naïve classifiers by improving the specificity from approximately 87.3% to 95.8% and improving the accuracy from 93.3% to 95.5%. Random forests achieve the best overall performance, but the difference compared to when using logistic regression is small enough that more computationally complex machine learning classifiers are not necessary. Using the ground truth shows that the weaker naïve classifiers would be improved by at least 4.2% accuracy, thus a better text detector is warranted. This thesis shows that machine learning is a viable option for detecting frozen video subtitles.

  • 292.
    Stacke, Karin
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Automatic Brain Segmentation into Substructures Using Quantitative MRI2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Segmentation of the brain into sub-volumes has many clinical applications. Many neurological diseases are connected with brain atrophy (tissue loss). By dividing the brain into smaller compartments, volume comparisons between the compartments can be made, as well as monitoring of local volume changes over time. The former is especially interesting for the left and right cerebral hemispheres, due to their symmetric appearance. By using automatic segmentation, the time-consuming step of manually labelling the brain is removed, allowing for larger scale research.

    In this thesis, three automatic methods for segmenting the brain from magnetic resonance (MR) images are implemented and evaluated. Since none of the evaluated methods resulted in sufficiently good segmentations to be clinically relevant, a novel segmentation method, called SB-GC (shape bottleneck detection incorporated in graph cuts), is also presented. SB-GC utilizes quantitative MRI data as input, together with shape bottleneck detection and graph cuts, to segment the brain into the left and right cerebral hemispheres, the cerebellum and the brain stem. SB-GC shows promise of highly accurate and repeatable results for both healthy, adult brains and more challenging cases such as children and brains containing pathologies.

  • 293.
    Stein, Madeleine
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving Image Based Fruitcount Estimates Using Multiple View-Points2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    This master's thesis presents an approach to track and count the number of fruit in commercial mango orchards. The algorithm is intended to enable precision agriculture and to facilitate labour and post-harvest storage planning. The primary objective is to develop a multi-view algorithm and investigate how it can be used to mitigate the effects of visual occlusion, to improve upon estimates from methods that use a single central or two opposite viewpoints. Fruit are detected in images by using two classification methods: dense pixel-wise CNN and region-based R-CNN detection. Pair-wise fruit correspondences are established between images by using geometry provided by navigation data, and lidar data is used to generate image masks for each separate tree, to isolate fruit counts to individual trees. The tracked fruit are triangulated to locate them in 3D space, and spatial statistics are calculated over whole orchard blocks. The estimated tree counts are compared to single-view estimates and validated against ground truth data of 16 mango trees from a Bundaberg mango orchard in Queensland, Australia. The results show a high R2-value of 0.99335 for four hand-labelled trees and a highest R2-value of 0.9165 for the machine-labelled images using the R-CNN classifier for the 16 target trees.

  • 294.
    Stenhagen, Petter
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving Realism in Synthetic Barcode Images using Generative Adversarial Networks2018Independent thesis Advanced level (degree of Master (Two Years)), 300 HE creditsStudent thesis
    Abstract [en]

    This master thesis explores the possibility of using Generative Adversarial Networks (GANs) to refine labeled synthetic code images to resemble real code images while preserving label information. The GAN used in this thesis consists of a refiner and a discriminator. The discriminator tries to distinguish between real images and refined synthetic images. The refiner tries to fool the discriminator by producing refined synthetic images such that the discriminator classifies them as real. By updating these two networks iteratively, the idea is that they will push each other to get better, resulting in refined synthetic images with real image characteristics (a minimal sketch of this training loop follows the abstract).

    The aspiration, if the exploration of GANs turns out successful, is to be able to use refined synthetic images as training data in Semantic Segmentation (SS) tasks and thereby eliminate the laborious task of gathering and labeling real data. Starting off from a foundational GAN-model, different network architectures, hyperparameters and other design choices are explored to find the best performing GAN-model.

    As is widely acknowledged in the relevant literature, GANs can be difficult to train and the results in this thesis are varying and sometimes ambiguous. Based on the results from this study, the best performing models do, however, perform better in SS tasks than the unrefined synthetic set they are based on and benchmarked against, with regard to Intersection over Union.
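    The refiner/discriminator game described above can be sketched compactly in PyTorch. Everything here (network sizes, losses, the self-regularisation weight that preserves label information) is an illustrative assumption in the spirit of SimGAN-style refinement, not the thesis configuration.

        import torch
        import torch.nn as nn

        refiner = nn.Sequential(                  # synthetic image -> refined image
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))
        disc = nn.Sequential(                     # image -> real/refined logit
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, stride=2, padding=1),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        opt_r = torch.optim.Adam(refiner.parameters(), lr=1e-4)
        opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
        bce = nn.BCEWithLogitsLoss()

        def train_step(synthetic, real, reg_weight=0.1):
            # Refiner update: fool the discriminator while staying close to the
            # synthetic input (the L1 term is what preserves label information).
            refined = refiner(synthetic)
            loss_r = (bce(disc(refined), torch.ones(len(synthetic), 1))
                      + reg_weight * (refined - synthetic).abs().mean())
            opt_r.zero_grad(); loss_r.backward(); opt_r.step()
            # Discriminator update: separate real images from refined ones.
            logits = torch.cat([disc(real), disc(refiner(synthetic).detach())])
            labels = torch.cat([torch.ones(len(real), 1),
                                torch.zeros(len(synthetic), 1)])
            loss_d = bce(logits, labels)
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()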

  • 295.
    Stigson, Magnus
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Object Tracking Using Tracking-Learning-Detection inThermal Infrared Video2013Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Automatic tracking of an object of interest in a video sequence is a task that has been much researched. Difficulties include varying scale of the object, rotation and object appearance changing over time, thus leading to tracking failures. Different tracking methods, such as short-term tracking often fail if the object steps out of the camera’s field of view, or changes shape rapidly. Also, small inaccuracies in the tracking method can accumulate over time, which can lead to tracking drift. Long-term tracking is also problematic, partly due to updating and degradation of the object model, leading to incorrectly classified and tracked objects.

    This master’s thesis implements a long-term tracking framework called Tracking-Learning-Detection which can learn and adapt, using so called P/N-learning, to changing object appearance over time, thus making it more robust to tracking failures. The framework consists of three parts; a tracking module which follows the object from frame to frame, a learning module that learns new appearances of the object, and a detection module which can detect learned appearances of the object and correct the tracking module if necessary.

    This tracking framework is evaluated on thermal infrared videos and the results are compared to the results obtained from videos captured within the visible spectrum. Several important differences between visual and thermal infrared tracking are presented, and the effect these have on the tracking performance is evaluated.

    In conclusion, the results are analyzed to evaluate which differences matter the most and how they affect tracking, and a number of different ways to improve the tracking are proposed.

  • 296.
    Strömberg, Isak
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Characterization of creping marks in paper2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The cost and environmental damage of reclaims is a large problem within the paper industry. With certain types of paper, so-called crepe marks on the paper’s surface are a common issue, leading to printing defects and consequently reclaims. This thesis compares four different image analysis methods for evaluating crepe marks and predicting printing results. The methods evaluated consist of one established method, two adaptations of established methods and one novel method. All methods were evaluated on the same data: topographic height images of paper samples from 4 paper rolls of similar type but differing in roughness. The method based on 1D Fourier analysis and the method based on fully convolutional networks perform best, depending on whether speed or detailed characterisation is the priority.
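    As an illustration of the 1D Fourier approach (a hedged sketch; the sampling step and analysis details are assumptions, not the thesis method): crepe marks are roughly periodic, so the dominant spatial frequency of the row profiles of a topographic height image characterises them.

        import numpy as np

        def crepe_frequency(height_img, pixel_size_mm=0.02):
            """height_img: 2D topographic height map; returns crepe marks per mm."""
            rows = height_img - height_img.mean(axis=1, keepdims=True)
            spectrum = np.abs(np.fft.rfft(rows, axis=1)) ** 2
            mean_spec = spectrum.mean(axis=0)          # average over all rows
            mean_spec[0] = 0.0                         # ignore the DC component
            freqs = np.fft.rfftfreq(height_img.shape[1], d=pixel_size_mm)
            return freqs[np.argmax(mean_spec)]         # dominant frequency [1/mm]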

  • 297.
    Strömgren, Oliver
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Deep Learning for Autonomous Collision Avoidance2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Deep learning has been rapidly growing in recent years, obtaining excellent results for many computer vision applications, such as image classification and object detection. One reason for the increased popularity of deep learning is that it mitigates the need for hand-crafted features. This thesis work investigates deep learning as a methodology to solve the problem of autonomous collision avoidance for a small robotic car. To accomplish this, transfer learning is used with the VGG16 deep network pre-trained on the ImageNet dataset. A dataset has been collected and then used to fine-tune and validate the network offline. The deep network has been used with the robotic car in a real-time manner. The robotic car sends images to an external computer, which is used for running the network. The predictions from the network are sent back to the robotic car, which takes actions based on those predictions. The results show that deep learning has great potential in solving the collision avoidance problem.
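    The transfer-learning setup described above can be sketched in a few lines of PyTorch. The number and meaning of the output actions are hypothetical; the abstract does not specify the head configuration.

        import torch.nn as nn
        from torchvision import models

        model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        for p in model.features.parameters():
            p.requires_grad = False                # freeze the pre-trained features
        model.classifier[6] = nn.Linear(4096, 3)   # e.g. left / forward / right
        # fine-tune: torch.optim.Adam(model.classifier.parameters(), lr=1e-4)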

  • 298.
    Stynsberg, John
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Incorporating Scene Depth in Discriminative Correlation Filters for Visual Tracking2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Visual tracking is a computer vision problem where the task is to follow a target through a video sequence. Tracking has many important real-world applications in several fields such as autonomous vehicles and robot vision. Since visual tracking does not assume any prior knowledge about the target, it faces different challenges such as occlusion, appearance change, background clutter and scale change. In this thesis we try to improve the capabilities of tracking frameworks using discriminative correlation filters by incorporating scene depth information. We utilize scene depth information on three main levels. First, we use raw depth information to segment the target from its surroundings, enabling occlusion detection and scale estimation. Second, we investigate different visual features calculated from depth data to decide which features are good at encoding geometric information available solely in depth data. Third, we investigate handling missing data in the depth maps using a modified version of the normalized convolution framework. Finally, we introduce a novel approach for parameter search using genetic algorithms to find the best hyperparameters for our tracking framework. Experiments show that depth data can be used to estimate scale changes and handle occlusions. In addition, visual features calculated from depth are more representative if they are combined with color features. It is also shown that utilizing normalized convolution improves the overall performance in some cases. Lastly, the usage of genetic algorithms for hyperparameter search leads to accuracy gains as well as some insights on the performance of different components within the framework.
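    Two of the depth cues mentioned above lend themselves to a small sketch (all names and thresholds are illustrative assumptions): take the median depth inside the target box as the target depth, flag occlusion when the depth jumps towards the camera, and derive a scale factor from the depth ratio between frames.

        import numpy as np

        def depth_cues(depth_map, box, prev_depth, jump_thresh=0.3):
            """box: (x, y, w, h); depth_map in metres; prev_depth: last target depth."""
            x, y, w, h = box
            d = np.nanmedian(depth_map[y:y + h, x:x + w])   # robust to missing pixels
            occluded = d < prev_depth * (1 - jump_thresh)   # something moved in front
            scale = prev_depth / d if not occluded else 1.0 # closer -> appears larger
            return d, occluded, scale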

  • 299.
    Sundelius, Carl
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Deep Fusion of Imaging Modalities for Semantic Segmentation of Satellite Imagery2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In this report I summarize my master’s thesis work, in which I have investigated different approaches for fusing imaging modalities for semantic segmentation with deep convolutional networks. State-of-the-art methods for semantic segmentation of RGB-images use pre-trained models, which are fine-tuned to learn task-specific deep features. However, the use of pre-trained model weights constrains the model input to images with three channels (e.g. RGB-images). In some applications, e.g. classification of satellite imagery, there are other imaging modalities that can complement the information from the RGB modality and, thus, improve the performance of the classification. In this thesis, semantic segmentation methods designed for RGB images are extended to handle multiple imaging modalities without compromising the benefits that pre-training on RGB datasets offers.

    In the experiments of this thesis, RGB images from satellites have been fused with a normalised difference vegetation index (NDVI) and a digital surface model (DSM). The evaluation shows that the modality fusion can significantly improve the performance of semantic segmentation networks in comparison with a corresponding network with only RGB input. However, the different investigated approaches to fuse the modalities proved to achieve similar performance. The conclusion of the experiments is that the fusion of imaging modalities is necessary, but the method of fusion has shown to be of less importance.
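    One hedged way to extend a pre-trained network to extra modalities without losing its RGB pre-training (an illustration of early fusion, not necessarily the thesis approach) is to widen the first convolution to five input channels (RGB + NDVI + DSM) and initialise the extra channels from the pre-trained RGB filters:

        import torch
        import torch.nn as nn
        from torchvision import models

        net = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        old = net.conv1                            # Conv2d(3, 64, 7, stride=2, ...)
        new = nn.Conv2d(5, old.out_channels, kernel_size=7, stride=2,
                        padding=3, bias=False)
        with torch.no_grad():
            new.weight[:, :3] = old.weight         # keep the pre-trained RGB filters
            new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)  # NDVI, DSM
        net.conv1 = new                            # the network now takes 5 channels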

  • 300.
    Svensk, Joakim
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Evaluation of Aerial Image Stereo Matching Methods for Forest Variable Estimation2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    This work investigates the landscape of aerial image stereo matching (AISM) methods suitable for large scale forest variable estimation. AISM methods are an important source of remotely collected information used in modern forestry to keep track of a growing forest's condition.

    A total of 17 AISM methods are investigated, out of which 4 are evaluated by processing a test data set consisting of three aerial images. The test area is located in southern Sweden, consisting mainly of Norway Spruce and Scots Pine. From the resulting point clouds and height raster images, a total of 30 different metrics of both height and density types are derived. Linear regression is used to fit functions from metrics derived from AISM data to a set of forest variables, including tree height (HBW), tree diameter (DBW), basal area and volume (see the sketch below). As ground truth, data collected by dense airborne laser scanning is used. Results are presented as the RMSE and standard deviation obtained from the linear regression.

    For tree height, tree diameter, basal area and volume, the RMSE ranged from 7.442% to 10.11%, 11.58% to 13.96%, 32.01% to 35.10%, and 34.01% to 38.26%, respectively. The results show that all four tested methods achieved comparable estimation quality, although with small differences among them. Keystone and SURE performed somewhat better, while MicMac placed third and Photoscan achieved the least accurate result.
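    The regression-based evaluation can be sketched as follows (a minimal illustration; the relative-RMSE convention and the in-sample fit are assumptions):

        import numpy as np
        from sklearn.linear_model import LinearRegression

        def relative_rmse(metrics, ground_truth):
            """metrics: (N, M) height/density metrics; ground_truth: (N,) variable."""
            model = LinearRegression().fit(metrics, ground_truth)
            pred = model.predict(metrics)
            rmse = np.sqrt(np.mean((pred - ground_truth) ** 2))
            return 100.0 * rmse / ground_truth.mean()   # RMSE in % of the mean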
