Publications (10 of 81)
Järemo Lawin, F., Danelljan, M., Khan, F. S., Forssén, P.-E. & Felsberg, M. (2018). Density Adaptive Point Set Registration. Paper presented at The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, United States, 18-22 June, 2018.
Density Adaptive Point Set Registration
2018 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Probabilistic methods for point set registration have demonstrated competitive results in recent years. These techniques estimate a probability distribution model of the point clouds. While such a representation has shown promise, it is highly sensitive to variations in the density of 3D points. This fundamental problem is primarily caused by changes in the sensor location across point sets. We revisit the foundations of the probabilistic registration paradigm. Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes. Both the probabilistic model of the scene and the registration parameters are inferred by minimizing the Kullback-Leibler divergence in an Expectation Maximization based framework. Our density-adaptive registration successfully handles severe density variations commonly encountered in terrestrial Lidar applications. We perform extensive experiments on several challenging real-world Lidar datasets. The results demonstrate that our approach outperforms state-of-the-art probabilistic methods for multi-view registration, without the need for re-sampling.

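A minimal sketch of the kind of closed-form rigid update that an EM-based registration loop alternates with its E-step. This is not the paper's density-adaptive model; it is a generic weighted Kabsch-style M-step, assuming soft correspondence weights W have already been computed, and all names below are illustrative.

```python
import numpy as np

def weighted_rigid_update(X, Y, W):
    """Closed-form rigid update (R, t) minimising sum_ij W[i,j] * ||R @ X[i] + t - Y[j]||^2.

    X: (N, 3) source points, Y: (M, 3) target points,
    W: (N, M) non-negative soft correspondence weights (e.g. EM responsibilities).
    """
    w_x = W.sum(axis=1)                  # total weight per source point
    w_y = W.sum(axis=0)                  # total weight per target point
    w_tot = W.sum()
    mu_x = (w_x @ X) / w_tot             # weighted centroid of the source set
    mu_y = (w_y @ Y) / w_tot             # weighted centroid of the target set
    Xc, Yc = X - mu_x, Y - mu_y
    H = Xc.T @ W @ Yc                    # 3x3 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = mu_y - R @ mu_x
    return R, t
```
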
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-149774 (URN)
Conference
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, United States, 18-22 June, 2018
Available from: 2018-07-18 Created: 2018-07-18 Last updated: 2018-08-15. Bibliographically approved
Ovrén, H. & Forssén, P.-E. (2018). Spline Error Weighting for Robust Visual-Inertial Fusion. Paper presented at The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2018, Salt Lake City, USA.
Spline Error Weighting for Robust Visual-Inertial Fusion
2018 (English) Conference paper, Oral presentation only (Refereed)
Abstract [en]

In this paper we derive and test a probability-based weighting that can balance residuals of different types in spline fitting. In contrast to previous formulations, the proposed spline error weighting scheme also incorporates a prediction of the approximation error of the spline fit. We demonstrate the effectiveness of the prediction in a synthetic experiment, and apply it to visual-inertial fusion on rolling shutter cameras. This results in a method that can estimate 3D structure with metric scale on generic first-person videos. We also propose a quality measure for spline fitting that can be used to automatically select the knot spacing. Experiments verify that the obtained trajectory quality corresponds well with the requested quality. Finally, by linearly scaling the weights, we show that the proposed spline error weighting minimizes the estimation errors on real sequences, in terms of scale and end-point errors.

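As a rough illustration of the weighting idea only (not the paper's derivation), the sketch below combines a sensor noise level with a predicted spline approximation error into per-residual weights, so that residuals of different types become comparable in a single least-squares cost. All values and names are made up.

```python
import numpy as np

def residual_weights(sigma_noise, sigma_approx):
    """Weights from combined sensor noise and predicted spline approximation error.

    Each residual is divided by its total predicted standard deviation, which
    makes residuals measured in different units comparable.
    """
    return 1.0 / np.sqrt(sigma_noise**2 + sigma_approx**2)

# Hypothetical example: image-point residuals (pixels) vs. gyro residuals (rad/s).
w_im = residual_weights(sigma_noise=0.5, sigma_approx=np.array([0.1, 0.3, 0.8]))
w_gyro = residual_weights(sigma_noise=0.01, sigma_approx=np.array([0.002, 0.004]))

def joint_cost(r_im, r_gyro):
    # Weighted residuals of both types contribute on an equal footing.
    return np.sum((w_im * r_im) ** 2) + np.sum((w_gyro * r_gyro) ** 2)
```
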
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:liu:diva-149495 (URN)
Conference
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2018, Salt Lake City, USA
Funder
Swedish Research Council, 2014-5928; Swedish Research Council, 2014-6227
Available from: 2018-07-03 Created: 2018-07-03 Last updated: 2018-08-02. Bibliographically approved
Wallenberg, M. & Forssén, P.-E. (2017). Attentional Masking for Pre-trained Deep Networks. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS17). Paper presented at The 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017), September 24–28, Vancouver, Canada. Institute of Electrical and Electronics Engineers (IEEE)
Attentional Masking for Pre-trained Deep Networks
2017 (English) In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS17), Institute of Electrical and Electronics Engineers (IEEE), 2017. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to direct visual attention is a fundamental skill for seeing robots. Attention comes in two flavours: the gaze direction (overt attention) and attention to a specific part of the current field of view (covert attention), of which the latter is the focus of the present study. Specifically, we study the effects of attentional masking within pre-trained deep neural networks for the purpose of handling ambiguous scenes containing multiple objects. We investigate several variants of attentional masking on partially pre-trained deep neural networks and evaluate the effects on classification performance and sensitivity to attention mask errors in multi-object scenes. We find that a combined scheme consisting of multi-level masking and blending provides the best trade-off between classification accuracy and insensitivity to masking errors. This proposed approach is denoted multilayer continuous-valued convolutional feature masking (MC-CFM). For reasonably accurate masks it can suppress the influence of distracting objects and reach comparable classification performance to unmasked recognition in cases without distractors.

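A minimal sketch of the masking-and-blending idea on a single convolutional feature map, using NumPy. The actual MC-CFM scheme applies (resized) masks at several layers of a pre-trained network; the function and parameter names here are hypothetical.

```python
import numpy as np

def blend_mask_features(feat, mask, alpha):
    """Blend a continuous-valued spatial attention mask into a feature map.

    feat: (C, H, W) activations at one layer,
    mask: (H, W) values in [0, 1] covering the attended object,
    alpha: blending factor; 1.0 is hard masking, 0.0 leaves feat unchanged.
    """
    m = alpha * mask + (1.0 - alpha)   # blended mask never fully zeroes a location
    return feat * m[None, :, :]        # broadcast over channels

# Toy example: attend to a central region of an 8x8 feature layer.
feat = np.random.rand(64, 8, 8)
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
masked = blend_mask_features(feat, mask, alpha=0.7)
```
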
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Systems
Identifiers
urn:nbn:se:liu:diva-142061 (URN); 10.1109/IROS.2017.8206516 (DOI); 000426978205110 (); 978-1-5386-2682-5 (ISBN)
Conference
The 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017), September 24–28, Vancouver, Canada
Note

Funding agencies: Swedish Research Council [2014-5928]; Linköping University

Available from: 2017-10-20 Created: 2017-10-20 Last updated: 2018-04-11. Bibliographically approved
Eilertsen, G., Forssén, P.-E. & Unger, J. (2017). BriefMatch: Dense binary feature matching for real-time optical flow estimation. In: Puneet Sharma, Filippo Maria Bianchi (Eds.), Proceedings of the Scandinavian Conference on Image Analysis (SCIA17). Paper presented at Scandinavian Conference on Image Analysis (SCIA17) (pp. 221-233), 10269
BriefMatch: Dense binary feature matching for real-time optical flow estimation
2017 (English) In: Proceedings of the Scandinavian Conference on Image Analysis (SCIA17) / [ed] Puneet Sharma, Filippo Maria Bianchi, 2017, Vol. 10269, p. 221-233. Conference paper, Published paper (Refereed)
Abstract [en]

Research in optical flow estimation has to a large extent focused on achieving the best possible quality with no regard to running time. Nevertheless, in a number of important applications speed is crucial. To address this problem we present BriefMatch, a real-time optical flow method that is suitable for live applications. The method combines binary features with the search strategy from PatchMatch in order to efficiently find a dense correspondence field between images. We show that the BRIEF descriptor provides better candidates (less outlier-prone) in shorter time, when compared to direct pixel comparisons and the Census transform. This allows us to achieve high quality results from a simple filtering of the initially matched candidates. Currently, BriefMatch has the fastest running time on the Middlebury benchmark, while ranking highest among all methods that run in less than 0.5 seconds.

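A toy sketch of the binary-comparison descriptor and its Hamming matching cost. The actual BriefMatch method combines such descriptors with PatchMatch-style propagation to obtain a dense correspondence field, which is omitted here; the sampling pattern and sizes below are arbitrary.

```python
import numpy as np

def brief_descriptor(patch, pairs):
    """Binary descriptor from pairwise intensity comparisons inside a patch.

    patch: (S, S) grayscale patch, pairs: (K, 4) sample-point pairs (y1, x1, y2, x2).
    Returns a boolean vector of K bits.
    """
    y1, x1, y2, x2 = pairs.T
    return patch[y1, x1] < patch[y2, x2]

def hamming(d1, d2):
    """Hamming distance between two boolean descriptors (lower is better)."""
    return np.count_nonzero(d1 != d2)

rng = np.random.default_rng(0)
pairs = rng.integers(0, 16, size=(256, 4))   # random 256-bit test pattern for 16x16 patches
patch_a = rng.random((16, 16))
patch_b = patch_a + 0.01 * rng.standard_normal((16, 16))
d_a, d_b = brief_descriptor(patch_a, pairs), brief_descriptor(patch_b, pairs)
print(hamming(d_a, d_b))                     # small distance for similar patches
```
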
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
computer vision, optical flow, feature matching, real-time computation
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-149418 (URN); 10.1007/978-3-319-59126-1_19 (DOI); 978-3-319-59125-4 (ISBN)
Conference
Scandinavian Conference on Image Analysis (SCIA17)
Available from: 2018-06-28 Created: 2018-06-28 Last updated: 2018-06-28
Ogniewski, J. & Forssén, P.-E. (2017). Pushing the Limits for View Prediction in Video Coding. In: 12th International Conference on Computer Vision Theory and Applications (VISAPP’17). Paper presented at 12th International Conference on Computer Vision Theory and Applications (VISAPP'17), 27 February-1 March, Porto, Portugal. Scitepress Digital Library
Pushing the Limits for View Prediction in Video Coding
2017 (English) In: 12th International Conference on Computer Vision Theory and Applications (VISAPP’17), Scitepress Digital Library, 2017. Conference paper, Published paper (Refereed)

Place, publisher, year, edition, pages
Scitepress Digital Library, 2017
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Engineering
Identifiers
urn:nbn:se:liu:diva-142063 (URN)
Conference
12th International Conference on Computer Vision Theory and Applications (VISAPP'17), 27 February-1 March, Porto, Portugal
Available from: 2017-10-20 Created: 2017-10-20 Last updated: 2018-01-13. Bibliographically approved
Ogniewski, J. & Forssén, P.-E. (2017). What is the best depth-map compression for Depth Image Based Rendering? In: Michael Felsberg, Anders Heyden and Norbert Krüger (Eds.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24 (pp. 403-415). Springer, 10425
Open this publication in new window or tab >>What is the best depth-map compression for Depth Image Based Rendering?
2017 (English) In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 403-415. Conference paper, Published paper (Refereed)
Abstract [en]

Many of the latest smartphones and tablets come with integrated depth sensors that make depth-maps freely available, thus enabling new forms of applications such as rendering from different viewpoints. However, efficient compression exploiting the characteristics of depth-maps as well as the requirements of these new applications is still an open issue. In this paper, we evaluate different depth-map compression algorithms, with a focus on tree-based methods and view projection as the application.

The contributions of this paper are the following: 1. extensions of existing geometric compression trees, 2. a comparison of a number of different trees, 3. a comparison of them to a state-of-the-art video coder, 4. an evaluation using ground-truth data that considers both depth-maps and predicted frames with arbitrary camera translation and rotation.

Despite our best efforts, and contrary to earlier results, current video depth-map compression outperforms tree-based methods in most cases. The reason for this is likely that previous evaluations focused on low-quality, low-resolution depth maps, while high-resolution depth (as needed in the DIBR setting) has been ignored up until now. We also demonstrate that PSNR on depth-maps is not always a good measure of their utility.

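For reference, the PSNR measure mentioned above is straightforward to compute on depth-maps. The sketch below is a generic implementation under the stated conventions, not the paper's evaluation pipeline.

```python
import numpy as np

def depth_psnr(depth_ref, depth_test, max_depth):
    """PSNR between a reference and a decoded depth-map (higher is better).

    As noted above, a high depth-map PSNR does not guarantee good projected
    views: small errors at depth discontinuities can still cause large
    reprojection errors.
    """
    diff = depth_ref.astype(np.float64) - depth_test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return np.inf
    return 10.0 * np.log10(max_depth ** 2 / mse)
```
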
Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10425
Keywords
Depth map compression; Quadtree; Triangle tree; 3DVC; View projection
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Systems
Identifiers
urn:nbn:se:liu:diva-142064 (URN); 10.1007/978-3-319-64698-5_34 (DOI); 000432084600034 (); 2-s2.0-85028463006 (Scopus ID); 9783319646978 (ISBN); 9783319646985 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24
Funder
Swedish Research Council, 2014-5928
Note

VR Project: Learnable Camera Motion Models, 2014-5928

Available from: 2017-10-20 Created: 2017-10-20 Last updated: 2018-06-01. Bibliographically approved
Ovrén, H. & Forssén, P.-E. (2015). Gyroscope-based video stabilisation with auto-calibration. In: 2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA). Paper presented at 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26-30 May, 2015 (pp. 2090-2097).
Gyroscope-based video stabilisation with auto-calibration
2015 (English) In: 2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2015, p. 2090-2097. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a technique for joint calibration of a wide-angle rolling shutter camera (e.g. a GoPro) and an externally mounted gyroscope. The calibrated parameters are time scaling and offset, relative pose between gyroscope and camera, and gyroscope bias. The parameters are found by non-linear least squares minimisation, using the symmetric transfer error as cost function. The primary contributions are methods for robust initialisation of the relative pose and time offset, which are essential for convergence. We also introduce a robust error norm to handle outliers. This results in a technique that works with general video content and does not require any specific setup or calibration patterns. We apply our method to stabilisation of videos recorded by a rolling shutter camera, with a rigidly attached gyroscope. After recording, the gyroscope and camera are jointly calibrated using the recorded video itself. The recorded video can then be stabilised using the calibrated parameters. We evaluate the technique on video sequences with varying difficulty and motion frequency content. The experiments demonstrate that our method can be used to produce high quality stabilised videos even under difficult conditions, and that the proposed initialisation ends up within the basin of attraction. We also show that a residual based on the symmetric transfer error is more accurate than residuals based on the recently proposed epipolar plane normal coplanarity constraint.

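A simplified sketch of how the calibrated parameters enter the model: a time scaling and offset map gyro timestamps to camera time, the bias is subtracted from each sample, and the relative pose rotates angular velocities into the camera frame before integration. This is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np

def so3_exp(w):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def integrate_gyro(t_gyro, omega, bias, time_scale, time_offset, R_cg):
    """Integrate bias-corrected gyro samples into camera-frame rotations.

    t_gyro: (N,) gyro timestamps, omega: (N, 3) angular velocities,
    bias: (3,) gyro bias, time_scale/time_offset: gyro-to-camera time mapping,
    R_cg: (3, 3) relative orientation between camera and gyro.
    """
    t_cam = time_scale * t_gyro + time_offset
    R = np.eye(3)
    rotations = [R]
    for i in range(1, len(t_gyro)):
        dt = t_cam[i] - t_cam[i - 1]
        w = R_cg @ (omega[i - 1] - bias)   # bias-corrected rate in the camera frame
        R = R @ so3_exp(w * dt)
        rotations.append(R)
    return t_cam, rotations
```
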
Series
IEEE International Conference on Robotics and Automation ICRA, ISSN 1050-4729
Keywords
Calibration, Cameras, Cost function, Gyroscopes, Robustness, Synchronization
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Signal Processing
Identifiers
urn:nbn:se:liu:diva-120182 (URN); 10.1109/ICRA.2015.7139474 (DOI); 000370974902014 (); 978-1-4799-6922-7 (ISBN); 978-1-4799-6923-4 (ISBN)
Conference
2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26-30 May, 2015
Projects
LCMM, VPS
Funder
Swedish Research Council, 2014-5928; Swedish Foundation for Strategic Research, IIS11-0081
Available from: 2015-07-13 Created: 2015-07-13 Last updated: 2018-06-19. Bibliographically approved
Ovrén, H., Forssén, P.-E. & Törnqvist, D. (2015). Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification. In: Yu Sun, Aman Behal & Chi-Kit Ronald Chung (Eds.), New Development in Robot Vision (pp. 55-71). Springer Berlin/Heidelberg
Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification
2015 (English) In: New Development in Robot Vision / [ed] Yu Sun, Aman Behal & Chi-Kit Ronald Chung, Springer Berlin/Heidelberg, 2015, p. 55-71. Chapter in book (Refereed)
Abstract [en]

Scene reconstruction, i.e. the process of creating a 3D representation (mesh) of some real world scene, has recently become easier with the advent of cheap RGB-D sensors (e.g. the Microsoft Kinect).

Many such sensors use rolling shutter cameras, which produce geometrically distorted images when they are moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor.

For scene reconstruction we use the Kinect Fusion algorithm to produce meshes. We create meshes from both raw and rectified depth scans, and these are then compared to a ground truth mesh. The types of motion we investigate are: pan, tilt and wobble (shaking) motions.

As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.

This chapter is an extension of a paper at the IEEE Workshop on Robot Vision [10]. Compared to that paper, we have improved the rectification to also correct for lens distortion, and use a coarse-to-fine search to find the time shift more quickly. We have extended our experiments to also investigate the effects of lens distortion, and to use more accurate ground truth. The experiments demonstrate that correction of rolling shutter effects yields a larger improvement of the 3D model than correction for lens distortion.

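The core rectification step can be sketched as rotating each back-projected depth point into a common (first-row) camera frame, using a per-row rotation interpolated from the integrated gyro readings. The conventions and names below are illustrative only.

```python
import numpy as np

def rectify_depth_rows(points, rows, row_rotations):
    """Express rolling shutter depth points in the first-row camera frame.

    points: (N, 3) points back-projected from a rolling shutter depth scan,
    rows: (N,) image row index of each point,
    row_rotations: one (3, 3) rotation per image row, mapping that row's
    camera frame to a fixed reference frame (from interpolated gyro data).
    """
    R0 = row_rotations[0]
    out = np.empty_like(points)
    for i, (p, r) in enumerate(zip(points, rows)):
        # row-r frame -> reference frame -> first-row frame
        out[i] = R0.T @ row_rotations[r] @ p
    return out
```
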
Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2015
Series
Cognitive Systems Monographs, ISSN 1867-4925 ; 23
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-114344 (URN); 10.1007/978-3-662-43859-6_4 (DOI); 978-3-662-43858-9 (ISBN); 978-3-662-43859-6 (ISBN)
Projects
Learnable Camera Motion Models
Available from: 2015-02-19 Created: 2015-02-19 Last updated: 2018-06-19. Bibliographically approved
Ringaby, E. & Forssén, P.-E. (2014). A Virtual Tripod for Hand-held Video Stacking on Smartphones. In: 2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP). Paper presented at IEEE International Conference on Computational Photography (ICCP 2014), May 2-4, 2014, Intel, Santa Clara, USA. IEEE
A Virtual Tripod for Hand-held Video Stacking on Smartphones
2014 (English) In: 2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP), IEEE, 2014. Conference paper, Published paper (Refereed)
Abstract [en]

We propose an algorithm that can capture sharp, low-noise images in low-light conditions on a hand-held smartphone. We make use of the recent ability to acquire bursts of high resolution images on high-end models such as the iPhone5s. Frames are aligned, or stacked, using rolling shutter correction, based on motion estimated from the built-in gyro sensors and image feature tracking. After stacking, the images may be combined, using e.g. averaging to produce a sharp, low-noise photo. We have tested the algorithm on a variety of different scenes, using several different smartphones. We compare our method to denoising, direct stacking, as well as a global-shutter based stacking, with favourable results.

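Once per-frame warps to a reference view are available (in the paper they come from gyro-based rolling shutter correction combined with feature tracking), the stacking stage itself is simple. The sketch below assumes OpenCV is available and that the warps are plain homographies, which is a simplification of the rolling shutter model.

```python
import numpy as np
import cv2  # OpenCV, assumed available

def stack_frames(frames, homographies):
    """Warp each frame to the reference view and average the aligned stack.

    frames: list of (H, W) or (H, W, 3) images,
    homographies: list of 3x3 warps mapping each frame to the reference frame.
    Averaging the aligned frames reduces noise while keeping the image sharp.
    """
    h, w = frames[0].shape[:2]
    warped = [cv2.warpPerspective(f.astype(np.float32), H, (w, h))
              for f, H in zip(frames, homographies)]
    return np.mean(warped, axis=0)
```
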
Place, publisher, year, edition, pages
IEEE, 2014
Series
IEEE International Conference on Computational Photography, ISSN 2164-9774
National Category
Engineering and Technology; Electrical Engineering, Electronic Engineering, Information Engineering; Signal Processing
Identifiers
urn:nbn:se:liu:diva-108109 (URN); 10.1109/ICCPHOT.2014.6831799 (DOI); 000356494100001 (); 978-1-4799-5188-8 (ISBN)
Conference
IEEE International Conference on Computational Photography (ICCP 2014), May 2-4, 2014, Intel, Santa Clara, USA
Projects
VPS
Available from: 2014-06-25 Created: 2014-06-25 Last updated: 2015-12-10. Bibliographically approved
Lesmana, M., Landgren, A., Forssén, P.-E. & Pai, D. K. (2014). Active Gaze Stabilization. In: A. G. Ramakrishnan (Ed.), Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing. Paper presented at The ninth Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP’14), December 14-17, Bangalore, Karnataka, India (pp. 81:1-81:8). ACM Digital Library
Active Gaze Stabilization
2014 (English) In: Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing / [ed] A. G. Ramakrishnan, ACM Digital Library, 2014, p. 81:1-81:8. Conference paper, Published paper (Refereed)
Abstract [en]

We describe a system for active stabilization of cameras mounted on highly dynamic robots. To focus on careful performance evaluation of the stabilization algorithm, we use a camera mounted on a robotic test platform that can have unknown perturbations in the horizontal plane, a commonly occurring scenario in mobile robotics. We show that the camera can be effectively stabilized using an inertial sensor and a single additional motor, without a joint position sensor. The algorithm uses an adaptive controller based on a model of the vertebrate Cerebellum for velocity stabilization, with additional drift correction. We have also developed a resolution-adaptive retinal slip algorithm that is robust to motion blur.

We evaluated the performance quantitatively using another high-speed robot to generate repeatable sequences of large and fast movements that a gaze stabilization system can attempt to counteract. Thanks to the high-accuracy repeatability, we can make a fair comparison of algorithms for gaze stabilization. We show that the resulting system can reduce camera image motion to about one pixel per frame on average even when the platform is rotated at 200 degrees per second. As a practical application, we also demonstrate how the common task of face detection benefits from active gaze stabilization.

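A toy version of the stabilization loop: the motor counter-rotates against the inertially sensed platform velocity, while the measured retinal slip provides a slow drift correction. The paper's adaptive, Cerebellum-inspired controller tunes such terms online; the fixed gains below are placeholders.

```python
def stabilization_command(head_velocity, retinal_slip, gain_vor=1.0, gain_drift=0.05):
    """One step of a simple gaze stabilization loop (all quantities in deg/s).

    head_velocity: platform angular velocity from the inertial sensor,
    retinal_slip: residual image motion measured in the camera.
    The first term cancels sensed head motion (VOR-like reflex); the second
    slowly removes accumulated drift.
    """
    return -gain_vor * head_velocity - gain_drift * retinal_slip
```
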
Place, publisher, year, edition, pages
ACM Digital Library, 2014
Keywords
Gaze stabilization, active vision, Cerebellum, VOR, adaptive control
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-114318 (URN); 10.1145/2683483.2683565 (DOI); 978-1-4503-3061-9 (ISBN)
Conference
The ninth Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP’14), December 14-17, Bangalore, Karnataka, India
Projects
Learnable Camera Motion Models
Funder
Swedish Research Council
Available from: 2015-02-18 Created: 2015-02-18 Last updated: 2018-01-11. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-5698-5983
