Density Adaptive Point Set Registration
Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Per-Erik Forssén and Michael Felsberg
Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering
ORCID iDs: 0000-0001-6144-9520, 0000-0002-5698-5983
2018 (English). In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018, p. 3829-3837. Conference paper, published paper (refereed).
Abstract [en]

Probabilistic methods for point set registration have demonstrated competitive results in recent years. These techniques estimate a probability distribution model of the point clouds. While such a representation has shown promise, it is highly sensitive to variations in the density of 3D points. This fundamental problem is primarily caused by changes in the sensor location across point sets.

We revisit the foundations of the probabilistic registration paradigm. Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes. Both the probabilistic model of the scene and the registration parameters are inferred by minimizing the Kullback-Leibler divergence in an Expectation-Maximization based framework. Our density-adaptive registration successfully handles severe density variations commonly encountered in terrestrial Lidar applications. We perform extensive experiments on several challenging real-world Lidar datasets. The results demonstrate that our approach outperforms state-of-the-art probabilistic methods for multi-view registration, without the need for re-sampling.
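Schematically, the formulation described in the abstract can be written as follows. This is a hedged reconstruction for illustration only, not the paper's exact notation: point sets X_i = {x_ij} are observed under rigid transforms T(.; theta_i), and the latent scene distribution is approximated by a Gaussian mixture p_V.

```latex
% Hedged schematic only; symbols are illustrative, not the paper's notation.
\min_{\{\theta_i\},\, V} \;
  \mathrm{KL}\!\left( p_{\mathrm{scene}} \,\middle\|\, p_V \right),
\qquad
p_V(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x;\, \mu_k, \Sigma_k\right)
```

Under this reading, EM alternates an E-step that computes the responsibilities of the mixture components for the transformed points with an M-step that updates the registration parameters theta_i and the mixture parameters.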

Place, publisher, year, edition, pages
IEEE, 2018. p. 3829-3837
Series
IEEE Conference on Computer Vision and Pattern Recognition
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-149774
DOI: 10.1109/CVPR.2018.00403
ISI: 000457843603101
ISBN: 978-1-5386-6420-9 (electronic)
OAI: oai:DiVA.org:liu-149774
DiVA id: diva2:1233671
Conference
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, United States, 18-22 June 2018
Note

Funding agencies: EU's Horizon 2020 Programme [644839]; CENIIT grant [18.14]; VR grant: EMC2 [2014-6227]; VR grant [2016-05543]; VR grant: LCMM [2014-5928]

Available from: 2018-07-18. Created: 2018-07-18. Last updated: 2023-04-03. Bibliographically approved.
In thesis
1. Learning Representations for Segmentation and Registration
2021 (English). Doctoral thesis, comprehensive summary (other academic).
Abstract [en]

In computer vision, the aim is to model and extract high-level information from visual sensor measurements such as images, videos and 3D points. Since visual data is often high-dimensional, noisy and irregular, achieving robust data modeling is challenging. This thesis presents work that addresses challenges in a number of different computer vision problems.

First, the thesis addresses the problem of phase unwrapping for multi-frequency amplitude-modulated time-of-flight (ToF) ranging. ToF is used in depth cameras, which have many applications in 3D reconstruction and gesture recognition. While amplitude modulation in time-of-flight ranging can provide accurate depth measurements, it also causes depth ambiguities. This thesis presents a method to resolve the ambiguities by estimating the likelihoods of different hypotheses for the depth values. This is achieved by performing kernel density estimation over the hypotheses in a spatial neighborhood of each pixel in the depth image. The depth hypothesis with the highest estimated likelihood can then be selected as the output depth. This approach yields improvements in the quality of the depth images and extends the effective range in both indoor and outdoor environments.
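A minimal sketch of the hypothesis-selection step described above, assuming wrapped depth maps from several modulation frequencies; the function name, parameter values, and Gaussian kernel are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def unwrap_depth(wrapped, amb, n_wraps=3, sigma=0.05, radius=2):
    """Select a per-pixel depth by kernel density estimation over
    unwrapping hypotheses gathered from a spatial neighborhood.
    wrapped: (F, H, W) wrapped depths, one map per modulation frequency.
    amb: length-F ambiguity distances (illustrative)."""
    F, H, W = wrapped.shape
    # Enumerate candidate depths d + k * ambiguity for every frequency.
    ks = np.arange(n_wraps).reshape(-1, 1, 1, 1)
    amb = np.asarray(amb).reshape(1, -1, 1, 1)
    hyps = (wrapped[None] + ks * amb).reshape(-1, H, W)  # (n_wraps*F, H, W)
    out = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            nbr = hyps[:, y0:y1, x0:x1].ravel()   # neighborhood hypotheses
            cand = hyps[:, y, x]                  # this pixel's candidates
            # Gaussian kernel density of each candidate over the neighborhood.
            dens = np.exp(-0.5 * ((cand[:, None] - nbr[None, :]) / sigma) ** 2).sum(1)
            out[y, x] = cand[np.argmax(dens)]     # most likely hypothesis
    return out
```

Hypotheses that are consistent across frequencies and neighboring pixels reinforce each other in the density estimate, which is what resolves the ambiguity.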

Next, point set registration is investigated: the problem of aligning point sets from overlapping depth images or 3D models. Robust registration is fundamental to many vision tasks, such as multi-view 3D reconstruction and object pose estimation for robotics. First, the thesis presents a method for handling density variations in the measured point sets. This is achieved by modeling a latent distribution representing the underlying structure of the scene. Both the model of the scene and the registration parameters are inferred in an Expectation-Maximization based framework. Second, the thesis introduces a method for integrating features from deep neural networks into the registration model. It is shown that the deep features improve registration performance in terms of accuracy and robustness. Additionally, improved feature representations are generated by training the deep neural network end-to-end by minimizing registration errors produced by our registration model.
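The alternating inference can be illustrated with a simplified stand-in: rigid EM registration of a single point set against a fixed isotropic Gaussian mixture. The actual method also infers the latent scene model jointly, handles multiple views, and adapts to density variations; everything below is a hedged sketch.

```python
import numpy as np

def em_rigid_register(X, mu, sigma2=1.0, n_iters=30):
    """Align point set X (N, 3) to Gaussian mixture centers mu (K, 3)
    with shared isotropic variance sigma2, via EM with a closed-form
    (Kabsch) M-step. A simplified stand-in for the full method."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(n_iters):
        Y = X @ R.T + t                                      # current alignment
        # E-step: soft assignment of each point to mixture components.
        d2 = ((Y[:, None, :] - mu[None, :, :]) ** 2).sum(-1)  # (N, K)
        resp = np.exp(-0.5 * d2 / sigma2)
        resp /= resp.sum(1, keepdims=True) + 1e-12
        # M-step: each point's soft target is its responsibility-weighted
        # mixture mean; the rigid fit then has a closed-form solution.
        tgt = resp @ mu                                      # (N, 3)
        xc, tc = X.mean(0), tgt.mean(0)
        C = (X - xc).T @ (tgt - tc)                          # cross-covariance
        U, _, Vt = np.linalg.svd(C)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                                   # reflection-safe
        t = tc - R @ xc
    return R, t
```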

Further, an approach for 3D point set segmentation is presented. As scene models are often represented using 3D point measurements, segmenting them is important for general scene understanding. Learning models for segmentation requires a significant amount of annotated data, which is expensive and time-consuming to acquire. The approach presented in the thesis circumvents this by projecting the points into virtual camera views and rendering 2D images. The method can then exploit accurate convolutional neural networks for image segmentation and map the segmentation predictions back to the 3D points. This also allows for transfer learning using available annotated image data, thereby reducing the need for 3D annotations.
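A hedged sketch of the project-segment-backproject idea. The callable `seg_net`, the intrinsics `K`, and the virtual `poses` are illustrative assumptions, and a real renderer would splat points rather than project them to single pixels.

```python
import numpy as np

def segment_points(points, colors, seg_net, K, poses, hw=(240, 320)):
    """Label 3D points by rendering them into virtual views, segmenting
    the renderings in 2D, and mapping class scores back to the points.
    Assumes seg_net(img) -> (C, H, W) scores, K is a 3x3 intrinsic
    matrix, poses are 4x4 world-to-camera transforms."""
    H, W = hw
    scores = None
    for T in poses:
        # Project all points into this virtual camera view.
        cam = points @ T[:3, :3].T + T[:3, 3]
        z = cam[:, 2]
        zs = np.maximum(z, 1e-9)                 # avoid divide-by-zero
        u = np.round(K[0, 0] * cam[:, 0] / zs + K[0, 2]).astype(int)
        v = np.round(K[1, 1] * cam[:, 1] / zs + K[1, 2]).astype(int)
        vis = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        # Z-buffer: keep only the nearest point per pixel.
        idx = -np.ones((H, W), dtype=int)
        zbuf = np.full((H, W), np.inf)
        for i in np.flatnonzero(vis):
            if z[i] < zbuf[v[i], u[i]]:
                zbuf[v[i], u[i]] = z[i]
                idx[v[i], u[i]] = i
        # Render a simple point-splat color image and segment it in 2D.
        img = np.zeros((H, W, 3))
        m = idx >= 0
        img[m] = colors[idx[m]]
        out = seg_net(img)                       # (C, H, W) class scores
        if scores is None:
            scores = np.zeros((len(points), out.shape[0]))
        # Back-project: accumulate pixel scores onto the visible points.
        scores[idx[m]] += out[:, m].T
    return scores.argmax(axis=1)                 # per-point class labels
```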

Finally, the thesis explores the problem of video object segmentation (VOS), where the task is to track and segment target objects in each frame of a video sequence. Accurate VOS requires a robust model of the target that can adapt to different scenarios and objects, using only a single labeled reference frame as training data for each video sequence. To address these challenges, the thesis introduces a parametric target model, optimized to predict a target label derived from the mask annotation. The target model is integrated into a deep neural network, where its predictions guide a decoder module to produce target segmentation masks. The deep network is trained on labeled video data to output accurate segmentation masks for each frame. Further, it is shown that training the entire network model end-to-end allows it to learn a representation of the target that provides increased segmentation accuracy.
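As a rough illustration of a parametric target model, the sketch below fits a closed-form ridge regression from reference-frame features to the annotation-derived label. This is a hedged stand-in; the thesis's actual model and optimizer are more elaborate, and the feature extractor and decoder are assumed to exist elsewhere.

```python
import numpy as np

def fit_target_model(feats, mask, reg=1e-2):
    """Fit a linear per-pixel target model on reference-frame features
    feats (D, H, W) against the annotated mask (H, W). A closed-form
    ridge-regression stand-in for the optimized parametric model."""
    D = feats.shape[0]
    X = feats.reshape(D, -1).T               # (H*W, D) pixel features
    y = mask.reshape(-1).astype(float)       # target label per pixel
    # Ridge solution: w = (X^T X + reg * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + reg * np.eye(D), X.T @ y)

def predict_target(feats, w):
    """Score map for a new frame; in the full system a decoder network
    refines this coarse prediction into the final segmentation mask."""
    D, H, W = feats.shape
    return (feats.reshape(D, -1).T @ w).reshape(H, W)
```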

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2021. p. 75
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2151
Keywords
Computer Vision, point set registration, video object segmentation, time-of-flight, point set segmentation, deep learning, expectation maximization
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-176054
DOI: 10.3384/diss.diva-176054
ISBN: 9789179296230
Public defence
2021-08-27, Ada Lovelace, B-building, Campus Valla, Linköping, 13:00 (English)
Available from: 2021-07-20. Created: 2021-06-02. Last updated: 2025-02-07. Bibliographically approved.

Open Access in DiVA

Density Adaptive Point Set Registration (1340 kB), 491 downloads
File information
File name: FULLTEXT02.pdf
File size: 1340 kB
Checksum (SHA-512): 89a1240b0fc0c8801eedb9eccca04f833b11bbfcd6c4f62bba9b8adfa3695f071fc0410d532f18b14c69cac9ea1becf9e2d53e3adf8028e4fefce528ce26c8a8
Type: fulltext
Mimetype: application/pdf

