liu.seSearch for publications in DiVA
Operational message
There are currently operational disruptions. Troubleshooting is in progress.
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
3D Instance Segmentation via Enhanced Spatial and Semantic Supervision
Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
2023 (English)In: 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, IEEE COMPUTER SOC , 2023, p. 541-550Conference paper, Published paper (Refereed)
Abstract [en]

3D instance segmentation has recently garnered increased attention. Typical deep learning methods adopt point grouping schemes followed by hand-designed geometric clustering. Inspired by the success of transformers for various 3D tasks, newer hybrid approaches have utilized transformer decoders coupled with convolutional backbones that operate on voxelized scenes. However, due to the nature of sparse feature backbones, the extracted features provided to the transformer decoder are lacking in spatial understanding. Thus, such approaches often predict spatially separate objects as single instances. To this end, we introduce a novel approach for 3D point clouds instance segmentation that addresses the challenge of generating distinct instance masks for objects that share similar appearances but are spatially separated. Our method leverages spatial and semantic supervision with query refinement to improve the performance of hybrid 3D instance segmentation models. Specifically, we provide the transformer block with spatial features to facilitate differentiation between similar object queries and incorporate semantic supervision to enhance prediction accuracy based on object class. Our proposed approach outperforms existing methods on the validation sets of ScanNet V2 and ScanNet200 datasets, establishing a new state-of-the-art for this task.

Place, publisher, year, edition, pages
IEEE COMPUTER SOC , 2023. p. 541-550
Series
IEEE International Conference on Computer Vision, ISSN 1550-5499, E-ISSN 2380-7504
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-202580DOI: 10.1109/ICCV51070.2023.00056ISI: 001159644300050ISBN: 9798350307184 (electronic)ISBN: 9798350307191 (print)OAI: oai:DiVA.org:liu-202580DiVA, id: diva2:1852130
Conference
IEEE/CVF International Conference on Computer Vision (ICCV), Paris, FRANCE, oct 02-06, 2023
Available from: 2024-04-17 Created: 2024-04-17 Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Search in DiVA

By author/editor
Khan, Fahad
By organisation
Computer VisionFaculty of Science & Engineering
Computer graphics and computer vision

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 235 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf