Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling
Mohamed bin Zayed Univ AI, U Arab Emirates.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed Univ AI, U Arab Emirates.
Mohamed bin Zayed Univ AI, U Arab Emirates.
Mohamed bin Zayed Univ AI, U Arab Emirates; Aalto Univ, Finland.
2024 (English) In: 2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, IEEE, 2024, p. 5037-5044. Conference paper, Published paper (Refereed)
Abstract [en]

Existing 3D understanding datasets typically provide annotations for a limited number of object classes, with sufficient examples per class. However, real-world object classes are not equally represented in practical settings, leading to poor performance on rarely occurring categories if the class imbalance is neglected. In this work, we address the challenge of 3D semantic segmentation with a long-tail distribution of classes. Common methods to reduce class imbalance during training include data re-sampling, loss re-weighting, and transfer learning. In contrast, our work proposes to effectively utilize network classifier weights in 3D models to balance the training on long-tail class distributions. While previous work in the 2D domain has studied imposing constraints on the classifier weights to regularize the training, that approach is sensitive to hyper-parameter choices and has not yet been explored for the 3D domain. To address these challenges, our work proposes adaptive regularization for frequent classes and sampling-based regularization for rare classes, which together alleviate the need to manually select thresholds and can dynamically focus training on the hard classes. Our experiments on the large-scale ScanNet200 benchmark show that our method achieves improved performance, surpassing methods that rely on re-sampling, re-weighting, and pre-training.

Place, publisher, year, edition, pages
IEEE, 2024. p. 5037-5044
Series
IEEE International Conference on Robotics and Automation ICRA, ISSN 1050-4729
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-211223
DOI: 10.1109/ICRA57147.2024.10610029
ISI: 001294576203124
Scopus ID: 2-s2.0-85202450563
ISBN: 9798350384581 (print)
ISBN: 9798350384574 (electronic)
OAI: oai:DiVA.org:liu-211223
DiVA, id: diva2:1931970
Conference
IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, May 13-17, 2024
Note

Funding agencies: Swedish Research Council [2022-06725]

Available from: 2025-01-28 Created: 2025-01-28 Last updated: 2025-01-28

By author/editor
Khan, Fahad