liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Leveraging Optical Flow Features for Higher Generalization Power in Video Object Segmentation
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. (Computer Vision Laboratory)ORCID iD: 0000-0001-8761-4715
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-9072-2204
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-6096-3648
2023 (English)In: 2023 IEEEInternational Conferenceon Image Processing: Proceedings, IEEE , 2023, p. 326-330Conference paper, Published paper (Refereed)
Abstract [en]

We propose to leverage optical flow features for higher generalization power in semi-supervised video object segmentation. Optical flow is usually exploited as additional guidance information in many computer vision tasks. However, its relevance in video object segmentation was mainly in unsupervised settings or using the optical flow to warp or refine the previously predicted masks. Different from the latter, we propose to directly leverage the optical flow features in the target representation. We show that this enriched representation improves the encoder-decoder approach to the segmentation task. A model to extract the combined information from the optical flow and the image is proposed, which is then used as input to the target model and the decoder network. Unlike previous methods, e.g. in tracking where concatenation is used to integrate information from image data and optical flow, a simple yet effective attention mechanism is exploited in our work. Experiments on DAVIS 2017 and YouTube-VOS 2019 show that integrating the information extracted from optical flow into the original image branch results in a strong performance gain, especially in unseen classes which demonstrates its higher generalization power.

Place, publisher, year, edition, pages
IEEE , 2023. p. 326-330
Keywords [en]
Optical flow features; Attention mechanism; Semi-supervised VOS; Generalization power
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:liu:diva-199057DOI: 10.1109/ICIP49359.2023.10222542ISI: 001106821000063ISBN: 9781728198354 (electronic)ISBN: 9781728198361 (print)OAI: oai:DiVA.org:liu-199057DiVA, id: diva2:1810690
Conference
2023 IEEE International Conference on Image Processing (ICIP), 8–11 October 2023 Kuala Lumpur, Malaysia
Available from: 2023-11-08 Created: 2023-11-08 Last updated: 2024-03-12

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Zhang, YushanRobinson, AndreasMagnusson, MariaFelsberg, Michael

Search in DiVA

By author/editor
Zhang, YushanRobinson, AndreasMagnusson, MariaFelsberg, Michael
By organisation
Computer VisionFaculty of Science & Engineering
Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 81 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf