liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Discriminative Learning and Target Attention for the 2019 DAVIS Challenge onVideo Object Segmentation
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. ETH Zürich.ORCID iD: 0000-0001-6144-9520
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-6096-3648
2019 (English)In: CVPR 2019 workshops: DAVIS Challenge on Video Object Segmentation, 2019Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we address the problem of semi-supervised video object segmentation, where the task is to segment a target object in every image of the video sequence, given a ground truth only in the first frame. To be successful it is crucial to robustly handle unpredictable target appearance changes and distracting objects in the background. In this work we obtain a robust and efficient representation of the target by integrating a fast and light-weight discriminative target model into a deep segmentation network. Trained during inference, the target model learns to discriminate between the local appearances of target and background image regions. Its predictions are enhanced to accurate segmentation masks in a subsequent refinement stage.To further improve the segmentation performance, we add a new module trained to generate global target attention vectors, given the input mask and image feature maps. The attention vectors add semantic information about thetarget from a previous frame to the refinement stage, complementing the predictions provided by the target appearance model. Our method is fast and requires no network fine-tuning. We achieve a combined J and F-score of 70.6 on the DAVIS 2019 test-challenge data

Place, publisher, year, edition, pages
2019.
Keywords [en]
video object segmentation, computer vision, machine learning
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-163334OAI: oai:DiVA.org:liu-163334DiVA, id: diva2:1390580
Conference
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Available from: 2020-02-01 Created: 2020-02-01 Last updated: 2023-04-03

Open Access in DiVA

No full text in DiVA

Authority records

Robinson, AndreasJäremo-Lawin, FelixDanelljan, MartinFelsberg, Michael

Search in DiVA

By author/editor
Robinson, AndreasJäremo-Lawin, FelixDanelljan, MartinFelsberg, Michael
By organisation
Computer VisionFaculty of Science & Engineering
Computer Vision and Robotics (Autonomous Systems)

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 192 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf