Generative Multi-Label Zero-Shot Learning
Univ Guelph, Canada.
Technol Innovat Inst, U Arab Emirates.
Mohamed Bin Zayed Univ, U Arab Emirates; Australian Natl Univ, Australia.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed Bin Zayed Univ, U Arab Emirates.
2023 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 45, no 12, p. 14611-14624. Article in journal (Refereed), Published
Abstract [en]

Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. In the generalized variant, the test samples can additionally contain seen categories. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features with GANs is still unexplored in the zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute level, the feature level, and the cross level (across the attribute and feature levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state of the art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods.
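
The difference between the fusion levels named in the abstract can be illustrated with a toy sketch. This is not the paper's method: the abstract does not specify the fusion operators, so plain averaging is used as a placeholder, and the generator is a hypothetical stand-in (a fixed random linear map followed by a ReLU, with no noise input) rather than a trained GAN. All function names below are illustrative assumptions.

```python
import random


def make_generator(attr_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a trained feature generator.

    Assumption: a fixed random linear map followed by ReLU. A real GAN
    generator would be learned and would also take a noise vector.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(attr_dim)]
               for _ in range(feat_dim)]

    def generate(embedding):
        return [max(0.0, sum(w * e for w, e in zip(row, embedding)))
                for row in weights]

    return generate


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors (placeholder fusion operator)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def attribute_level_fusion(generate, class_embeddings):
    """Fuse the class embeddings first, then synthesize a single feature."""
    return generate(mean_vectors(class_embeddings))


def feature_level_fusion(generate, class_embeddings):
    """Synthesize one feature per class, then fuse the features."""
    return mean_vectors([generate(e) for e in class_embeddings])


def cross_level_fusion(generate, class_embeddings):
    """Combine both levels.

    Assumption: a simple average of the two fused features; the abstract
    does not describe how the cross-level combination is actually done.
    """
    attribute_fused = attribute_level_fusion(generate, class_embeddings)
    feature_fused = feature_level_fusion(generate, class_embeddings)
    return mean_vectors([attribute_fused, feature_fused])


if __name__ == "__main__":
    generator = make_generator(attr_dim=4, feat_dim=6)
    # Two made-up class attribute embeddings for an image with two labels.
    embeddings = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.3, 0.9]]
    print(attribute_level_fusion(generator, embeddings))
    print(feature_level_fusion(generator, embeddings))
    print(cross_level_fusion(generator, embeddings))
```

Because the stand-in generator is non-linear, fusing before generation and fusing after generation generally give different features for multi-label inputs, while for a single label all three variants coincide.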

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2023. Vol. 45, no 12, p. 14611-14624
Keywords [en]
Generalized zero-shot learning; multi-label classification; zero-shot object detection; feature synthesis
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-201026
DOI: 10.1109/TPAMI.2023.3295772
ISI: 001104973300034
PubMedID: 37450360
OAI: oai:DiVA.org:liu-201026
DiVA, id: diva2:1840200
Note

Funding Agencies: [PID2021-128178OB-I00]

Available from: 2024-02-22; Created: 2024-02-22; Last updated: 2025-02-07

By author/editor
Khan, Fahad