Guided-attention and gated-aggregation network for medical image segmentation
IBM Research, NY, USA; Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates; Aalto University, Finland.
2024 (English). In: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 156, article id 110812. Article in journal (Refereed). Published.
Abstract [en]

Recently, transformers have been widely used in medical image segmentation to capture long-range and global dependencies via self-attention. However, they often struggle to learn local details, which limits their ability to capture the irregular shapes and sizes of tissues and the indistinct boundaries between them, both of which are critical for accurate segmentation. To alleviate this issue, we propose a network named GA2Net, which comprises an encoder, a bottleneck, and a decoder. The encoder computes multi-scale features. In the bottleneck, we propose hierarchical-gated feature aggregation (HGFA), which introduces a novel spatial gating mechanism to enrich the multi-scale features. To effectively learn the shapes and sizes of tissues, we apply deep supervision in the bottleneck. Within the decoder, GA2Net uses adaptive aggregation (AA) to adjust the receptive field at each location in the feature map, replacing the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Furthermore, we propose mask-guided feature attention (MGFA) modules within the decoder, which learn salient features using foreground priors to adequately capture the intricate structural and contour information of tissues. We also apply intermediate supervision at each stage of the decoder, which further improves the model's ability to locate tissue boundaries. Our extensive experimental results show that GA2Net significantly outperforms existing state-of-the-art methods on eight medical image segmentation datasets: five polyp datasets, a skin-lesion dataset, a multiple-myeloma cell segmentation dataset, and a cardiac MRI dataset. We also perform an extensive ablation study to validate the capabilities of our method. Code is available at https://github.com/mustansarfiaz/ga2net.
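The spatial gating and mask-guided attention ideas described in the abstract can be sketched in a toy NumPy form. This is an illustrative simplification under stated assumptions: the function names, shapes, and the 1x1-projection gate are hypothetical, not the paper's actual HGFA/MGFA modules; see the linked GitHub repository for the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_gate(feature, gate_weights):
    """Toy spatial gating (hypothetical HGFA stand-in): a 1x1 projection
    yields a per-pixel gate in (0, 1) that modulates the feature map."""
    # feature: (C, H, W); gate_weights: (1, C) -> one scalar gate per pixel
    logits = np.tensordot(gate_weights, feature, axes=([1], [0]))  # (1, H, W)
    gate = sigmoid(logits)
    return feature * gate + feature  # gated features plus a residual path

def mask_guided_attention(feature, mask_logits):
    """Toy mask-guided attention (hypothetical MGFA stand-in): a coarse
    foreground prior re-weights the feature map channel-wise."""
    attn = sigmoid(mask_logits)          # (H, W) foreground prior in (0, 1)
    return feature * attn[None, :, :]    # broadcast over channels

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
gated = spatial_gate(feat, rng.standard_normal((1, 8)))
out = mask_guided_attention(gated, rng.standard_normal((16, 16)))
print(out.shape)  # (8, 16, 16)
```

In both toy modules the gate is bounded by a sigmoid, so the operation re-weights rather than replaces features, mirroring how gating mechanisms let a network suppress background responses while keeping the skip pathway differentiable.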

Place, publisher, year, edition, pages
Elsevier Sci Ltd, 2024. Vol. 156, article id 110812
Keywords [en]
Medical image segmentation; Multi-scale feature aggregation; Mask-guided feature attention; Deep supervision; Transformers; Convolutional neural networks
National Category
Medical Imaging
Identifiers
URN: urn:nbn:se:liu:diva-207118
DOI: 10.1016/j.patcog.2024.110812
ISI: 001289237100001
OAI: oai:DiVA.org:liu-207118
DiVA, id: diva2:1894140
Note

Funding Agencies|MBZUAI-WIS Joint Program for AI Research [WIS P008]; Google Research Awards

Available from: 2024-09-02 Created: 2024-09-02 Last updated: 2025-02-09

Open Access in DiVA

No full text in DiVA

Search in DiVA
By author/editor: Khan, Fahad
By organisation: Computer Vision, Faculty of Science & Engineering