Guided-attention and gated-aggregation network for medical image segmentation
2024 (English). In: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 156, article id 110812. Article in journal (Refereed), Published
Abstract [en]
Recently, transformers have been widely used in medical image segmentation to capture long-range and global dependencies using self-attention. However, they often struggle to learn local details, which limits their ability to capture the irregular shapes and sizes of tissues and the indistinct boundaries between them, both of which are critical for accurate segmentation. To alleviate this issue, we propose a network named GA2Net, which comprises an encoder, a bottleneck, and a decoder. The encoder computes multi-scale features. In the bottleneck, we propose a hierarchical-gated features aggregation (HGFA) module, which introduces a novel spatial gating mechanism to enrich the multi-scale features. To effectively learn the shapes and sizes of the tissues, we apply deep supervision in the bottleneck. Within the decoder, GA2Net uses adaptive aggregation (AA) to adjust the receptive fields for each location in the feature map, replacing the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Furthermore, we propose mask-guided feature attention (MGFA) modules within the decoder, which strive to learn salient features using foreground priors in order to adequately grasp the intricate structural and contour information of the tissues. We also apply intermediate supervision at each stage of the decoder, which further improves the model's ability to locate tissue boundaries. Our extensive experimental results show that GA2Net significantly outperforms existing state-of-the-art methods on eight medical image segmentation datasets, i.e., five polyp datasets, a skin lesion dataset, a multiple myeloma cell segmentation dataset, and a cardiac MRI scan dataset. We then perform an extensive ablation study to validate the capabilities of our method. Code is available at https://github.com/mustansarfiaz/ga2net.
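The intermediate supervision mentioned in the abstract can be illustrated with a minimal sketch: each supervised stage produces its own mask prediction, and the training loss is a weighted sum of the per-stage losses against the same ground truth. The abstract does not specify the loss function or the stage weights, so the soft Dice loss, the equal weights, and the names `soft_dice_loss` and `supervised_loss` below are assumptions made for illustration and are not taken from the GA2Net code.

```python
from typing import List, Sequence


def soft_dice_loss(pred: Sequence[float], target: Sequence[float],
                   eps: float = 1e-6) -> float:
    """Soft Dice loss between foreground probabilities and a binary mask.

    Assumption: Dice is used here only as a common segmentation loss;
    the abstract does not state which loss the authors use.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def supervised_loss(stage_preds: List[Sequence[float]],
                    target: Sequence[float],
                    weights: Sequence[float]) -> float:
    """Weighted sum of per-stage losses (deep / intermediate supervision)."""
    if len(stage_preds) != len(weights):
        raise ValueError("one weight per supervised stage is required")
    return sum(w * soft_dice_loss(p, target)
               for p, w in zip(stage_preds, weights))


# Toy example: a flattened 4-pixel mask and predictions from three
# hypothetical stages, assumed to be already resized to the mask resolution.
target = [1.0, 1.0, 0.0, 0.0]
stage_preds = [
    [0.6, 0.5, 0.4, 0.5],  # coarse stage
    [0.8, 0.7, 0.2, 0.3],  # intermediate stage
    [1.0, 1.0, 0.0, 0.0],  # final stage (perfect in this toy case)
]
total = supervised_loss(stage_preds, target, weights=[1.0, 1.0, 1.0])
```

In this toy case the coarse stages contribute most of the loss, so their gradients push the earlier features toward the target mask even when the final prediction is already correct.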
Place, publisher, year, edition, pages
ELSEVIER SCI LTD, 2024. Vol. 156, article id 110812
Keywords [en]
Medical image segmentation; Multi-scale feature aggregation; Mask-guided feature attention; Deep supervision; Transformers; Convolutional neural networks
National Category
Medical Imaging
Identifiers
URN: urn:nbn:se:liu:diva-207118
DOI: 10.1016/j.patcog.2024.110812
ISI: 001289237100001
OAI: oai:DiVA.org:liu-207118
DiVA, id: diva2:1894140
Note
Funding Agencies: MBZUAI-WIS Joint Program for AI Research [WIS P008]; Google Research Awards
2024-09-02 2024-09-02 2025-02-09