liu.se — Search for publications in DiVA
A Spatial-Temporal Deformable Attention Based Framework for Breast Lesion Detection in Videos
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
Tianjin University, People's Republic of China.
Agency for Science, Technology and Research (A*STAR), Singapore.
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
2023 (English). In: Medical Image Computing and Computer Assisted Intervention, MICCAI 2023, Part II, Springer International Publishing AG, 2023, Vol. 14221, pp. 479–488. Conference paper, Published paper (Refereed)
Abstract [en]

Detecting breast lesions in videos is crucial for computer-aided diagnosis. Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operation. We argue that such a strategy struggles to effectively perform deep feature aggregation and ignores useful local information. To tackle these issues, we propose a spatial-temporal deformable attention based framework, named STNet. Our STNet introduces a spatial-temporal deformable attention module to perform local spatial-temporal feature fusion. The spatial-temporal deformable attention module enables deep feature aggregation in each stage of both encoder and decoder. To further accelerate detection, we introduce an encoder feature shuffle strategy for multi-frame prediction during inference. In this strategy, we share the backbone and encoder features, and shuffle the encoder features for the decoder to generate predictions for multiple frames. Experiments on the public breast lesion ultrasound video dataset show that our STNet obtains state-of-the-art detection performance while operating at twice the inference speed. The code and model are available at https://github.com/AlfredQin/STNet.
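The encoder feature shuffle strategy described in the abstract can be sketched conceptually: encoder features are computed once for a window of frames, then reordered so the decoder produces a prediction for each frame without re-running the backbone/encoder. Everything below (function names, the toy encoder and decoder) is an illustrative assumption, not STNet's actual implementation; see the linked repository for the real code.

```python
# Conceptual sketch of an encoder-feature-shuffle inference scheme
# (hypothetical names; the real STNet operates on tensors, not frame ids).

def encode(frames):
    """Stand-in encoder: one shared feature per frame in the window."""
    return [("feat", f) for f in frames]

def shuffle_for_frame(features, target_idx):
    """Reorder the shared encoder features so the target frame's feature leads."""
    return [features[target_idx]] + features[:target_idx] + features[target_idx + 1:]

def decode(shuffled):
    """Stand-in decoder: predicts from the leading (target-frame) feature."""
    return ("pred", shuffled[0][1])

def predict_clip(frames):
    feats = encode(frames)  # backbone/encoder run once per window, then shared
    return [decode(shuffle_for_frame(feats, i)) for i in range(len(frames))]

print(predict_clip([0, 1, 2]))  # → [('pred', 0), ('pred', 1), ('pred', 2)]
```

The point of the shuffle is that only the cheap decoder pass is repeated per frame, which is consistent with the abstract's claim of roughly doubled inference speed.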

Place, publisher, year, edition, pages
Springer International Publishing AG, 2023. Vol. 14221, pp. 479–488
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Breast lesion detection; Ultrasound videos; Spatial-temporal deformable attention; Multi-frame prediction
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-200119
DOI: 10.1007/978-3-031-43895-0_45
ISI: 001109624900045
ISBN: 9783031438943 (print)
ISBN: 9783031438950 (electronic)
OAI: oai:DiVA.org:liu-200119
DiVA, id: diva2:1827911
Conference
26th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Vancouver, Canada, October 8–12, 2023
Note

Funding Agencies|National Research Foundation, Singapore under its AI Singapore Programme (AISG Award) [AISG2-TC-2021-003]; Agency for Science, Technology and Research (A*STAR) Central Research Fund (CRF)

Available from: 2024-01-15 Created: 2024-01-15 Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Search in DiVA

By author/editor
Khan, Fahad
By organisation
Computer Vision, Faculty of Science & Engineering
Computer graphics and computer vision

