Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning
Mohamed Bin Zayed Univ Artificial Intelligence, U Arab Emirates.
SUNY Stony Brook, NY 11794 USA.
Shaukat Khanum Canc Hosp, Pakistan.
Mohamed Bin Zayed Univ Artificial Intelligence, U Arab Emirates.
2024 (English). In: Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, Pt IV, Springer International Publishing AG, 2024, Vol. 15004, p. 167-177. Conference paper, Published paper (Refereed)
Abstract [en]

Self-supervised representation learning has been highly promising for histopathology image analysis, with numerous approaches leveraging the patient-slide-patch hierarchy of the data to learn better representations. In this paper, we explore how combining domain-specific natural language information with such hierarchical visual representations can benefit representation learning for medical image tasks. Building on automated generation of language descriptions for features visible in histopathology images, we present a novel language-tied self-supervised learning framework, Hierarchical Language-tied Self-Supervision (HLSS), for histopathology images. We explore contrastive objectives and text alignment based on granular language descriptions at multiple hierarchies to inject language modality information into the visual representations. Our resulting model achieves state-of-the-art performance on two medical imaging benchmarks, the OpenSRH and TCGA datasets. Our framework also provides better interpretability through its language-aligned representation space. The code is available at https://github.com/Hasindri/HLSS.
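
As a rough illustration of the idea in the abstract, aligning visual embeddings with text embeddings at several levels of a hierarchy through a contrastive objective, the sketch below computes an InfoNCE-style loss per level and sums the results. It is not the HLSS implementation: the function names (`info_nce`, `hierarchical_alignment_loss`), the temperature, the equal level weights, and the toy two-dimensional embeddings are all assumptions made for this example. Only the patient, slide, and patch level names come from the abstract.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss where anchors[i] should match positives[i] and no other row."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [sum(x * y for x, y in zip(anchor, pos)) / temperature for pos in p]
        peak = max(logits)  # subtract the maximum for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def hierarchical_alignment_loss(visual, text, weights=None):
    """Weighted sum of per-level visual-to-text InfoNCE terms.

    `visual` and `text` are dicts keyed by level name; each value is a list of
    embedding vectors, with row i of the visual list paired with row i of the
    text list. Equal weights are an assumption of this sketch.
    """
    weights = weights or {level: 1.0 for level in visual}
    return sum(weights[level] * info_nce(visual[level], text[level]) for level in visual)


# Toy embeddings (hypothetical values), two samples per hierarchy level.
visual = {
    "patch": [[1.0, 0.0], [0.0, 1.0]],
    "slide": [[1.0, 0.1], [0.1, 1.0]],
    "patient": [[0.9, 0.2], [0.2, 0.9]],
}
text_aligned = {level: [row[:] for row in rows] for level, rows in visual.items()}
text_shuffled = {level: rows[::-1] for level, rows in visual.items()}

loss_aligned = hierarchical_alignment_loss(visual, text_aligned)
loss_shuffled = hierarchical_alignment_loss(visual, text_shuffled)
print(f"aligned: {loss_aligned:.4f}, shuffled: {loss_shuffled:.4f}")
```

In this toy setting the loss is small when each visual embedding is paired with its matching text embedding and much larger when the pairs are shuffled, which is the behaviour a contrastive alignment objective is meant to reward. A real system would use learned image and text encoders and batched tensor operations rather than Python lists.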

Place, publisher, year, edition, pages
Springer International Publishing AG, 2024. Vol. 15004, p. 167-177
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Vision-Language Alignment; Self-Supervised Learning
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-210200
DOI: 10.1007/978-3-031-72083-3_16
ISI: 001342228800016
ISBN: 9783031720826 (print)
ISBN: 9783031720833 (electronic)
OAI: oai:DiVA.org:liu-210200
DiVA, id: diva2:1917787
Conference
27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Palmeraie Conf Ctr, Marrakesh, Morocco, Oct 06-10, 2024
Note

Funding Agencies: Swedish Research Council [2022-06725]

Available from: 2024-12-03 Created: 2024-12-03 Last updated: 2025-02-07

Search in DiVA

By author/editor
Khan, Fahad
By organisation
Computer Vision; Faculty of Science & Engineering
Natural Language Processing