CDF-Based Importance Sampling and Visualization for Neural Network Training
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. (Computer graphics and image processing) ORCID iD: 0000-0002-5220-633X
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. (Computer graphics and image processing) ORCID iD: 0000-0002-9217-9997
2023 (English). In: Eurographics Workshop on Visual Computing for Biology and Medicine / [ed] Thomas Höllt and Daniel Jönsson, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Training a deep neural network is computationally expensive, but achieving the same network performance with less computation is possible if the training data is carefully chosen. However, selecting input samples during training is challenging, as their true importance for the optimization is unknown. Furthermore, evaluation of the importance of individual samples must be computationally efficient and unbiased. In this paper, we present a new input data importance sampling strategy for reducing the training time of deep neural networks. We investigate different importance metrics that can be retrieved efficiently because they are already available during training, i.e., the training loss and gradient norm. We found that choosing only samples with large loss or gradient norm, which are hard for the network to learn, is not optimal for the network performance. Instead, we introduce an importance sampling strategy that selects samples based on the cumulative distribution function of the loss and gradient norm, thereby making it more likely to choose hard samples while still including easy ones. The behavior of the proposed strategy is first analyzed on a synthetic dataset, and then evaluated in the application of classification of malignant cancer in digital pathology image patches. As pathology images contain many repetitive patterns, there could be significant gains in focusing on features that contribute more strongly to the optimization. Finally, we show how the importance sampling process can be used to gain insights into the input data through visualization of the samples that are found most or least useful for the training.
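
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes per-sample losses are already available from training, uses the empirical CDF (rank divided by the number of samples) as an unnormalized sampling weight, and contrasts this with keeping only the largest-loss samples. The function names `empirical_cdf`, `cdf_sample`, and `top_k_sample` are hypothetical.

```python
import numpy as np


def empirical_cdf(values):
    """Empirical CDF value (rank / N) of each entry; the largest entry maps to 1.0."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values))  # 0 for the smallest value
    return (ranks + 1) / values.size


def cdf_sample(losses, batch_size, rng):
    """Draw indices without replacement, with probability proportional to the CDF.

    Assumption: the CDF value itself is used as the sampling weight, so hard
    (high-loss) samples are more likely while easy ones keep a nonzero chance.
    """
    losses = np.asarray(losses, dtype=float)
    cdf = empirical_cdf(losses)
    probs = cdf / cdf.sum()
    return rng.choice(losses.size, size=batch_size, replace=False, p=probs)


def top_k_sample(losses, batch_size):
    """Baseline for comparison: keep only the samples with the largest loss."""
    return np.argsort(np.asarray(losses, dtype=float))[-batch_size:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    losses = rng.exponential(scale=1.0, size=1000)  # stand-in for training losses
    picked = cdf_sample(losses, batch_size=100, rng=rng)
    hardest = top_k_sample(losses, batch_size=100)
    print("smallest loss picked, CDF-based:", losses[picked].min())
    print("smallest loss picked, top-k:    ", losses[hardest].min())
```

The same weighting could be applied to per-sample gradient norms instead of losses. If the reweighted sampling is used in training, the resulting gradient estimate is no longer uniform over the dataset, so some form of correction may be needed to keep it unbiased; that step is outside this sketch.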

Place, publisher, year, edition, pages
2023.
Series
Eurographics Workshop on Visual Computing for Biomedicine, ISSN 2070-5778, E-ISSN 2070-5786
Keywords [en]
Computing methodologies, Neural networks, Human-centered computing, Visualization techniques
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-199166; DOI: 10.2312/vcbm.20231212; ISBN: 978-3-03868-216-5 (print); OAI: oai:DiVA.org:liu-199166; DiVA, id: diva2:1811538
Conference
VCBM 2023: Eurographics Workshop on Visual Computing for Biology and Medicine, Norrköping, Sweden, September 20 – 22, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

The full text is published under the Creative Commons Attribution 4.0 license: https://creativecommons.org/licenses/by/4.0/

No changes have been made to the publication.

Available from: 2023-11-13 Created: 2023-11-13 Last updated: 2023-11-21

Open Access in DiVA

fulltext (1715 kB), 63 downloads
File information
File name: FULLTEXT01.pdf; File size: 1715 kB; Checksum: SHA-512
3e61c1fc28d768c42c3414ec327df8094991a9edc511f27ad16a49fcb4b025714656da87b7b9172331e135cfed0da0dce2cff975ab0aab2a01665a7085aab936
Type: fulltext; Mimetype: application/pdf

Other links

Publisher's full text: https://diglib.eg.org/xmlui/bitstream/handle/10.2312/vcbm20231212/051-055.pdf?sequence=1

Authority records

Jönsson, Daniel; Eilertsen, Gabriel

Search in DiVA

By author/editor
Unnebäck, Jakob; Jönsson, Daniel; Eilertsen, Gabriel
By organisation
Media and Information Technology; Faculty of Science & Engineering
Computer Sciences
