Guidance Through Surrogate: Toward a Generic Diagnostic Attack
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates; Australian National University, Australia.
Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates; Australian National University, Australia.
Qualcomm, CA 92121, USA.
Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering; Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates.
2024 (English) In: IEEE Transactions on Neural Networks and Learning Systems, ISSN 2162-237X, E-ISSN 2162-2388, Vol. 35, no. 2, p. 2042-2053. Article in journal (Refereed). Published.
Abstract [en]

Adversarial training (AT) is an effective approach to making deep neural networks robust against adversarial attacks. Recently, different AT defenses have been proposed that not only maintain high clean accuracy but also show significant robustness against popular and well-studied adversarial attacks, such as projected gradient descent (PGD). However, high adversarial robustness can also arise if an attack fails to find adversarial gradient directions, a phenomenon known as "gradient masking." In this work, we analyze the effect of label smoothing on AT as one of the potential causes of gradient masking. We then develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed guided projected gradient attack (G-PGA). Our attack approach is based on a "match and deceive" loss that finds optimal adversarial directions through guidance from a surrogate model. Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size. Furthermore, the proposed G-PGA is generic and can therefore be combined with an ensemble attack strategy, as we demonstrate in the case of AutoAttack, leading to improvements in efficiency and convergence speed. Beyond being an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.

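To make the idea of surrogate-guided optimization concrete, the following is a minimal, hypothetical sketch of a PGD loop in which gradients from the defended model are combined with a "match" term toward a surrogate model's predictions. All names (guided_pgd, defended_model, surrogate_model), the specific form of the loss, and the hyperparameters (eps, step, alpha) are illustrative assumptions for a PyTorch-style setting, not the authors' exact formulation of the match-and-deceive loss.

```python
# Hypothetical sketch of a surrogate-guided PGD step (PyTorch-style).
# "Deceive": push the defended model away from the true label.
# "Match": pull the defended model's output toward a surrogate model
# that does not suffer from gradient masking, so its gradients can
# guide the attack when the defended model's gradients are uninformative.
# The loss form and hyperparameters are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def guided_pgd(defended_model, surrogate_model, x, y,
               eps=8 / 255, step=2 / 255, iters=10, alpha=1.0):
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        logits_def = defended_model(x_adv)
        logits_sur = surrogate_model(x_adv)
        # Standard untargeted cross-entropy on the defended model.
        deceive = F.cross_entropy(logits_def, y)
        # KL divergence between defended and surrogate predictions;
        # subtracting it in the ascent objective encourages matching.
        match = F.kl_div(F.log_softmax(logits_def, dim=1),
                         F.softmax(logits_sur, dim=1),
                         reduction="batchmean")
        loss = deceive - alpha * match
        grad = torch.autograd.grad(loss, x_adv)[0]
        # L_inf PGD ascent step, projected back into the eps-ball around x.
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```

In this sketch the surrogate is assumed to be a model whose loss surface is easier to attack (e.g., a naturally trained network); the weight alpha trades off following the defended model's own gradient against the surrogate's guidance.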
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 35, no. 2, p. 2042-2053
Keywords [en]
Smoothing methods; Robustness; Training; Optimization; Behavioral sciences; Computational modeling; Perturbation methods; Adversarial attack; gradient masking; guided optimization; image classification; label smoothing
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-187408
DOI: 10.1109/TNNLS.2022.3186278
ISI: 000826080200001
PubMedID: 35816520
OAI: oai:DiVA.org:liu-187408
DiVA, id: diva2:1689187
Available from: 2022-08-22. Created: 2022-08-22. Last updated: 2024-12-23. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
PubMed

Authority records

Khan, Fahad Shahbaz
