Cross-Domain Transferability of Adversarial Perturbations
Australian National University, Australia; Inception Institute of Artificial Intelligence, United Arab Emirates.
Australian National University, Australia; Inception Institute of Artificial Intelligence, United Arab Emirates.
Inception Institute of Artificial Intelligence, United Arab Emirates.
Linköping University, Department of Electrical Engineering, Computer Vision; Linköping University, Faculty of Science & Engineering; Inception Institute of Artificial Intelligence, United Arab Emirates.
2019 (English). In: Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Neural Information Processing Systems (NIPS), 2019, Vol. 32. Conference paper, Published paper (Refereed)
Abstract [en]

Adversarial examples reveal the blind spots of deep neural networks (DNNs) and represent a major concern for security-critical applications. The transferability of adversarial examples makes real-world attacks possible in black-box settings, where the attacker has no access to the internal parameters of the model. The underlying assumption in most adversary generation methods, whether learning an instance-specific or an instance-agnostic perturbation, is a direct or indirect reliance on the original domain-specific data distribution. In this work, for the first time, we demonstrate the existence of domain-invariant adversaries, thereby showing a common adversarial space shared among different datasets and models. To this end, we propose a framework capable of launching highly transferable attacks that craft adversarial patterns to mislead networks trained on entirely different domains. For instance, an adversarial function learned on Paintings, Cartoons or Medical images can successfully perturb ImageNet samples to fool the classifier, with success rates as high as approximately 99% (ℓ∞ ≤ 10). The core of our proposed adversarial function is a generative network trained with a relativistic supervisory signal, which enables domain-invariant perturbations. Our approach sets the new state of the art for fooling rates, under both white-box and black-box scenarios. Furthermore, despite being an instance-agnostic perturbation function, our attack outperforms conventionally much stronger instance-specific attack methods.
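The abstract describes an instance-agnostic attack: a generative network, trained with a relativistic supervisory signal on some source domain, produces perturbations bounded by ℓ∞ ≤ 10 that transfer to classifiers trained on entirely different domains. The PyTorch sketch below illustrates only the general shape of such a setup; the PerturbationGenerator, the simplified relativistic_loss, the ResNet-18 surrogate and the 10/255 pixel scale are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torchvision.models as models

EPS = 10.0 / 255.0  # l_inf budget of 10 on a 0-255 pixel scale (images in [0, 1])

class PerturbationGenerator(nn.Module):
    """Small stand-in generator (assumption); the paper trains a much deeper G."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, x):
        adv = x + EPS * self.net(x)   # perturbation bounded by EPS per pixel
        return adv.clamp(0.0, 1.0)    # stay a valid image

def relativistic_loss(surrogate, x_clean, x_adv):
    # Simplified relativistic-style objective (assumption): push the surrogate's
    # logits on the adversarial image away from its logits on the clean image,
    # measured against the clean prediction.
    with torch.no_grad():
        clean_logits = surrogate(x_clean)
        labels = clean_logits.argmax(dim=1)
    adv_logits = surrogate(x_adv)
    # Minimising the negative cross-entropy of the relative logits maximises
    # the confusion, so the generator learns to flip the surrogate's decision.
    return -nn.functional.cross_entropy(adv_logits - clean_logits, labels)

generator = PerturbationGenerator()
surrogate = models.resnet18(weights=None).eval()   # frozen surrogate classifier
for p in surrogate.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)

# One training step on a dummy batch; in practice x can come from any source
# domain (e.g. paintings) while the attack is later applied to ImageNet inputs.
x = torch.rand(4, 3, 224, 224)
loss = relativistic_loss(surrogate, x, generator(x))
optimizer.zero_grad()
loss.backward()
optimizer.step()

In the actual method the generator is trained over many batches from the chosen source domain, and the fixed generator is then applied to unseen target-domain images; this sketch only shows the shape of a single optimisation step.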

Place, publisher, year, edition, pages
Neural Information Processing Systems (NIPS), 2019. Vol. 32
Series
Advances in Neural Information Processing Systems, ISSN 1049-5258
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-167712
ISI: 000535866904053
OAI: oai:DiVA.org:liu-167712
DiVA, id: diva2:1454554
Conference
33rd Conference on Neural Information Processing Systems (NeurIPS)
Available from: 2020-07-17 Created: 2020-07-17 Last updated: 2020-07-17

Open Access in DiVA

No full text in DiVA

Search in DiVA

By author/editor
Khan, Fahad
By organisation
Computer Vision; Faculty of Science & Engineering
Probability Theory and Statistics

Search outside of DiVA

Google / Google Scholar
