This list collects proposed white-box defenses against adversarial examples that have been open-sourced, along with open-source third-party analyses and security evaluations of those defenses.

Submit a new defense or analysis.

| Defense | Dataset | Threat Model | Natural Accuracy | Claims | Analyses |
|---|---|---|---|---|---|
| Adversarial Logit Pairing (Kannan et al.) (code) | ImageNet | $$\ell_\infty (\epsilon = 16/255)$$ | 72% | 27.9% accuracy | |
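
To make the threat model column concrete, here is a minimal sketch of what an $$\ell_\infty$$ budget of $$\epsilon = 16/255$$ allows. It assumes pixels are scaled to [0, 1] and stored as flat lists. The function names (`linf_distance`, `within_threat_model`, `project`) are hypothetical helpers written for illustration and are not taken from any listed defense or analysis.

```python
# Assumption: pixel values are floats in [0, 1], so 16/255 corresponds to
# 16 intensity levels on the usual 0-255 scale.
EPSILON = 16 / 255


def linf_distance(original, perturbed):
    """Largest absolute per-pixel change between two equal-length pixel lists."""
    if len(original) != len(perturbed):
        raise ValueError("inputs must have the same number of pixels")
    return max((abs(p - o) for o, p in zip(original, perturbed)), default=0.0)


def within_threat_model(original, perturbed, epsilon=EPSILON, tol=1e-9):
    """True if `perturbed` is a valid image inside the l_inf ball around `original`."""
    in_range = all(0.0 <= p <= 1.0 for p in perturbed)
    return in_range and linf_distance(original, perturbed) <= epsilon + tol


def project(original, perturbed, epsilon=EPSILON):
    """Clip a candidate into the l_inf ball around `original`, then into [0, 1]."""
    projected = []
    for o, p in zip(original, perturbed):
        clipped = min(o + epsilon, max(o - epsilon, p))  # stay within the budget
        projected.append(min(1.0, max(0.0, clipped)))    # stay a valid pixel
    return projected


original = [0.0, 0.5, 1.0]
candidate = [0.2, 0.5, 0.9]  # changes two pixels by more than 16/255
print(within_threat_model(original, candidate))
print(within_threat_model(original, project(original, candidate)))
```

An attack evaluated under this threat model may change every pixel, but none by more than the budget. Accuracy claims such as the one in the table are made relative to inputs that satisfy this constraint.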