Abstract:
Assessing the robustness of neural networks to adversarial perturbations in black-box settings remains challenging: most existing attack methods require an excessive number of queries to the target model, which limits their practical applicability. In this work, we propose an approach in which a surrogate student model is iteratively trained on failed attack attempts, gradually learning the local behavior of the black-box model. Experiments show that this method substantially reduces the number of queries required while maintaining a high attack success rate.
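The abstract does not specify the attack details, but the core loop it describes can be sketched roughly as follows. This is a minimal illustration under several assumptions not stated in the source: a soft-label black box that returns class probabilities, a one-step FGSM-style perturbation computed on the surrogate, a small MLP as the student, and distillation on all failed attempts; the function `attack_with_student` and every hyperparameter here are hypothetical.

```python
# Hypothetical sketch of the query-efficient attack loop from the abstract.
# Assumptions (not from the paper): soft-label feedback, FGSM step on the
# surrogate, MLP student, KL-distillation on the buffer of failed attempts.
import torch
import torch.nn as nn
import torch.nn.functional as F

def attack_with_student(black_box, x, y, n_classes,
                        eps=0.03, max_queries=200, lr=1e-3):
    """Attack `black_box` on input x (true label y) within `max_queries`.

    black_box: callable mapping a batch of inputs to class probabilities;
               each call counts as one query to the target model.
    Returns (adversarial example, queries used) or (None, queries used).
    """
    # Surrogate "student" that will mimic the black box locally around x.
    student = nn.Sequential(nn.Flatten(), nn.Linear(x.numel(), 64),
                            nn.ReLU(), nn.Linear(64, n_classes))
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    buffer = []  # failed attempts: (input, black-box probabilities)

    for q in range(1, max_queries + 1):
        # Craft a candidate with one FGSM step on the *student* (query-free).
        x_adv = x.clone().requires_grad_(True)
        loss = F.cross_entropy(student(x_adv.unsqueeze(0)),
                               torch.tensor([y]))
        loss.backward()
        x_adv = (x + eps * x_adv.grad.sign()).clamp(0, 1).detach()

        # Spend one query on the real model.
        probs = black_box(x_adv.unsqueeze(0)).detach()
        if probs.argmax(dim=1).item() != y:
            return x_adv, q  # attack succeeded

        # Failed attempt: store it and refit the student on all failures,
        # so it gradually learns the black box's behavior near x.
        buffer.append((x_adv, probs))
        xs = torch.stack([b[0] for b in buffer])
        ps = torch.cat([b[1] for b in buffer])
        for _ in range(5):
            opt.zero_grad()
            fit_loss = F.kl_div(F.log_softmax(student(xs), dim=1), ps,
                                reduction="batchmean")
            fit_loss.backward()
            opt.step()
    return None, max_queries
```

The key design point this sketch captures is that queries are spent only on verification: gradient computation happens entirely on the free-to-evaluate student, and every failed query is recycled as a training example rather than discarded.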