Proceedings of ISP RAS, 2025 Volume 37, Issue 4(2), Pages 133–146 (Mi tisp1030)

Knowledge distillation in local-region for black-box adversarial examples

K. S. Lukyanov (a, b, c), A. I. Perminov (a), D. Yu. Turdakov (a, c), M. A. Pautov (c, d)

a Ivannikov Institute for System Programming of the RAS
b Moscow Institute of Physics and Technology (National Research University)
c Research Center of the Trusted Artificial Intelligence ISP RAS
d Artificial Intelligence Research Institute

Abstract: The robustness of neural networks to adversarial perturbations in black-box settings remains a challenging problem. Most existing attack methods require an excessive number of queries to the target model, limiting their practical applicability. In this work, we propose an approach in which a surrogate student model is iteratively trained on failed attack attempts, gradually learning the local behavior of the black-box model. Experiments show that this method significantly reduces the number of queries required while maintaining a high attack success rate.
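The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy target `make_black_box`, the linear student, the L-infinity budget `eps`, the exploration `noise`, and every other name and parameter here are assumptions made for illustration. The sketch also assumes the black box returns a class probability rather than only a hard label. Each failed attempt is added to a local dataset, the student is refitted on the target's soft outputs, and the next attempt follows the student's gradient.

```python
import math
import random

def make_black_box(dim=8, seed=0):
    """Hypothetical target model: exposes only P(class 1) for an input list."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(dim)]
    def query(x):
        z = sum(wi * xi + 0.3 * math.sin(xi) for wi, xi in zip(w, x))
        return 1.0 / (1.0 + math.exp(-z))
    return query

def to_logit(p):
    """Convert a probability to a logit, clipped to avoid infinities."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))

def fit_student(deltas, logits, grad, bias, epochs=200, lr=0.05):
    """Distil the target's logits into a linear student (SGD on squared error)."""
    for _ in range(epochs):
        for d, z in zip(deltas, logits):
            err = sum(g * di for g, di in zip(grad, d)) + bias - z
            grad = [g - lr * err * di for g, di in zip(grad, d)]
            bias -= lr * err
    return grad, bias

def attack(query, x0, eps=0.5, budget=30, noise=0.05, seed=0):
    """Return (candidate, queries_used, success) under an L-infinity bound eps."""
    rng = random.Random(seed)
    z0 = to_logit(query(x0))
    side = 1.0 if z0 > 0 else -1.0            # sign of the original decision
    deltas, logits, queries = [[0.0] * len(x0)], [z0], 1
    # No local knowledge yet: start the student from a random sign direction.
    grad, bias = [rng.choice([-1.0, 1.0]) for _ in x0], z0
    while queries < budget:
        # Step against the student's gradient; small noise keeps attempts distinct.
        delta = [max(-eps, min(eps, -side * eps * math.copysign(1.0, g)
                               + rng.gauss(0, noise))) for g in grad]
        x_adv = [xi + di for xi, di in zip(x0, delta)]
        z = to_logit(query(x_adv))
        queries += 1
        if z * side < 0:                       # decision flipped: attack succeeded
            return x_adv, queries, True
        # Failed attempt: keep it as training data and refit the student locally.
        deltas.append(delta)
        logits.append(z)
        grad, bias = fit_student(deltas, logits, grad, bias)
    return list(x0), queries, False

query = make_black_box()
x0 = [0.1] * 8
x_adv, n_queries, success = attack(query, x0)
```

The student here is deliberately the simplest possible choice, a linear model in offsets from `x0`, so that it can only capture the local behavior of the target; a neural student, as the abstract suggests, would replace `fit_student` without changing the outer loop.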

Keywords: black-box adversarial attack, knowledge distillation

DOI: 10.15514/ISPRAS-2025-37(4)-23


