Abstract:
Deploying deep neural networks requires inference performance analysis on the target hardware. Performance results motivate deployment decisions, help identify the best-performing hardware and software configurations, and determine whether the DL model or the DL inference software needs optimization. The paper describes a technique for analyzing and comparing inference performance, using image classification as an example: converting a trained model to the formats of different frameworks, analyzing quality, determining optimal inference execution parameters, optimizing the model and re-analyzing quality, and analyzing and comparing inference performance across the considered frameworks. The Deep Learning Inference Benchmark tool is designed to support this performance analysis cycle. The technique is demonstrated on the MobileNetV2 model.