Abstract:
The article discusses methods for improving the quality metrics of machine learning models used
in the financial sector. Because the data sets on which these models are trained are class-imbalanced,
it is proposed to use methods aimed at reducing the imbalance. The study conducted
experiments using nine methods for handling class imbalance on three retail lending data sets. The
CatBoostClassifier gradient boosting model, trained without any correction for class imbalance, was used
as the base model. The experiments showed that the use of the RandomOverSampler method provides a
significant increase in classification quality metrics compared to the base model. The results indicate that
further research into methods for handling class imbalance in financial data is promising,
and that the considered methods are feasible to apply in practice.
Keywords: financial risks, machine learning, classification, class imbalance