Abstract:
This article presents the development of a machine learning model for predicting fraudulent
transactions from a bank's transactional data. It discusses the specifics of encoding categorical variables
when the transactional data have a time component, so that information leakage is avoided. Additionally,
experiments were conducted on applying bagging and on creating additional variables based
on feature contributions to the final prediction, measured with Shapley values. The quality metrics of the
resulting machine learning model are analyzed.