Abstract:
The paper seeks an efficient algorithm for detecting outliers in non-stationary one-dimensional time series of field measurements. Here, non-stationarity means that the data contain a variable trend and are heteroscedastic, that is, the variance differs between subsequences of the series. If these features are ignored, outliers caused by breakdowns or inaccuracies of the recording equipment can be classified as regular values, which makes most existing methods for detecting outliers in time series ineffective. The paper describes real data with these properties: observations of temperature and pollutant concentration in the boundary layer of the atmosphere in Krasnoyarsk. A brief overview of existing methods is given, and their advantages and disadvantages when applied to the available data are shown. The author's approach to detecting outliers in series of this type is then proposed. The method corrects and combines existing approaches and consists of two stages: localization of points suspected of being outliers, and regression on the localized section with an adaptive threshold for cutting off points. The proposed algorithm was tested on the available data and compared with existing approaches.
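
The two-stage idea can be illustrated with a minimal sketch. The abstract does not specify the rules used at either stage, so everything below is an assumption made for illustration: candidates are localized with a rolling median and MAD, and each candidate is then checked against a local linear fit whose threshold scales with the residual spread of the surrounding clean points. The function names `localize_candidates` and `confirm_outliers`, the window length, and the multipliers `k` and `c` are hypothetical and are not taken from the paper.

```python
import numpy as np

def localize_candidates(y, window=15, k=3.0):
    """Stage 1 (assumed rule): flag points that lie far from the rolling
    median, measured in units of the rolling MAD of the same window."""
    n = len(y)
    half = window // 2
    flags = np.zeros(n, dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        seg = y[lo:hi]
        med = np.median(seg)
        # 1.4826 * MAD estimates the standard deviation for Gaussian noise
        scale = 1.4826 * np.median(np.abs(seg - med)) + 1e-12
        flags[i] = abs(y[i] - med) > k * scale
    return flags

def confirm_outliers(y, candidates, window=15, c=4.0):
    """Stage 2 (assumed rule): fit a line to the non-candidate points around
    each candidate and keep it only if its residual exceeds c times the
    local residual scale, so the threshold adapts to the local variance."""
    n = len(y)
    half = window // 2
    t = np.arange(n, dtype=float)
    outliers = np.zeros(n, dtype=bool)
    for i in np.flatnonzero(candidates):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        idx = np.arange(lo, hi)
        clean = idx[~candidates[idx]]
        if clean.size < 3:
            continue  # too few reference points for a local fit
        slope, intercept = np.polyfit(t[clean], y[clean], 1)
        resid = y[clean] - (slope * t[clean] + intercept)
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid))) + 1e-12
        outliers[i] = abs(y[i] - (slope * t[i] + intercept)) > c * scale
    return outliers

# Synthetic example: linear trend, noise whose amplitude grows with time,
# and two injected spikes standing in for equipment faults.
t = np.arange(200, dtype=float)
y = 0.05 * t + (0.1 + 0.002 * t) * np.sin(1.7 * t)
y[50] += 5.0
y[150] += 8.0

candidates = localize_candidates(y)
outliers = confirm_outliers(y, candidates)
print(np.flatnonzero(outliers))
```

Because both the localization scale and the cut-off threshold are computed from the local window, a spike is judged against the variance of its own neighbourhood rather than against a global statistic, which is the property that heteroscedastic data with a trend require.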