Abstract:
The paper discusses a method for analyzing data intended for machine learning problems in order to detect noise, inaccuracies, and distortions that impede the construction of an adequate
model. Data points of this kind are called outliers. The proposed approach uses methods and algorithms based on
multi-valued logic systems. Multi-valued logic can be used in the case of multidimensional heterogeneous
features that characterize the objects of the original subject area. To conduct a qualitative data analysis,
the following procedure is proposed: a multi-valued logical function is constructed for the
analyzed data, which identifies all possible classes in the subject area under consideration; next, the
objects that did not fall into the constructed classes on a number of features are analyzed; and the hypothesis
that these objects are outliers is tested. In this work, the hypothesis test is a sequence of logical rules for
restoring the original dependencies presented in the training set. The proposed approach was considered
for classification problems with multidimensional discrete features, where each feature can take
k different values and all features are equally important for class identification.
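
The procedure outlined above can be illustrated with a minimal sketch. It is not the multi-valued logical function constructed in the paper: as a simplifying assumption, each class is summarized by the most frequent value of every discrete feature, and an object is flagged as a candidate outlier when it disagrees with the pattern of its own class on at least a chosen number of features. The names class_patterns, candidate_outliers, and min_mismatches are hypothetical and introduced only for this illustration.

```python
from collections import Counter


def class_patterns(X, y):
    """For every class label, record the most frequent value of each feature.

    Simplified stand-in for a class description over k-valued features;
    all features are treated as equally important.
    """
    patterns = {}
    for label in set(y):
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        # zip(*rows) iterates over the feature columns of this class
        patterns[label] = [Counter(col).most_common(1)[0][0] for col in zip(*rows)]
    return patterns


def candidate_outliers(X, y, min_mismatches=2):
    """Return indices of objects that differ from their class pattern
    on at least `min_mismatches` features."""
    patterns = class_patterns(X, y)
    flagged = []
    for i, (x, label) in enumerate(zip(X, y)):
        mismatches = sum(a != b for a, b in zip(x, patterns[label]))
        if mismatches >= min_mismatches:
            flagged.append(i)
    return flagged


# Toy training set: three features, each taking values in {0, 1, 2} (k = 3)
X = [(0, 0, 1), (0, 0, 1), (0, 1, 1), (2, 2, 1),
     (1, 2, 2), (1, 2, 2), (1, 2, 0)]
y = ["a", "a", "a", "a", "b", "b", "b"]

print(candidate_outliers(X, y))  # prints [3]
```

In this toy set, the object at index 3 is labeled "a" but differs from the pattern of class "a" on two of three features, so it is the only candidate for the outlier hypothesis. Whether a flagged object is truly an outlier would still have to be decided by the hypothesis test described in the abstract.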