Abstract:
This article presents research in natural language processing. It describes a method and an algorithm for finding thematically similar documents, and compares various measures of thematic similarity and sets of features.
Keywords: text similarity, vector space model, TF, IDF, topic importance characteristic, measure of thematic similarity, assessment of methods for information retrieval, DCG.