Abstract:
The paper demonstrates the computational efficiency of a probabilistic approach to knowledge extraction based on a binary similarity operation. In addition to the author's previously proved result that a polynomial number of hypotheses about the causes of the investigated target property is sufficient, the paper establishes a polynomial upper bound on the mean running time of the algorithm that generates a single hypothesis candidate. The proven result concerns a family of algorithms based on coupled Markov chains. To obtain a good estimate of the trajectory length of such a chain (before it enters the ergodic state), the training sample had to be enriched by adding negated columns for the existing binary features.
Keywords: similarity, candidate, coupled Markov chain, average length of trajectory.
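
The enrichment step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the training sample is a 0/1 matrix with one row per example, that a "negative column" is the bitwise complement of an existing feature column, and that the similarity of two examples is their component-wise conjunction. The names `add_negated_columns` and `similarity` are hypothetical and are not taken from the paper.

```python
import numpy as np

def add_negated_columns(sample):
    """Append the complement of every binary feature column (assumed reading
    of 'adding negative columns'); the result has twice as many columns."""
    sample = np.asarray(sample, dtype=bool)
    return np.hstack([sample, ~sample])

def similarity(a, b):
    """Assumed similarity operation: features shared by both examples,
    computed as a component-wise logical AND."""
    return np.logical_and(a, b)

# Toy training sample: 2 examples, 3 binary features (illustrative data only).
sample = np.array([[1, 0, 1],
                   [1, 1, 0]], dtype=bool)

enriched = add_negated_columns(sample)
common = similarity(enriched[0], enriched[1])
```

Under these assumptions every enriched row contains exactly as many ones as there were original features, so all examples have the same weight after enrichment.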