Abstract:
The article considers the problem of choosing the structural parameters
of a mixture of probabilistic principal component analyzers, namely the
number of mixture components and the dimensions of these components. Among
the approaches used in practice for data classification tasks, only
resampling methods are actually applicable. To choose the dimensions,
a combination of known model selection methods is proposed.
A mixture of probabilistic principal component analyzers makes it possible to
model large volumes of data with a relatively small number of free parameters.
The number of free parameters can be controlled by selecting the latent
dimension of the data.
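
As a rough illustration of how the latent dimension controls the number of free parameters, the sketch below counts the parameters of a PPCA mixture. It assumes the standard PPCA parameterization, in which each component has covariance W W^T + sigma^2 I with a d x q loading matrix W, so the covariance contributes dq + 1 - q(q-1)/2 free parameters. The function names and the example values of d, q and K are illustrative only and do not come from the article.

```python
def ppca_free_params(d, q):
    """Free parameters of one PPCA component (assumed standard parameterization).

    d: observed data dimension, q: latent dimension (0 <= q < d).
    Mean: d parameters. Covariance W W^T + sigma^2 I: d*q + 1 - q*(q-1)/2,
    where the subtracted term removes the rotational ambiguity of W.
    """
    if not 0 <= q < d:
        raise ValueError("latent dimension q must satisfy 0 <= q < d")
    mean_params = d
    cov_params = d * q + 1 - q * (q - 1) // 2  # q*(q-1) is always even
    return mean_params + cov_params


def mppca_free_params(d, latent_dims):
    """Free parameters of a PPCA mixture with one latent dimension per component."""
    k = len(latent_dims)
    mixing_weights = k - 1  # weights sum to one
    return mixing_weights + sum(ppca_free_params(d, q) for q in latent_dims)


def full_gmm_free_params(d, k):
    """Free parameters of a Gaussian mixture with full covariances, for comparison."""
    return (k - 1) + k * (d + d * (d + 1) // 2)


if __name__ == "__main__":
    d, latent_dims = 50, [2, 2, 2]  # hypothetical example values
    print("PPCA mixture:", mppca_free_params(d, latent_dims))
    print("Full-covariance mixture:", full_gmm_free_params(d, len(latent_dims)))
```

Under these assumptions, lowering the latent dimension of a component reduces its parameter count roughly linearly in d, whereas a full covariance matrix grows quadratically in d. This is why the latent dimensions are themselves a model selection target.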
Keywords: probabilistic principal component analysis (PPCA), mixtures of PPCA, model selection criterion, bootstrap, cross-validation.