Theoretical Foundations of Applied Discrete Mathematics
Minimal representative set for a system of frequency classes of underdetermined words
L. A. Sholomov Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow
Abstract:
Let a finite set
$M$ and a system
$\mathcal{T}$ of some non-empty subsets
$T\subseteq M$ be given. Associated with the sets
$M$ and
$\mathcal{T}$ are the alphabets
$A_0=\{a_i,\,i\in M\}$ of basic symbols and
$A=\{a_T,\; T\in\mathcal{T}\}$ of
underdetermined symbols. The set of all words of length
$l$ in the alphabet
$A$, in
which each symbol
$a_T$ is present
$r_T$ times,
$\sum\limits_{T\in\mathcal{T}} r_T=l$,
is called frequency class and denoted by
$\mathcal{K}_l(\mathbf{r})$ where
$\mathbf{r}=(r_T,\,T\in\mathcal{T})$. The specification of the word
$v$ in the alphabet
$A$ is any word obtained from
$v$ by replacing each symbol
$a_T$ with some
$a_i$,
$i\in T$. The specification of the set
$V$ of words in the alphabet
$A$ is any set
of words in the alphabet
$A_0$, containing for each word
$v\in V$ some
specification of it. The class
$\mathcal{K}_{l_1}(\mathbf{r}_1)$ is
considered to be more representative than the class
$\mathcal{K}_{l_2}(\mathbf{r}_2)$, if
$l_1\ge l_2$ and, whatever the specification of
the class
$\mathcal{ K}_{l_1}(\mathbf{r}_1)$, the set of beginnings of the length
$l_2$
of all words from the specification forms some specification for the class
$\mathcal{K}_{l_2}(\mathbf{r}_2)$. Let
$\mathfrak K$ be some system of frequency
classes. A subsystem of
$\mathfrak K$ is called a representative set of the
system
$\mathfrak K$ if, for any
$\mathcal{K}_l(\mathbf{r})\in\mathfrak K$,
the subsystem contains a class that is more representative than
$\mathcal{K}_l
(\mathbf{r})$. The paper presents a method for finding the smallest representative
set for an arbitrary system of frequency classes. This setting arises in the
problems of underdetermined data compression and of underdetermined functions
implementation.
Keywords:
underdetermined data, specification,
frequency class, representative set.
UDC:
519.728
DOI:
10.17223/2226308X/12/11