Abstract:
Let $x_1,x_2,\dots,x_N$, $N\geq 1$, be a word in a finite alphabet $A$. We consider sequence coding methods, i.e., methods that encode the letters $x_1,x_2,\dots$ by words in a binary alphabet, where the code of the letter $x_i$ depends only on $x_1,\dots,x_{i-1}$. (These codes are effective for data compression in computer memory [1–3].) For large alphabets $A$ (when $|A|\to\infty$), the code proposed in [1] has minimum redundancy of order $O(1)$ bits and the letter coding/decoding time is $O(|A|\log|A|)$. The implementation of the “book stack” method of [5] described in [2, 4] has time $O(\log^2|A|)$ and redundancy $\log\log|A|$ bits. In the present paper, we propose an algorithm which combines the advantages of the codes from [1] and [5, 2]: it has minimum redundancy of $O(1)$ bits and coding/decoding time of $O(\log^2|A|)$, which is close to the obvious lower bound $O(\log|A|)$.