
Tr. SPIIRAN, 2007 Issue 4, Pages 388–404 (Mi trspy292)

Two-level morpho-phonetic prefix graph for the Russian continuous speech decoding

A. L. Ronzhina, An.B. Leontyeva, I.A. Kagirov, S. Theil

St. Petersburg Institute for Informatics and Automation of RAS

Abstract: A new representation structure for the large vocabulary of a highly inflective language is sketched. Rich morphology complicates text and speech parsing. To improve performance, a two-level morpho-phonetic prefix graph is proposed for vocabulary representation. Sharing the identical beginnings and endings of different words significantly reduces the search space for a large vocabulary. A stem-based language model reduces the complexity of continuous speech decoding and addresses the data scarcity problem for inflective languages. A comparison of the proposed graph with two baseline word-lattice models showed a significant reduction in the topological complexity of the graph.
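
The effect of sharing word beginnings and endings can be illustrated with a toy sketch. This is not the authors' graph: it works on transliterated letters rather than phonemes, and the lexicon, `trie_size`, and `two_level_size` are hypothetical names chosen for illustration. It assumes that the first level is a prefix tree over stems and that the second level holds one small ending tree per distinct set of endings, shared by every stem that takes that set.

```python
def trie_size(strings):
    """Count the nodes of a plain prefix tree (root included)."""
    root = {}
    nodes = 1
    for s in strings:
        node = root
        for ch in s:
            if ch not in node:
                node[ch] = {}
                nodes += 1
            node = node[ch]
    return nodes


def two_level_size(lexicon):
    """Count nodes when stems and endings are stored on separate levels.

    lexicon maps each stem to a tuple of its endings. Stems with the same
    set of endings point at one shared ending tree (an assumption of this
    sketch, not a detail taken from the paper).
    """
    stem_nodes = trie_size(lexicon.keys())
    paradigms = {frozenset(endings) for endings in lexicon.values()}
    ending_nodes = sum(trie_size(p) for p in paradigms)
    return stem_nodes + ending_nodes


# Hypothetical toy lexicon: transliterated Russian noun stems and endings.
LEXICON = {
    "stol": ("", "a", "u", "om", "e"),
    "stul": ("", "a", "u", "om", "e"),
    "knig": ("a", "i", "e", "u", "oj"),
    "ruk": ("a", "i", "e", "u", "oj"),
}

# Every full word form, as a flat word-level vocabulary would store it.
WORD_FORMS = [stem + ending
              for stem, endings in LEXICON.items()
              for ending in endings]

flat_nodes = trie_size(WORD_FORMS)        # endings repeated under each stem
shared_nodes = two_level_size(LEXICON)    # endings stored once per paradigm
```

For this toy lexicon of 20 word forms, the flat prefix tree has 36 nodes and the two-level layout has 27. The gap widens as more stems share each ending set, which is the intuition behind the search-space reduction claimed in the abstract.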

UDC: 681.3


