A. A. Karpov, “An automatic multimodal speech recognition system with audio and video information”, Avtomat. i Telemekh., 2014, Issue 12, Pages 125

This article is cited in 15 papers

Intellectual Control Systems

An automatic multimodal speech recognition system with audio and video information

A. A. Karpov^ab

^a St. Petersburg Institute of Informatics and Automation, Russian Academy of Sciences, St. Petersburg, Russia
^b ITMO University, St. Petersburg, Russia

Abstract: This paper presents the mathematical model and software implementation of an automatic Russian speech recognition system that applies digital processing and analysis to audiovisual signals from a microphone and a video camera. It describes the probabilistic modeling of audiovisual speech based on coupled hidden Markov models, the information fusion methods that assign weight coefficients to the audio and video speech modalities, and the parametric representation of the signals. Quantitative results for multimodal recognition of continuous Russian speech indicate high accuracy and reliability of the automatic system.

Presented by a member of the Editorial Board: A. V. Bernshtein

Received: 28.03.2012