
Proceedings of ISP RAS, 2025 Volume 37, Issue 3, Pages 85–106 (Mi tisp988)

Temporally coherent person matting trained on fake-motion dataset

I. A. Molodetskikh (a,b), M. V. Erofeev (b), A. V. Moskalenko (b), D. S. Vatolin (a,b)

(a) Ivannikov Institute for System Programming of the RAS
(b) Lomonosov Moscow State University

Abstract: We propose a novel neural-network-based method for matting videos of people that requires no additional user input such as trimaps. Our architecture achieves temporal stability of the resulting alpha mattes through motion-estimation-based smoothing of image-segmentation outputs, combined with convolutional-LSTM modules on U-Net skip connections. We also propose a fake-motion algorithm that generates training clips for the video-matting network from photos with ground-truth alpha mattes and background videos. We apply random motion to the photos and their mattes to simulate the movement found in real videos, and composite the result with the background clips. This lets us train a deep neural network operating on videos in the absence of a large annotated video dataset, and it provides ground-truth foreground optical flow for the training clips for use in loss functions.
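The fake-motion idea described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the authors' actual pipeline: it restricts the random motion to per-frame translations (the paper's transforms are presumably richer), and the function name and parameters are hypothetical. Because the motion is synthesized, the foreground optical flow of each generated frame is known exactly and can feed a flow-based loss term.

```python
import numpy as np

def make_fake_motion_clip(photo, alpha, background, n_frames=8, max_step=4):
    """Generate a training clip by applying random translational "fake motion"
    to a still photo and its ground-truth alpha matte, then alpha-compositing
    each shifted frame over the corresponding background frame.

    photo:      (H, W, 3) float foreground image in [0, 1]
    alpha:      (H, W)    float ground-truth matte in [0, 1]
    background: (n_frames, H, W, 3) float background clip in [0, 1]
    Returns composited frames, per-frame mattes, and per-frame flow (dx, dy).
    """
    frames, mattes, flows = [], [], []
    dx = dy = 0
    for t in range(n_frames):
        # Random per-frame shift simulates foreground movement; since the
        # shift is chosen by us, the foreground optical flow is known exactly.
        step_x, step_y = np.random.randint(-max_step, max_step + 1, size=2)
        dx += step_x
        dy += step_y
        fg = np.roll(np.roll(photo, dy, axis=0), dx, axis=1)
        a = np.roll(np.roll(alpha, dy, axis=0), dx, axis=1)
        # Standard alpha compositing: C = a * F + (1 - a) * B
        comp = a[..., None] * fg + (1.0 - a[..., None]) * background[t]
        frames.append(comp)
        mattes.append(a)
        flows.append((step_x, step_y))
    return np.stack(frames), np.stack(mattes), flows
```

Each generated frame comes with its matte and its exact foreground motion, so the same routine supplies both the supervision target and the ground-truth flow for temporal-consistency losses.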

Keywords: video matting, semantic person matting, temporal smoothing, deep learning.

DOI: 10.15514/ISPRAS-2025-37(3)-6
