Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia

Dokl. RAN. Math. Inf. Proc. Upr., 2025, Volume 527, Pages 156–170 (Mi danma675)

SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES

MMRFIGN: an ensemble graph segmentation model for imbalanced high-resolution images informed by multicomponent Markov random fields

A. K. Gorshenin, A. M. Dostovalova

Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow

Abstract: The paper presents MMRFIGN, a novel ensemble graph neural network model informed by multicomponent Markov random fields, which improves object segmentation quality in high-resolution images when datasets are imbalanced and volatile. A key component of the model is a specially designed two-branch block of graph convolutions. This block simultaneously processes local and global image features derived from multiscale image partitions, using a multicomponent Markov model to reconstruct the spatial relationships between features. A theorem on the faster decrease of the loss function for a multicomponent graph architecture is proven, indicating faster training than graph and convolutional models of comparable size. MMRFIGN was tested on the segmentation of images of heterogeneous urban landscapes collected by unmanned aerial vehicles, using the open datasets UAVid and UDD (Ultra HD 4K resolution, with an imbalance in the number of objects per class). MMRFIGN recognizes both large objects (buildings, roads) and small objects of different scales (cars) better than modern convolutional architectures (DeepLabV3, ENet) and transformers (SegFormer and the 2025 SOTA model LWGANet): for large objects, the increase in F1-score reaches 25.04% (on average, up to 12.08%), and for small objects, 14.87% (on average, up to 11.52%). MMRFIGN also outperforms both the specialized SOTA transformer model LWGANet (by up to 15.11% for several classes) and alternative ensemble implementations based on graph architectures with attention (by up to 20.97%). At the same time, MMRFIGN has fewer parameters than the baseline networks, demonstrating a possible reduction by a factor of 1.78.
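
The abstract reports gains as per-class F1-scores on class-imbalanced data. As a minimal sketch of why a per-class metric is informative in that setting, the Python snippet below computes F1 for each class on toy labels. It assumes the standard definition F1 = 2TP / (2TP + FP + FN); the exact evaluation protocol of the paper is not given here, and the function name `per_class_f1` and the toy data are hypothetical, not taken from UAVid or UDD.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, classes):
    """Return {class: F1} for two flat sequences of per-pixel labels.

    Assumes the standard definition F1 = 2*TP / (2*TP + FP + FN).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted where it is absent
            fn[t] += 1  # a true pixel of class t was missed
    scores = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


# Hypothetical imbalanced toy labels: 90 "road" pixels, 10 "car" pixels.
y_true = ["road"] * 90 + ["car"] * 10
# A predictor that finds only 2 of the 10 car pixels.
y_pred = ["road"] * 98 + ["car"] * 2

scores = per_class_f1(y_true, y_pred, ["road", "car"])
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"accuracy = {accuracy:.2f}")
for name, value in scores.items():
    print(f"F1[{name}] = {value:.3f}")
```

In this toy case overall pixel accuracy stays high because the majority class dominates, while the F1-score of the rare class is low. Reporting F1 separately for large and small objects, as the abstract does, exposes that difference.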

Keywords: probability-informed neural networks, semantic segmentation, Markov random fields, class imbalance, unmanned aerial vehicles.

UDC: 004.852

Received: 20.08.2025
Accepted: 22.09.2025

DOI: 10.7868/S2686954325070136