Modelirovanie i Analiz Informatsionnykh Sistem

Model. Anal. Inform. Sist., 2023, Volume 30, Number 4, Pages 382–393 (Mi mais810)

This article is cited in 1 paper

Artificial intelligence

Extracting named entities from Russian-language documents with different expressiveness of structure

M. D. Averina, O. A. Levanova

P.G. Demidov Yaroslavl State University, 14 Sovetskaya str., Yaroslavl 150003, Russia

Abstract: This work addresses named entity recognition in Russian-language texts using a conditional random field (CRF) model. Two data sets were considered: well-structured refinancing documents and semi-structured texts of court records. The model was tested with various sets of text features and CRF parameters (optimization algorithms). Averaged over all entities, the best F-measure was 0.99 for the structured documents and 0.86 for the semi-structured ones.
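To illustrate what "sets of text features" can mean for a CRF tagger, here is a minimal sketch in Python. The feature names, the one-token context window, and the example sentence are assumptions made for illustration only; the abstract does not list the features the authors actually used.

```python
# Illustrative sketch: per-token feature dictionaries of the kind commonly
# fed to a CRF sequence tagger. All feature names here are hypothetical and
# are not taken from the paper.

def token_features(tokens, i):
    """Build a feature dict for tokens[i] with a one-token context window."""
    tok = tokens[i]
    feats = {
        "bias": 1.0,
        "lower": tok.lower(),          # case-normalized word form
        "suffix3": tok[-3:].lower(),   # crude proxy for Russian inflection
        "is_title": tok.istitle(),     # capitalized, e.g. a surname
        "is_upper": tok.isupper(),
        "is_digit": tok.isdigit(),     # numbers, e.g. amounts or case numbers
    }
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True            # beginning of sequence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True            # end of sequence
    return feats


def sentence_features(tokens):
    """Feature dicts for every token of one sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical court-record fragment: "The plaintiff Ivanov went to court".
sentence = ["Истец", "Иванов", "обратился", "в", "суд"]
features = sentence_features(sentence)
```

Lists of dictionaries in this shape are the input format accepted by common CRF toolkits; for example, sklearn-crfsuite exposes the choice of optimization algorithm through the `algorithm` argument of its `CRF` class. Whether the authors used that toolkit is not stated in the abstract.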

Keywords: named entity extraction, CRF.

UDC: 004.912

MSC: 68T50

Received: 13.10.2023
Revised: 10.11.2023
Accepted: 15.11.2023

DOI: 10.18255/1818-1015-2023-4-382-393


