Abstract:
The relevance of detecting tabular information and recognizing its
contents when processing scanned documents is shown. The construction of a data set for
training, validating and testing the YOLOv5s deep neural network (DNN)
to detect simple tables is described. The effectiveness of this DNN
on scanned documents is demonstrated. Using the Keras Functional API,
a convolutional neural network (CNN) was built to recognize the main elements
of tabular information: digits, basic punctuation marks and Cyrillic letters.
The results of evaluating this CNN are presented. The implementation of
tabular information detection and recognition for scanned documents in the
developed information system (IS) for updating database records in the
Unified State Register of Real Estate system is described.
Key words and phrases: Convolutional Neural Networks, Deep Learning
Neural Networks, CNN, DNN, YOLOv5s, Keras, Python.