Abstract:
This article presents a convolutional neural network model that maps a text image to an embedding vector encoding information about its font. The model consists of two identical convolutional blocks whose features are combined into a single vector, which linear layers then analyze to find differences. Trained in this way, the model distinguishes fonts while ignoring the text content, which makes it applicable to various types of documents. The embedding vectors are also evaluated on additional tasks, such as classifying text by weight (boldness) and slant, where they achieve high accuracy, confirming their usefulness for analyzing stylistic features. Experiments with variable and handwritten fonts show that the model can work with a variety of data. A comparison with the baseline model confirms the effectiveness of the proposed architecture. However, limitations were identified when working with low-quality data and multilingual texts. The code and models are published on GitHub (https://github.com/YRL-AIDA/FontEmb).
Keywords: convolutional neural networks, font classification, neural networks, computer fonts
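
The two-branch design summarized in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the published implementation: the kernel count, the 3x3 kernel size, global average pooling, concatenation as the way of combining features, the single linear layer with a sigmoid output, and all names (`conv2d_valid`, `embed`, `same_font_score`) are assumptions chosen only to show how one shared convolutional block can embed two text images before a linear layer compares them.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of one image with one kernel, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out


def embed(image, kernels):
    """Hypothetical shared block: one feature per kernel via global average pooling."""
    features = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        total = sum(sum(r) for r in fmap)
        features.append(total / (len(fmap) * len(fmap[0])))
    return features


def same_font_score(img_a, img_b, kernels, weights, bias):
    """Embed both images with the SAME kernels, concatenate, apply one linear layer."""
    combined = embed(img_a, kernels) + embed(img_b, kernels)  # list concatenation
    logit = sum(w * x for w, x in zip(weights, combined)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid: score in (0, 1)


# Random, untrained parameters and random "images" (assumptions for illustration).
rng = random.Random(0)
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]
weights = [rng.uniform(-1, 1) for _ in range(2 * len(kernels))]
img_a = [[rng.random() for _ in range(8)] for _ in range(8)]
img_b = [[rng.random() for _ in range(8)] for _ in range(8)]

score = same_font_score(img_a, img_b, kernels, weights, bias=0.0)
print(len(embed(img_a, kernels)), score)
```

Because both images pass through the same kernels, the embedding of an image does not depend on which branch it enters. That shared-weight property is what lets a single embedding vector be reused for other tasks, such as the weight and slant classification mentioned above.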