
Avtomat. i Telemekh., 2022 Issue 12, Pages 5–17 (Mi at16094)

This article is cited in 3 papers

Topical issue

Identification of affective states based on automatic analysis of texts of comments in social networks

Yu. Yu. Dyulicheva

Vernadsky Crimean Federal University, Simferopol, 295007 Russia

Abstract: The paper considers the problem of classifying 3553 English-language comments from the social network Reddit using several approaches to vectorizing comment texts: bag of words, TF–IDF, bigram analysis based on pointwise mutual information (PMI) and sentiment, and the deep language representation model BERT. A hybrid approach that combines BERT-based text vectorization with bigram analysis has made it possible to improve the quality of comment classification up to 91%. A cluster analysis of 1857 English-language comments describing anxiety identified clusters using BERT+k-means. The study proposes a hybrid approach that combines the LDA topic modeling method, the VADER sentiment analysis method, pointwise mutual information, and part-of-speech analysis to select bigrams and trigrams that describe clusters of comments. To visualize the extracted patterns in the form of trigrams, a knowledge graph describing the subject area was constructed. Comparing the words of the selected target trigrams with the words of a custom dictionary describing various affective disorders has made it possible to determine the types of psychosocial stressors associated with affective disorders.
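
The abstract names pointwise mutual information as one of the criteria for selecting bigrams but does not spell out the computation. A minimal sketch of the standard definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), is given below. The function name `bigram_pmi`, the whitespace tokenizer, the `min_count` parameter, and the toy comments are all assumptions made for illustration; they are not taken from the paper, which also combines PMI with sentiment and part-of-speech filtering that this sketch omits.

```python
import math
from collections import Counter


def bigram_pmi(comments, min_count=1):
    """Return {(w1, w2): PMI} for adjacent word pairs.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not the paper's preprocessing.
    """
    unigrams = Counter()
    bigrams = Counter()
    for text in comments:
        tokens = text.lower().split()
        unigrams.update(tokens)
        # Bigrams are formed inside one comment only, never across comments.
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs that are too rare to score reliably
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy comments invented for illustration; not data from the paper.
toy_comments = [
    "panic attack again today",
    "another panic attack at work",
    "work was fine today",
]
scores = bigram_pmi(toy_comments)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

PMI tends to reward pairs of rare words, so on small corpora a pair seen once can outrank a pair seen repeatedly. Raising `min_count` or adding a further filter is a common way to counter this.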

Keywords: bigram, sentiment analysis, LDA, BERT, VADER, BoW, TF-IDF, knowledge graph, mental health.

Presented by the member of Editorial Board: A. A. Lazarev

Received: 31.01.2022
Revised: 28.05.2022
Accepted: 29.06.2022

DOI: 10.31857/S0005231022120029


 English version:
Automation and Remote Control, 2022, 83:12, 1877–1885

