PARTICIPANTS EMOTION RECOGNITION IN VIDEO CONFERENCES USING MULTIMODAL ANALYSIS

Authors

T. O. Savchuk, I. P. Pastukh

DOI:

https://doi.org/10.32782/tnv-tech.2024.4.13

Keywords:

emotion recognition, convolutional neural networks, mathematical model, video conferencing, spectrograms, voice analysis.

Abstract

The paper addresses the task of analyzing emotions in video communication. As video meetings become the norm in business, education, and personal contacts, understanding participants' moods is important for the quality of interaction: it helps to adapt communication, resolve conflicts at an early stage, and improve the overall perception of the exchange. Powerful emotion recognition software such as FaceReader and Microsoft Project Oxford is available, but its effectiveness is limited because it analyzes facial expressions alone, and the resulting recognition accuracy is often insufficient, which calls for improved analysis methods. This paper proposes a novel approach to recognizing the emotions of video conference participants through multimodal analysis that combines the processing of physical voice characteristics with facial expressions. Convolutional neural networks are used to identify emotional states with high accuracy while tolerating various distortions of the input data. The technique involves analyzing voice data, normalizing it, and converting it into spectrograms for further processing by a neural network. Special attention is paid to training the network with the gradient descent method to improve recognition accuracy. Experimental results demonstrate the advantage of the proposed method over existing software tools, with an increase in emotion recognition accuracy of up to 79%, which is a significant improvement.
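The voice-processing steps named in the abstract (normalization, then conversion to a spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the peak normalization, the Hann window, the frame length of 256 samples, the hop of 128 samples, and the synthetic 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def normalize_peak(signal):
    """Scale a waveform so its largest absolute sample is 1.0 (assumed scheme)."""
    signal = np.asarray(signal, dtype=np.float64)
    peak = np.max(np.abs(signal))
    return signal if peak == 0 else signal / peak

def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and is defined at zero.
    return np.log1p(magnitude)

# Synthetic stand-in for a voice recording: a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = 0.3 * np.sin(2 * np.pi * 440 * t)

normalized = normalize_peak(tone)
spec = log_spectrogram(normalized)

# Frequency bin with the most energy, averaged over all frames.
peak_bin = int(np.argmax(spec.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 256
```

The resulting two-dimensional array is the kind of image-like input a convolutional network could consume; real speech would replace the test tone, and the frame parameters would need tuning for the recordings at hand.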

References

Ekman, P. Basic emotions. Handbook of cognition and emotion, 1999. 45-60.

Fredrickson, B. L. The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. American psychologist, 2001. 56(3), 218-226.

Facereader. URL: https://www.noldus.com/facereader

Happy? Sad? Angry? This Microsoft tool recognizes emotions in pictures. URL: https://blogs.microsoft.com/ai/happy-sad-angry-this-microsoft-tool-recognizes-emotions-in-pictures/

Understanding Audio data, Fourier Transform, FFT and Spectrogram features for a Speech Recognition System. URL: https://towardsdatascience.com/understanding-audio-data-fourier-transform-fft-spectrogram-and-speech-recognition-a4072d228520

Savchuk T. O., Pastukh I. P. Recognition of emotions of video conference participants in Microsoft Teams. Taurida Scientific Herald. Series: Technical Sciences. Kherson: Helvetica Publishing House, 2023. Issue 6. P. 18-24. https://doi.org/10.32851/tnvtech.6.3

Published

2024-12-05

How to Cite

Savchuk, T. O., & Pastukh, I. P. (2024). PARTICIPANTS EMOTION RECOGNITION IN VIDEO CONFERENCES USING MULTIMODAL ANALYSIS. Taurida Scientific Herald. Series: Technical Sciences, (4), 138-146. https://doi.org/10.32782/tnv-tech.2024.4.13

Issue

No. 4 (2024)

Section

COMPUTER SCIENCE AND INFORMATION TECHNOLOGY