Abstract
Emotions are a vital semantic component of human communication. They are significant not only for communication between people but also essential for human–computer interaction. Effective communication between humans is achieved only when both the meaning and the emotion of the message are perceived by all parties involved. Understanding the meaning of language has traditionally been studied in natural language processing (NLP) as semantic analysis, in which text is processed appropriately for classification. Emotion detection from facial expressions is a subfield of social signal processing applied in a wide variety of areas, particularly human–computer interaction. Many researchers have proposed approaches to this problem, generally based on machine learning. Automatic emotion recognition (AER) is important for enabling seamless interaction between a person and a smart device, a step toward fully realizing an intelligent society. Several researchers have examined cross-lingual and multilingual speech emotion recognition as a stage toward language-independent emotion recognition in natural speech. In the present work, we propose a deep learning-based AER system evaluated on four publicly available datasets: the Basic Arabic Vocal Emotions Dataset (BAVED), the Acted Emotional Speech Dynamic Database (AESDD), the URDU dataset (Urdu speech written in Latin/Roman script), and the Toronto Emotional Speech Set (TESS). The system is implemented in a Jupyter notebook using Librosa, a Python library for music and audio analysis. The experimental results show that the proposed approach outperforms existing approaches: the accuracy of the proposed system is 96.24% on the URDU dataset, 99.10% on TESS, 65.97% on AESDD, and 73.12% on BAVED.
Change history
23 April 2026
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s11042-026-21620-z
Funding
No funding has been received for this work.
Ethics declarations
Competing interests
The authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sujatha, R., Chatterjee, J.M., Pathy, B. et al. RETRACTED ARTICLE: Automatic emotion recognition using deep neural network. Multimed Tools Appl 84, 33633–33662 (2025). https://doi.org/10.1007/s11042-024-20590-4

