
RETRACTED ARTICLE: Automatic emotion recognition using deep neural network

Published in Multimedia Tools and Applications

This article was retracted on 23 April 2026


Abstract

Emotions are a vital semantic component of human communication. They matter not only for communication between people but also for human–computer interaction: effective communication is achieved only when both the meaning and the emotion of a message are perceived by everyone involved. Understanding the meaning of language has traditionally been studied in natural language processing (NLP) as semantic analysis, where text can be processed directly for classification. Emotion detection from facial expressions is a subfield of social signal processing applied in a wide variety of areas, particularly human–computer interaction, and many researchers have proposed approaches to it, mostly based on machine learning. Automatic emotion recognition (AER) is important for enabling seamless interaction between people and smart devices and is a step toward a fully intelligent society. Several studies have examined cross-lingual and multilingual speech emotion recognition as a step toward language-independent emotion recognition in natural speech. In the present work, we propose a deep learning-based AER system evaluated on four publicly available datasets: the Basic Arabic Vocal Emotions Dataset (BAVED), the Acted Emotional Speech Dynamic Database (AESDD), the URDU dataset, and the Toronto Emotional Speech Set (TESS). The system is implemented in a Jupyter notebook using Librosa, a Python library for music and audio analysis. The experimental results show that the proposed approach outperforms existing approaches, achieving an accuracy of 96.24% on the URDU dataset, 99.10% on TESS, 65.97% on AESDD, and 73.12% on BAVED.
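
As a rough illustration of the workflow described above (acoustic feature extraction with Librosa followed by a deep neural network classifier), the sketch below extracts MFCC features and trains a small Keras dense network. The feature choice, architecture, hyper-parameters, and directory layout are assumptions for illustration only, not the paper's reported implementation.

```python
# Minimal sketch of an AER pipeline in the spirit of the abstract:
# Librosa MFCC features + a small Keras dense network.
# Feature set, architecture, and data/<emotion_label>/<clip>.wav layout
# are illustrative assumptions, not the authors' reported implementation.
import glob
from pathlib import Path

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow import keras


def extract_features(path, n_mfcc=40):
    """Load one audio clip and return its time-averaged MFCC vector."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)


# Hypothetical on-disk layout: data/<emotion_label>/<clip>.wav
paths = glob.glob("data/*/*.wav")
X = np.array([extract_features(p) for p in paths])
y = LabelEncoder().fit_transform([Path(p).parent.name for p in paths])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# A simple fully connected classifier over the MFCC vectors.
model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1],)),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(int(np.unique(y).size), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=30, batch_size=32,
          validation_data=(X_test, y_test))
print("Test accuracy:", model.evaluate(X_test, y_test, verbose=0)[1])
```

In practice, each of the four datasets (BAVED, AESDD, URDU, TESS) has its own file layout and label scheme, so the loading and labelling steps above would need to be adapted per dataset.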

Funding

No funding has been received for this work.

Author information

Corresponding author

Correspondence to Jyotir Moy Chatterjee.

Ethics declarations

Competing interests

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been retracted. Please see the retraction notice for more detail: https://doi.org/10.1007/s11042-026-21620-z

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sujatha, R., Chatterjee, J.M., Pathy, B. et al. RETRACTED ARTICLE: Automatic emotion recognition using deep neural network. Multimed Tools Appl 84, 33633–33662 (2025). https://doi.org/10.1007/s11042-024-20590-4

Profiles

  1. R. Sujatha
  2. Jyotir Moy Chatterjee
  3. Yu-Chen Hu