
Transforming educational insights: strategic integration of federated learning for enhanced prediction of student learning outcomes

The Journal of Supercomputing

Abstract

Many educational institutions use data mining techniques to manage student records, particularly records of academic achievement, which are essential for improving learning experiences and outcomes. Educational data mining (EDM) is an active research field that applies data mining and machine learning methods to extract insights from educational databases, with a primary focus on predicting students’ academic performance. This study proposes a novel federated learning (FL) framework that preserves the confidentiality of the dataset while predicting student grades, categorized into four levels: low, good, average, and drop. Optimized features are incorporated into the training process to improve model precision. The optimized dataset is evaluated with five machine learning (ML) models: support vector machine (SVM), decision tree, Naïve Bayes, K-nearest neighbors, and the proposed federated learning model. Performance is assessed in terms of accuracy, precision, recall, and F1-score, followed by a comparative analysis. The results show that FL and SVM outperform the other models for student grade classification. The study demonstrates the potential of federated learning to use educational data from multiple institutes while maintaining data privacy, contributing to educational data mining and machine learning research on student performance prediction.
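To make the privacy argument concrete, the following minimal sketch shows one common way federated learning combines models without pooling raw student records: each institute trains locally and shares only its parameter vector, and a server takes a sample-size-weighted mean (federated averaging). The abstract does not state which aggregation rule or metric averaging the authors use, so `federated_average`, `macro_scores`, the three hypothetical institutes, and their weights are illustrative assumptions rather than the paper's method.

```python
from typing import Dict, List, Sequence

# The four grade levels named in the abstract.
LABELS = ["low", "average", "good", "drop"]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted mean of client parameter vectors.

    Only parameters and sample counts reach the server; the raw
    records stay at each institute. (Assumed aggregation rule.)
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str],
                 labels: Sequence[str] = LABELS) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Three hypothetical institutes with 100, 300, and 100 local records.
global_weights = federated_average(
    [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]], [100, 300, 100]
)  # approximately [0.4, 0.6]

scores = macro_scores(["low", "low", "good", "drop"],
                      ["low", "good", "good", "drop"])
```

Weighting by sample count keeps a small institute from dominating the global model, and macro averaging gives each of the four grade levels equal weight in the reported scores; both are assumed design choices here.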


Figs. 1–6 and Algorithms 1–5 (images not reproduced)


Availability of data and materials

Not applicable.


Acknowledgements

The authors would like to thank China’s National Key R&D Program for providing the experimental facilities used in these experiments. The authors also thank the Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, for their support.

Funding

This study is supported by the National Key R&D Program of China (project no. 2020YFB2104402).

Author information

Contributions

Farooq, Naseem, Mahmood, Li, Saba, Rehman, and Mustafa conceived the study and conducted the experiments. Farooq, Naseem, Mahmood, Saba, and Li reviewed, drafted, and revised the manuscript. Naseem, Mahmood, Rehman, and Mustafa contributed to the design and analyzed the data. Farooq, Mahmood, Saba, Li, and Mustafa proofread the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Tariq Mahmood.

Ethics declarations

Ethical approval

Not applicable.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Farooq, U., Naseem, S., Mahmood, T. et al. Transforming educational insights: strategic integration of federated learning for enhanced prediction of student learning outcomes. J Supercomput 80, 16334–16367 (2024). https://doi.org/10.1007/s11227-024-06087-9


  • DOI: https://doi.org/10.1007/s11227-024-06087-9
