Abstract
Many educational institutions use data mining techniques to manage student records, particularly records of academic achievement, which are essential for improving learning experiences and outcomes. Educational data mining (EDM) is an active research field that applies data mining and machine learning methods to extract insights from educational databases, with a primary focus on predicting students’ academic performance. This study proposes a novel federated learning (FL) approach that preserves the confidentiality of the dataset while predicting student grades, categorized into four levels: low, good, average, and drop. Optimized features are incorporated into the training process to improve model precision. The optimized dataset is evaluated with five machine learning (ML) models: support vector machine (SVM), decision tree, Naïve Bayes, K-nearest neighbors, and the proposed FL model. Model performance is assessed in terms of accuracy, precision, recall, and F1-score, followed by a comparative analysis. The results show that FL and SVM outperform the other models in student grade classification. The study demonstrates the potential of federated learning to use educational data from multiple institutes while maintaining data privacy, contributing to educational data mining and machine learning research on student performance prediction.
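To illustrate how federated learning can keep each institute's records local, the following is a minimal sketch of a server-side aggregation step. It assumes a federated-averaging style rule in which each institute trains on its own data and shares only a parameter vector and its sample count; the abstract does not state which aggregation rule the study uses, and the function name, weights, and sample counts below are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Return the sample-size-weighted average of per-client parameter vectors.

    Assumption: each client (institute) has already trained locally and sends
    only its parameters and sample count, never its raw student records.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of the same length")

    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Clients with more samples contribute proportionally more.
            averaged[i] += w * n / total
    return averaged


# Hypothetical example: three institutes, each with a two-parameter local model.
institute_weights = [[0.2, -1.0], [0.4, 0.0], [1.0, 2.0]]
institute_sizes = [100, 300, 600]
global_weights = federated_average(institute_weights, institute_sizes)
print(global_weights)
```

In a full training loop, the server would send `global_weights` back to the institutes for another round of local training. Only the aggregation step is shown here.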











Availability of data and materials
Not applicable.
Acknowledgements
The authors would like to thank China’s National Key R&D Program for providing the experimental facilities used in these experiments. The authors also thank the Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, for its support.
Funding
This study is supported by the National Key R&D Program of China under project no. 2020YFB2104402.
Author information
Contributions
Farooq, Naseem, Mahmood, Li, Saba, Rehman, and Mustafa conceived the study and conducted the experiments. Farooq, Naseem, Mahmood, Saba, and Li reviewed, drafted, and revised the study. Naseem, Mahmood, Rehman, and Mustafa contributed to the design and analyzed the data. Farooq, Mahmood, Saba, Li, and Mustafa proofread the study. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethical approval
Not applicable.
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Farooq, U., Naseem, S., Mahmood, T. et al. Transforming educational insights: strategic integration of federated learning for enhanced prediction of student learning outcomes. J Supercomput 80, 16334–16367 (2024). https://doi.org/10.1007/s11227-024-06087-9
DOI: https://doi.org/10.1007/s11227-024-06087-9

