Abstract
Adopting an efficient software process model is critical for building high-quality software applications. An important factor in the software development process is an accurate estimate of the human effort required to complete a software project. While machine learning methods have long been used to build estimation models, there has been little investigation into the potential of deep convolutional neural networks (DCNNs) for improving software effort estimation. One of the biggest obstacles to using a DCNN for this purpose is the typical structure of software datasets, whose samples are feature vectors rather than matrices. To overcome this obstacle and reduce vagueness in software attribute measurement, this study uses fuzzy theory to generate an appropriate two-dimensional representation of each data point. Fuzzy clustering splits the dataset samples into clusters, from which fuzzy membership functions are generated. The membership values are then arranged into a two-dimensional array for each data sample, which serves as input to the DCNN model. The proposed model was evaluated on PROMISE benchmark datasets. The findings, based on mean absolute error and standardized accuracy, show that the proposed model achieved low error rates and outperformed several current state-of-the-art effort estimation models. Nonetheless, further research is needed to determine the impact of different cluster numbers and features on the performance of the model. In conclusion, this study emphasizes the potential of incorporating DCNNs into software effort estimation and highlights the viability of using fuzzy modeling and clustering techniques to enhance the data representation of software datasets.
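The abstract does not spell out how the two-dimensional array is assembled, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that fuzzy c-means is run separately on each feature column, and that each sample is then represented as a (features x clusters) matrix of membership values. The function names (`fcm_1d`, `memberships`, `to_membership_images`), the cluster count, the fuzzifier `m = 2`, and the random toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of each value in each cluster (rows sum to 1)."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - centers[None, :]) + 1e-12  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on one feature column; returns sorted cluster centers."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    # Random initial membership matrix; each row is normalized to sum to 1.
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Centers are membership-weighted means of the feature values.
        centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        new_u = memberships(x, centers, m)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return np.sort(centers)

def to_membership_images(X, n_clusters=3, m=2.0):
    """Assumed layout: one (n_features, n_clusters) membership matrix per sample."""
    X = np.asarray(X, dtype=float)
    per_feature = []
    for j in range(X.shape[1]):
        centers = fcm_1d(X[:, j], n_clusters, m)
        per_feature.append(memberships(X[:, j], centers, m))
    # Result shape: (n_samples, n_features, n_clusters).
    return np.stack(per_feature, axis=1)

# Hypothetical toy data: 20 projects described by 4 numeric attributes.
X = np.random.default_rng(42).random((20, 4))
images = to_membership_images(X, n_clusters=3)
print(images.shape)  # (20, 4, 3)
```

Each slice `images[i]` is a small matrix that a convolutional layer could consume, for example after adding a channel axis. Whether the paper clusters per feature or over whole samples, and how many clusters it uses, is not stated in the abstract, so those choices should be treated as open.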








Data availability
The data that support the findings of this study are openly available in a public repository: http://promise.site.uottawa.ca/SERepository/datasets-page.html.
Change history
27 December 2024
A Correction to this paper has been published: https://doi.org/10.1007/s00521-024-10906-8
Ethics declarations
Conflict of interest
The authors declare that they have no known conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
The original online version of this article was revised to correct the affiliation section.
About this article
Cite this article
Azzeh, M., Alkhateeb, A. & Bou Nassif, A. Software effort estimation using convolutional neural network and fuzzy clustering. Neural Comput & Applic 36, 14449–14464 (2024). https://doi.org/10.1007/s00521-024-09855-z
DOI: https://doi.org/10.1007/s00521-024-09855-z

