Software effort estimation using convolutional neural network and fuzzy clustering

  • Original Article
Neural Computing and Applications

A Correction to this article was published on 27 December 2024

This article has been updated

Abstract

Adopting an efficient software process model is critical for building high-quality software applications. An important factor in the software development process is an accurate estimate of the human effort required to complete a software project. Although machine learning methods have long been used to build estimation models, the potential of deep convolutional neural networks (DCNNs) for improving software effort estimation has received little investigation. One of the biggest obstacles to using a DCNN for this purpose is the typical form of software datasets, whose samples are usually feature vectors rather than matrices. To overcome this obstacle and reduce vagueness in software attribute measurement, this study uses fuzzy theory to generate an appropriate two-dimensional representation of each data point. Fuzzy clustering is used to split the dataset samples into clusters, from which fuzzy membership functions are generated. The membership values then yield a two-dimensional array for each data sample, which can be used as input to the DCNN model. The proposed model was evaluated on PROMISE benchmark datasets. Results based on mean absolute error and standardized accuracy show that the proposed model achieved low error rates and outperformed several current state-of-the-art effort estimation models. Nonetheless, further research is needed to determine the impact of different cluster numbers and features on the model's performance. In conclusion, this study highlights the potential of incorporating DCNNs into software effort estimation and the viability of using fuzzy modeling and clustering techniques to enhance the data representation of software datasets.
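
The idea of turning a vectorized sample into a two-dimensional array of fuzzy membership values can be illustrated with a minimal sketch. The abstract does not specify how the array is laid out, so the following is an assumption rather than the authors' method: each feature is clustered on its own with fuzzy c-means, and each sample becomes a (features × clusters) matrix of membership degrees. The function names (`memberships`, `fcm_1d`, `to_2d`), the cluster count, the fuzzifier `m = 2`, and the synthetic data are all hypothetical.

```python
import numpy as np


def memberships(x, centers, m=2.0):
    """Standard FCM membership: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - np.asarray(centers, dtype=float)[None, :])
    d = np.maximum(d, 1e-12)  # guard against division by zero at a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fcm_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means on one feature; returns the cluster centers, sorted."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    # Random initial membership matrix whose rows sum to 1.
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    centers = np.zeros(n_clusters)
    for _ in range(n_iter):
        w = u ** m
        new_centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        u = memberships(x, new_centers, m)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return np.sort(centers)


def to_2d(X, n_clusters=3, m=2.0):
    """Map each sample (row of X) to an (n_features, n_clusters) membership matrix.

    Assumed layout: row j holds the sample's membership in the clusters
    found for feature j. This is an illustrative choice, not the paper's.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    out = np.empty((n_samples, n_features, n_clusters))
    for j in range(n_features):
        centers = fcm_1d(X[:, j], n_clusters, m)
        out[:, j, :] = memberships(X[:, j], centers, m)
    return out


# Synthetic stand-in for a project dataset: 20 projects, 4 numeric attributes.
X_demo = np.random.default_rng(1).random((20, 4)) * 100
arrays = to_2d(X_demo, n_clusters=3)
print(arrays.shape)
```

Each row of a resulting matrix sums to 1, as fuzzy c-means memberships do. To feed such arrays to a convolutional layer, a trailing channel axis would typically be added (for example `arrays[..., None]`); the network architecture itself is not described in the abstract and is not sketched here.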

Data availability

The data that support the findings of this study are openly available in a public repository: http://promise.site.uottawa.ca/SERepository/datasets-page.html.

Author information

Corresponding author

Correspondence to Mohammad Azzeh.

Ethics declarations

Conflict of interest

The authors declare that they have no known conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised to correct the affiliation section.

Appendix A

1.1 Complete dataset descriptions

See Tables 6, 7, 8, 9, 10 and 11.

Table 6 Albrecht dataset description
Table 7 Kemerer dataset description
Table 8 Desharnais dataset description
Table 9 COCOMO dataset description
Table 10 Maxwell dataset description
Table 11 China dataset description

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Azzeh, M., Alkhateeb, A. & Bou Nassif, A. Software effort estimation using convolutional neural network and fuzzy clustering. Neural Comput & Applic 36, 14449–14464 (2024). https://doi.org/10.1007/s00521-024-09855-z

  • DOI: https://doi.org/10.1007/s00521-024-09855-z
