Software effort estimation using convolutional neural network and fuzzy clustering

Azzeh, Mohammad; Alkhateeb, Abedalrhman; Bou Nassif, Ali

doi:10.1007/s00521-024-09855-z

Original Article
Published: 07 May 2024

Volume 36, pages 14449–14464 (2024)

Neural Computing and Applications

A Correction to this article was published on 27 December 2024

This article has been updated

Abstract

Adopting an efficient software process model is critical for building high-quality software applications. An important factor in the software development process is an accurate estimate of the human effort required to complete a software project. While machine learning methods have long been used to develop estimation models, there has been little investigation into the potential of deep convolutional neural networks (DCNNs) for improving software effort estimation. One of the biggest obstacles to using a DCNN for this purpose is the nature of software datasets, which typically consist of vectorized samples rather than matrices. To overcome this obstacle and reduce vagueness in software attribute measurement, this study uses fuzzy theory to generate an appropriate two-dimensional representation of each data point. Fuzzy clustering is commonly used to split dataset samples into separate clusters, which in turn helps to generate fuzzy membership functions. The membership values make it straightforward to build a two-dimensional array for each data sample, which can then be used as input to the DCNN model. The proposed model was evaluated using PROMISE benchmark datasets. The findings, based on mean absolute error and standardized accuracy, show that the proposed model achieved low error rates and outperformed several current state-of-the-art effort estimation models. Nonetheless, further research is needed to determine the impact of different cluster numbers and features on the model's performance. In conclusion, this study emphasizes the potential of incorporating DCNNs into software effort estimation and highlights the viability of using fuzzy modeling and clustering techniques to enhance the data representation of software datasets.
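
The representation step described in the abstract can be illustrated with a minimal sketch. The article's exact construction is not available in this preview, so the following rests on stated assumptions: each attribute is clustered separately with one-dimensional fuzzy c-means, and a sample is mapped to a (features x clusters) array of membership values. The function names (`memberships`, `fit_centers`, `to_matrix`), the choice of three clusters, the fuzzifier m = 2, and the toy project data are all hypothetical and are not taken from the paper or from the PROMISE datasets.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each cluster center."""
    dists = [abs(x - v) for v in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # x coincides with one or more centers: share full membership among them.
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    p = 2.0 / (m - 1.0)
    inv = [d ** -p for d in dists]
    total = sum(inv)
    return [w / total for w in inv]


def fit_centers(values, c=3, m=2.0, iters=100, tol=1e-9):
    """Run 1-D fuzzy c-means on one attribute and return its cluster centers."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    for _ in range(iters):
        u = [memberships(x, centers, m) for x in values]
        new = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            total = sum(w)
            # Keep the old center if no sample has any weight on this cluster.
            new.append(sum(wi * x for wi, x in zip(w, values)) / total
                       if total > 0 else centers[k])
        shift = max(abs(a - b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return centers


def to_matrix(sample, centers_per_feature, m=2.0):
    """Map a feature vector to a (features x clusters) membership array."""
    return [memberships(x, centers, m)
            for x, centers in zip(sample, centers_per_feature)]


# Toy data (invented): rows are projects, columns are three numeric attributes.
projects = [
    [10.0, 2.0, 3.0],
    [12.0, 3.0, 4.0],
    [55.0, 6.0, 9.0],
    [60.0, 5.0, 10.0],
    [120.0, 12.0, 18.0],
    [130.0, 11.0, 20.0],
]
columns = list(zip(*projects))
centers_per_feature = [fit_centers(list(col), c=3) for col in columns]
matrix = to_matrix(projects[0], centers_per_feature)
```

Under these assumptions, every row of `matrix` sums to 1 and the array has a fixed shape for all samples, which is what allows it to be passed to a convolutional layer in place of a flat feature vector.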

Data availability

The data that support the findings of this study are openly available in a public repository: http://promise.site.uottawa.ca/SERepository/datasets-page.html.

Change history

27 December 2024
A Correction to this paper has been published: https://doi.org/10.1007/s00521-024-10906-8

Author information

Authors and Affiliations

Department of Data Science, Princess Sumaya University for Technology, Amman, Jordan
Mohammad Azzeh
Department of Computer Science, Lakehead University, Thunder Bay, Canada
Abedalrhman Alkhateeb
Department of Computer Engineering, University of Sharjah, Sharjah, UAE
Ali Bou Nassif

Corresponding author

Correspondence to Mohammad Azzeh.

Ethics declarations

Conflict of interest

The authors declare that they have no known conflict of interest or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised to correct the affiliation section.

Appendix A

1.1 Complete dataset descriptions

See Tables 6, 7, 8, 9, 10 and 11.

Table 6 Albrecht dataset description
Table 7 Kemerer dataset description
Table 8 Desharnais dataset description
Table 9 COCOMO dataset description
Table 10 Maxwell dataset description
Table 11 China dataset description

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Azzeh, M., Alkhateeb, A. & Bou Nassif, A. Software effort estimation using convolutional neural network and fuzzy clustering. Neural Comput & Applic 36, 14449–14464 (2024). https://doi.org/10.1007/s00521-024-09855-z

Received: 03 May 2023
Accepted: 12 April 2024
Published: 07 May 2024
Version of record: 07 May 2024
Issue date: August 2024
DOI: https://doi.org/10.1007/s00521-024-09855-z
