{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T09:44:38Z","timestamp":1753868678141,"version":"3.41.2"},"reference-count":32,"publisher":"Wiley","issue":"23","license":[{"start":{"date-parts":[[2020,7,20]],"date-time":"2020-07-20T00:00:00Z","timestamp":1595203200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"name":"Scientific Research Project of Changzhou College of Information Technology","award":["CXZK201704Z"],"award-info":[{"award-number":["CXZK201704Z"]}]},{"name":"The Excellent Teaching Team of Qinglan Project in Jiangsu Universities 2017"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2020,12,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Workload prediction has been widely researched in the literature. However, existing techniques are per\u2010job based and useful for service\u2010like tasks whose workloads exhibit seasonality and trend. But cloud jobs have many different workload patterns and some do not exhibit recurring workload patterns. We consider job\u2010pool\u2010based workload estimation, which analyzes the characteristics of existing tasks' workloads to estimate the currently running tasks' workload. First cluster existing tasks based on their workloads. For a new task <jats:styled-content><jats:bold><jats:italic>J<\/jats:italic><\/jats:bold><\/jats:styled-content>, collect the initial workload of <jats:styled-content><jats:bold><jats:italic>J<\/jats:italic><\/jats:bold><\/jats:styled-content> and determine which cluster <jats:styled-content><jats:bold><jats:italic>J<\/jats:italic><\/jats:bold><\/jats:styled-content> may belong to, then use the cluster's characteristics to estimate <jats:styled-content><jats:bold><jats:italic>J<\/jats:italic><\/jats:bold>\u2032<\/jats:styled-content>s workload. Based on the Google dataset, the algorithm is experimentally evaluated and its effectiveness is confirmed. However, the workload patterns of some tasks do have seasonality and trend, and conventional per\u2010job\u2010based regression methods may yield better workload prediction results. Also, in some cases, some new tasks may not follow the workload patterns of existing tasks in the pool. Thus, develop an integrated scheme which combines clustering and regression and utilize the best of them for workload prediction. Experimental study shows that the combined approach can further improve the accuracy of workload prediction.<\/jats:p>","DOI":"10.1002\/cpe.5931","type":"journal-article","created":{"date-parts":[[2020,7,20]],"date-time":"2020-07-20T07:45:50Z","timestamp":1595231150000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Integrating clustering and regression for workload estimation in the cloud"],"prefix":"10.1002","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3768-4920","authenticated-orcid":false,"given":"Yongjia","family":"Yu","sequence":"first","affiliation":[{"name":"Changzhou College of Information Technology  Changzhou China"}]},{"given":"Vasu","family":"Jindal","sequence":"additional","affiliation":[{"name":"Department of Computer Science University of Texas at Dallas  Richardson Texas USA"}]},{"given":"I\u2010Ling","family":"Yen","sequence":"additional","affiliation":[{"name":"Department of Computer Science University of Texas at Dallas  Richardson Texas USA"}]},{"given":"Farokh","family":"Bastani","sequence":"additional","affiliation":[{"name":"Department of Computer Science University of Texas at Dallas  Richardson Texas USA"}]},{"given":"Jie","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Computing University of Leeds  Leeds UK"}]},{"given":"Peter","family":"Garraghan","sequence":"additional","affiliation":[{"name":"School of Computing and Communications Lancaster University  Lancaster UK"}]}],"member":"311","published-online":{"date-parts":[[2020,7,20]]},"reference":[{"key":"e_1_2_8_2_1","unstructured":"HamiltonFJ. Internet scale service efficiency. Paper presented at: Proceedings of the Large\u2010Scale Distributed Systems and Middleware Workshop;2008."},{"key":"e_1_2_8_3_1","doi-asserted-by":"crossref","unstructured":"YeY XiaoL YenIL BastaniFB. Leveraging service clouds for power and QoS management for mobile devices. Paper presented at: Proceedings of the IEEE CLOUD;2011:235\u2010242; Washington DC.","DOI":"10.1109\/CLOUD.2011.55"},{"key":"e_1_2_8_4_1","doi-asserted-by":"crossref","unstructured":"ShvachkoK KuangH RadiaS ChanslerR. The Hadoop distributed file system. Paper presented at: Proceedings of the ACM Symposium on Mass Storage Systems and Technologies; May2010:1\u201010.","DOI":"10.1109\/MSST.2010.5496972"},{"key":"e_1_2_8_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1773912.1773922"},{"key":"e_1_2_8_6_1","doi-asserted-by":"crossref","unstructured":"YeY XiaoL YenI\u2010L BastaniFB. Secure dependable and high performance cloud storage. Paper presented at: 29th IEEE Symposium on Reliable Distributed Systems;2010:194\u2010203.","DOI":"10.1109\/SRDS.2010.30"},{"key":"e_1_2_8_7_1","doi-asserted-by":"crossref","unstructured":"GmachD RoliaJ CherkasovaL KemperA. Workload analysis and demand prediction of enterprise data center applications. Paper presented at: Proceedings of the IEEE International Symposium on Workload Characterization;2007:171\u2010180.","DOI":"10.1109\/IISWC.2007.4362193"},{"key":"e_1_2_8_8_1","doi-asserted-by":"crossref","unstructured":"RoyN DubeyA GokhaleA. Efficient autoscaling in the cloud using predictive models for workload forecasting. Paper presented at: IEEE International Conference on Cloud Computing;2011:500\u2010507.","DOI":"10.1109\/CLOUD.2011.42"},{"key":"e_1_2_8_9_1","unstructured":"JhengJ TsengF ChaoH Chou LD. A novel VM workload prediction using grey forecasting model in cloud data center. Paper presented at: International Conference on Information Networking;2014:40\u201045."},{"key":"e_1_2_8_10_1","doi-asserted-by":"crossref","unstructured":"GarraghanP TownendP XuJ. An analysis of the server characteristics and resource utilization in Google cloud. Paper presented at: Proceedings of the IEEE International Conference on Cloud Engineering;2013:124\u2010131.","DOI":"10.1109\/IC2E.2013.40"},{"key":"e_1_2_8_11_1","unstructured":"ReissC WilkesJ. Google cluster\u2010usage traces: format + schema. Version May 6 Google Inc;2013."},{"key":"e_1_2_8_12_1","doi-asserted-by":"crossref","unstructured":"WuY YuanY YangG ZhengW. Load prediction using hybrid model for computational grid. Paper presented at: 8th IEEE\/ACM International Conference on Grid Computing;2007:235\u2010242.","DOI":"10.1109\/GRID.2007.4354138"},{"key":"e_1_2_8_13_1","doi-asserted-by":"crossref","unstructured":"CaronE DesprezF MuresanA. Forecasting on grid and cloud computing on\u2010demand resources based on pattern matching. Paper presented at: IEEE Second International Conference on Cloud Computing Technology & Science;2010;456\u2010463.","DOI":"10.1109\/CloudCom.2010.65"},{"key":"e_1_2_8_14_1","doi-asserted-by":"crossref","unstructured":"BobroffN KochutA BeatyK. Dynamic placement of virtual machines for managing SLA violations. Paper presented at: IEEE Internationnal Symposium on Integrated Network Management;2007:119\u2010128.","DOI":"10.1109\/INM.2007.374776"},{"key":"e_1_2_8_15_1","unstructured":"DanielS KwonM. Prediction\u2010based virtual instance migration for balanced workload in the cloud datacenters. RIT;2011."},{"key":"e_1_2_8_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comcom.2005.11.013"},{"key":"e_1_2_8_17_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1093"},{"key":"e_1_2_8_18_1","doi-asserted-by":"crossref","unstructured":"NiehorsterO KriegerA SimonJ BrinkmannA. Autonomic resource management with support vector machines. Paper presented at: 12th IEEE\/ACM International Conference on Grid Computing;2011:157\u2010164.","DOI":"10.1109\/Grid.2011.28"},{"key":"e_1_2_8_19_1","unstructured":"ChenY GanapathiAS GriffithR KatzRH. Analysis and lessons from a publicly available Google cluster trace. Tech. Rep. UCB\/EECS\u20102010\u201095;2010."},{"key":"e_1_2_8_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-014-1131-z"},{"key":"e_1_2_8_21_1","doi-asserted-by":"crossref","unstructured":"PatelJ JindalV YenIL BastaniF XuJ GarraghanP. Workload estimation for improving resource management decisions in the cloud. Paper presented at: IEEE 12th International Symposium on Autonomous Decentralized Systems;2015:25\u201032.","DOI":"10.1109\/ISADS.2015.17"},{"key":"e_1_2_8_22_1","doi-asserted-by":"crossref","unstructured":"YuY JindalV YenI\u2010L BastaniFB. Integrating clustering and learning for improved workload prediction in the cloud. Paper presented at: Proceedings of the IEEE CLOUD; July;2016:876\u2010879; San Francisco.","DOI":"10.1109\/CLOUD.2016.0127"},{"key":"e_1_2_8_23_1","doi-asserted-by":"crossref","unstructured":"YuY JindalV BastaniFB LiF YenI\u2010L. Improving the smartness of cloud management via machine learning based workload prediction. Paper presented at: Proceedings of the IEEE COMPSAC; July2018:38\u201044; Tokyo.","DOI":"10.1109\/COMPSAC.2018.10200"},{"issue":"3","key":"e_1_2_8_24_1","first-page":"377","article-title":"Energy\u2010efficient resource allocation and provisioning framework for cloud data centers","volume":"12","author":"Dabbagh M","year":"2015","journal-title":"IEEE TNSM"},{"key":"e_1_2_8_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2002.1017616"},{"key":"e_1_2_8_26_1","first-page":"405","volume-title":"Clustering by means of medoids","author":"Kaufman L","year":"1987"},{"key":"e_1_2_8_27_1","unstructured":"SalvadorS ChanP. FastDTW: toward accurate dynamic time warping in linear time and space. Paper presented at: Proceedings of the KDD Workshop on Mining Temporal and Sequential Data;2004:70\u201080."},{"key":"e_1_2_8_28_1","doi-asserted-by":"publisher","DOI":"10.26599\/BDMA.2018.9020021"},{"key":"e_1_2_8_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1007-0214(08)70036-1"},{"key":"e_1_2_8_30_1","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v027.i03"},{"key":"e_1_2_8_31_1","doi-asserted-by":"crossref","unstructured":"MorenoIS GarraghanP TownendP XuJ. An approach for characterizing workloads in Google cloud to derive realistic resource utilization models. Paper presented at: Proceedings of the 2013 IEEE Seventh International Symposium on Service\u2010Oriented System Engineering;2013:49\u201060.","DOI":"10.1109\/SOSE.2013.24"},{"key":"e_1_2_8_32_1","doi-asserted-by":"publisher","DOI":"10.26599\/BDMA.2019.9020010"},{"key":"e_1_2_8_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2015.06.039"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.5931","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.5931","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/cpe.5931","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.5931","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,4]],"date-time":"2023-09-04T11:09:41Z","timestamp":1693825781000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.5931"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,20]]},"references-count":32,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2020,12,10]]}},"alternative-id":["10.1002\/cpe.5931"],"URL":"https:\/\/doi.org\/10.1002\/cpe.5931","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2020,7,20]]},"assertion":[{"value":"2020-03-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-02","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e5931"}}