You can use the following optional run parameters:
- `-s` - logs source directory (default: `logs-hf`)
- `-d` - logs destination directory (default: `./`)
- `-w` - path to the `workflow.json` file (default: `./`)
- `-o` - omit creating a new dedicated directory in the destination path (default: false)
For example:
python3 parser.py -s logs-hf -d parsed-logs -w workflow.json
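The flags above map naturally onto Python's `argparse`. The following is only a hypothetical sketch of how such options could be declared; the function name `build_arg_parser` and the attribute names (`source`, `destination`, `workflow`, `omit_dir`) are assumptions for illustration, and the actual `parser.py` may be organized differently.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Declare the four optional flags with the defaults listed above.

    Hypothetical sketch: names other than the flags themselves are assumed.
    """
    parser = argparse.ArgumentParser(description="Parse workflow logs")
    parser.add_argument("-s", dest="source", default="logs-hf",
                        help="logs source directory")
    parser.add_argument("-d", dest="destination", default="./",
                        help="logs destination directory")
    parser.add_argument("-w", dest="workflow", default="./",
                        help="path to the workflow.json file")
    # A boolean switch: absent means False, present means True.
    parser.add_argument("-o", dest="omit_dir", action="store_true",
                        help="do not create a dedicated directory")
    return parser


# Parse the same arguments as in the example command above.
args = build_arg_parser().parse_args(
    ["-s", "logs-hf", "-d", "parsed-logs", "-w", "workflow.json"]
)
print(args.source, args.destination, args.workflow, args.omit_dir)
```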
Logs are written to a directory named according to the pattern `<dest_dir>/<workflow_name>__<workflow_size>__<version>__<date_time>`, where:
- `dest_dir` - destination directory from the run parameters,
- `workflow_name` - extracted from the `workflow.json` file; `undefined` if the `name` key does not exist,
- `workflow_size` - extracted from the `workflow.json` file; the number of processes if the `size` key does not exist,
- `version` - extracted from the `workflow.json` file; `1.0.0` if the `version` key does not exist,
- `date_time` - timestamp in `%Y-%m-%d-%H-%M-%S` format,

e.g. `montage__0.25__1.0.0__2020-04-20-12-01-24`.
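The naming rule and its fallbacks can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the parser's actual code: the function name `destination_dir` is hypothetical, and the "number of processes" fallback is assumed to be the length of a `processes` list in `workflow.json`.

```python
from datetime import datetime
from pathlib import Path


def destination_dir(dest_dir: str, workflow: dict, now: datetime) -> Path:
    """Build <dest_dir>/<name>__<size>__<version>__<date_time>."""
    name = workflow.get("name", "undefined")
    # Assumption: "number of processes" means len(workflow["processes"]).
    size = workflow.get("size", len(workflow.get("processes", [])))
    version = workflow.get("version", "1.0.0")
    stamp = now.strftime("%Y-%m-%d-%H-%M-%S")
    return Path(dest_dir) / f"{name}__{size}__{version}__{stamp}"


wf = {"name": "montage", "size": "0.25", "version": "1.0.0"}
when = datetime(2020, 4, 20, 12, 1, 24)
print(destination_dir("parsed-logs", wf, when).name)
# montage__0.25__1.0.0__2020-04-20-12-01-24
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the function, keeps the result reproducible.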
The parser generates the following files in JSON Lines format:
- `job_descriptions.jsonl`
- `sys_info.jsonl`
- `metrics.jsonl`
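In JSON Lines, each line of a file is one standalone JSON document, so the generated files can be read with the standard library alone. A minimal sketch follows; the helper name `read_jsonl` is hypothetical, and the sample file is written to a temporary directory purely for demonstration.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def read_jsonl(path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


# Demonstration on a throwaway file with two records and a blank line.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "metrics.jsonl"
    sample.write_text(
        '{"parameter": "cpu", "value": 0}\n\n'
        '{"parameter": "ctime", "value": 30}\n',
        encoding="utf-8",
    )
    records = list(read_jsonl(sample))

print(len(records))
```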
Identifiers:
- `hyperflowId` - e.g. `HbD2SFH5`
- `workflowId` - e.g. `HbD2SFH5-16`
- `jobId` - e.g. `HbD2SFH5-16-44`
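Judging from the examples, each identifier extends the previous one with a hyphen-separated suffix, so a `jobId` can be decomposed into its parent identifiers. The sketch below assumes that nesting holds for all identifiers; `JobIds` and `split_job_id` are hypothetical names.

```python
from typing import NamedTuple


class JobIds(NamedTuple):
    hyperflow_id: str
    workflow_id: str
    job_id: str


def split_job_id(job_id: str) -> JobIds:
    """Derive hyperflowId and workflowId from a jobId.

    Splits from the right so that a hyperflowId which itself contains
    a hyphen (an assumption about possible values) stays intact.
    Raises ValueError if the jobId has fewer than two hyphens.
    """
    workflow_id, _job_number = job_id.rsplit("-", 1)
    hyperflow_id, _workflow_number = workflow_id.rsplit("-", 1)
    return JobIds(hyperflow_id, workflow_id, job_id)


ids = split_job_id("HbD2SFH5-16-44")
print(ids.hyperflow_id, ids.workflow_id, ids.job_id)
# HbD2SFH5 HbD2SFH5-16 HbD2SFH5-16-44
```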
Example `job_descriptions.jsonl` entry:
{
"workflowName":"montage",
"size":"0.25",
"version":"1.0.0",
"hyperflowId":"6ZYgjDbbG",
"jobId":"6ZYgjDbbG-1-29",
"env":{
"podIp":"10.40.0.56",
"nodeName":"gke-cluster-x-default-pool-917ea268-7zqx",
"podName":"job4kxmy-mdifffit-29-1-qxfg7",
"podServiceAccount":"default",
"podNamespace":"default"
},
"nodeName":"gke-cluster-x-default-pool-917ea268-7zqx",
"executable":"mBgModel",
"args":[
"-i",
"100000",
"pimages_20180402_165339_22325.tbl",
"fits.tbl",
"corrections.tbl"
],
"inputs":[
{
"name":"fits.tbl",
"size":3745
},
{
"name":"pimages_20180402_165339_22325.tbl",
"size":1936
}
],
"outputs":[
{
"name":"corrections.tbl",
"size":573
}
],
"name":"mBgModel",
"command":"mBgModel -i 100000 pimages_20180402_165339_22325.tbl fits.tbl corrections.tbl",
"execTimeMs":1030
}

Example `sys_info.jsonl` entry:
{
"cpu":{
"manufacturer":"Intel®",
"brand":"Xeon®",
"vendor":"",
"family":"",
"model":"",
"stepping":"",
"revision":"",
"voltage":"",
"speed":"2.00",
"speedmin":"",
"speedmax":"",
"governor":"",
"cores":2,
"physicalCores":2,
"processors":1,
"socket":"",
"cache":{
"l1d": 32768,
"l1i": 32768,
"l2": 1048576,
"l3": 40370176
}
},
"mem":{
"total":2095239168,
"free":130646016,
"used":1964593152,
"active":849100800,
"available":1246138368,
"buffers":105852928,
"cached":1078996992,
"slab":149585920,
"buffcache":1334435840,
"swaptotal":0,
"swapused":0,
"swapfree":0
},
"jobId":"6ZYgjDbbG-1-29"
}

There are two types of metrics (values of the `parameter` key):
- events - `event`
- measurements - `cpu`, `memory`, `ctime`, `io`, `network`
Possible values for events:
- `handlerStart`
- `jobStart`
- `jobEnd`
- `handlerEnd`
{
"time":"2020-03-30T17:22:45.160",
"workflowId":"6ZYgjDbbG-1",
"jobId":"6ZYgjDbbG-1-1",
"name":"mProjectPP",
"parameter":"event",
"value":"jobStart"
}

cpu
{
"time":"2020-03-30T17:23:31.083",
"pid":"8",
"workflowId":"6ZYgjDbbG-1",
"jobId":"6ZYgjDbbG-1-29",
"name":"mBgModel",
"parameter":"cpu",
"value":0
}

memory
{
...,
"parameter":"memory",
"value":11304960
}

ctime
{
...,
"parameter":"ctime",
"value":30
}

io
{
...,
"parameter":"io",
"value":{
"read":1225,
"write":1,
"readSyscalls":5,
"writeSyscalls":1,
"readReal":0,
"writeReal":0,
"writeCancelled":0
}
}

network
{
...,
"parameter":"network",
"value":{
"name":"eth0",
"rxBytes":5777,
"rxPackets":15,
"rxErrors":0,
"rxDrop":0,
"rxFifo":0,
"rxFrame":0,
"rxCompressed":0,
"rxMulticast":0,
"txBytes":1336,
"txPackets":15,
"txErrors":0,
"txDrop":0,
"txFifo":0,
"txColls":0,
"txCarrier":0,
"txCompressed":0
}
}
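
Because every metric record carries a `jobId` and a `parameter`, simple per-job summaries can be derived from `metrics.jsonl`. The sketch below is one possible approach, not part of the parser: `job_durations` and `peak_memory` are hypothetical helpers, and the sample records are illustrative (in particular, the `jobEnd` timestamp is made up for the example).

```python
from collections import defaultdict
from datetime import datetime


def job_durations(records):
    """Seconds between the jobStart and jobEnd events of each job."""
    marks = defaultdict(dict)
    for rec in records:
        if rec.get("parameter") == "event":
            marks[rec["jobId"]][rec["value"]] = datetime.fromisoformat(rec["time"])
    return {
        job: (m["jobEnd"] - m["jobStart"]).total_seconds()
        for job, m in marks.items()
        if "jobStart" in m and "jobEnd" in m  # skip incomplete jobs
    }


def peak_memory(records):
    """Largest `memory` measurement seen for each job."""
    peaks = {}
    for rec in records:
        if rec.get("parameter") == "memory":
            job = rec["jobId"]
            peaks[job] = max(peaks.get(job, 0), rec["value"])
    return peaks


# Illustrative records in the shape shown above.
sample = [
    {"time": "2020-03-30T17:22:45.160", "jobId": "6ZYgjDbbG-1-1",
     "parameter": "event", "value": "jobStart"},
    {"time": "2020-03-30T17:22:45.500", "jobId": "6ZYgjDbbG-1-1",
     "parameter": "memory", "value": 11304960},
    {"time": "2020-03-30T17:22:46.190", "jobId": "6ZYgjDbbG-1-1",
     "parameter": "event", "value": "jobEnd"},
]
durations = job_durations(sample)
peaks = peak_memory(sample)
print(durations, peaks)
```

Both helpers accept any iterable of dictionaries, so they can consume records streamed line by line from the file.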