[BACKPORT 25.10] Add spot interruption tracking to trace records (#6606) #6674
bf2119b to db6e9a4
plugins/nf-amazon/src/test/nextflow/cloud/aws/batch/AwsBatchTaskHandlerTest.groovy
Track and report spot/preemptible instance interruptions for cloud batch executors.

Changes:
- Add `numSpotInterruptions` transient field to TraceRecord
- AWS Batch: detect spot interruptions by checking status reason pattern "Host EC2*"
- Google Batch: detect spot preemptions via exit code 50001 in status events
- Tower plugin: send numSpotInterruptions to Seqera Platform telemetry

This enables workflow optimization and cost analysis by tracking how often tasks are retried due to spot instance reclamation.

(cherry picked from commit eecd816)

Signed-off-by: Lorenzo Fontana <fontanalorenz@gmail.com>
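For readers skimming the change list, the two detection rules can be sketched roughly as below. This is a plain-Java illustration, not the plugin code: the class and method names are hypothetical, the "Host EC2*" pattern is assumed to mean "status reason starts with `Host EC2`", and the Google Batch exit codes are assumed to have already been extracted from the status events.

```java
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
final class SpotInterruptionSketch {

    // Assumption: "Host EC2*" is a glob, i.e. the status reason starts with "Host EC2".
    static final String AWS_SPOT_REASON_PREFIX = "Host EC2";

    // Exit code named in the commit message for Google Batch spot preemption.
    static final int GOOGLE_SPOT_EXIT_CODE = 50001;

    // True when a single AWS Batch attempt's status reason looks like a spot reclamation.
    static boolean isAwsSpotInterruption(String statusReason) {
        return statusReason != null && statusReason.startsWith(AWS_SPOT_REASON_PREFIX);
    }

    // Counts the attempts whose status reason matches the spot pattern.
    static int countAwsSpotInterruptions(List<String> attemptStatusReasons) {
        int count = 0;
        for (String reason : attemptStatusReasons) {
            if (isAwsSpotInterruption(reason)) count++;
        }
        return count;
    }

    // Counts status events that carry the spot preemption exit code.
    // Assumption: exit codes were already parsed out of the events.
    static int countGoogleSpotPreemptions(List<Integer> eventExitCodes) {
        int count = 0;
        for (Integer code : eventExitCodes) {
            if (code != null && code == GOOGLE_SPOT_EXIT_CODE) count++;
        }
        return count;
    }
}
```

The resulting count is what would end up in `numSpotInterruptions`; how the real handlers obtain the attempts and events is not shown here.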
db6e9a4 to b62f0e6
pditommaso left a comment
Looks good, just don't merge yet.
      def trace = handler.getTraceRecord()
      then:
    - 1 * handler.isCompleted() >> false
    + 2 * handler.isCompleted() >> false
I'm not sure I understand why this changed to 2.
Because another call to isCompleted() is added in the getNumSpotInterruptions method, and here we are testing the getTraceRecord() method, which already has one call to isCompleted().
What's the resolution here?
There is one call here
and another call is added here.
Both are in the getTraceRecord() method.
We can remove the one in getNumSpotInterruptions, but it is a separate method, so if it is called from somewhere else where isCompleted is not called, that could lead to an issue.
Yeah, I think this `!isCompleted()` check is not really needed.
Please let me know if anything else needs to be changed
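For anyone following the thread above, here is a rough sketch of why the mock expectation moved from 1 to 2. It is plain Java with hypothetical names, not the handler code: `traceRecord` stands in for `getTraceRecord()`, and the guarded variant is assumed to return `null` while the task is not completed.

```java
// Illustrative sketch only; all names are hypothetical stand-ins for the handler.
final class IsCompletedCallCountSketch {
    private final boolean completed;
    private final int interruptions;

    // Plays the role of the Spock "N * handler.isCompleted()" interaction count.
    int isCompletedCalls = 0;

    IsCompletedCallCountSketch(boolean completed, int interruptions) {
        this.completed = completed;
        this.interruptions = interruptions;
    }

    boolean isCompleted() {
        isCompletedCalls++;
        return completed;
    }

    // Guarded variant: checks completion itself.
    // Assumption: it returns null while the task is still running.
    Integer numSpotInterruptionsGuarded() {
        if (!isCompleted()) return null;
        return interruptions;
    }

    // Unguarded variant: relies on the caller to decide when to ask.
    Integer numSpotInterruptionsUnguarded() {
        return interruptions;
    }

    // Stand-in for getTraceRecord(): one isCompleted() check of its own,
    // followed by the interruption lookup.
    Integer traceRecord(boolean useGuard) {
        isCompleted(); // the call getTraceRecord() already made before this change
        return useGuard ? numSpotInterruptionsGuarded() : numSpotInterruptionsUnguarded();
    }
}
```

With the guard in place a single `traceRecord` call hits `isCompleted()` twice; dropping the guard brings the count back to one, at the cost of the method no longer protecting itself when called from elsewhere.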
plugins/nf-amazon/src/main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy
plugins/nf-amazon/src/test/nextflow/cloud/aws/batch/AwsBatchTaskHandlerTest.groovy
plugins/nf-google/src/main/nextflow/cloud/google/batch/GoogleBatchTaskHandler.groovy
Signed-off-by: Munish Chouhan <hrma017@gmail.com>
plugins/nf-google/src/test/nextflow/cloud/google/batch/GoogleBatchTaskHandlerTest.groovy
Adds `numSpotInterruptions` field to trace records for Tower/Platform telemetry (cherry picked from commit eecd816).