
Deadlock on 25.10.3 on AWS Batch #6802

@jchorl

Description


Bug report

Hi! I hit a deadlock using the AWS Batch executor on 25.10.3. Specifically, I believe the issue is related to #6729, which was backported to 25.10.3 in f59137e.

That PR also puts quite a bit more load on DescribeJobs: it calls describeJob for every task.

Unlike the task polling supervisor, it does not pass in a context, so these calls are not batched:

```groovy
if( context ) {
    // check if this response is cached in the batch collector
    if( context.contains(jobId) ) {
        log.trace "[AWS BATCH] hit cache for describe job=$jobId"
        return context.get(jobId)
    }
    log.trace "[AWS BATCH] missed cache for describe job=$jobId"
    // get next 100 job ids for which it's required to check the status
    batchIds = context.getBatchFor(jobId, 100)
}
```

AWS doesn't publish a rate limit for DescribeJobs, but it would be nice if these calls were batched, since they also go through the throttle-wrapped AWS client. That's not the chief issue here, though.
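For illustration, batching here would mean grouping pending job IDs so one DescribeJobs request covers up to 100 tasks instead of one request per task. A minimal sketch of the chunking logic (plain Java, not the actual Nextflow code; the 100-ID limit matches what the BatchContext excerpt above requests via getBatchFor):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchIds {
    // Partition pending job ids into chunks of at most `size` elements,
    // so a single DescribeJobs call can cover many tasks at once.
    static List<List<String>> partition(List<String> ids, int size) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += size)
            out.add(new ArrayList<>(ids.subList(i, Math.min(i + size, ids.size()))));
        return out;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 250; i++) ids.add("job-" + i);
        // 250 pending jobs collapse into 3 API calls instead of 250
        System.out.println(partition(ids, 100).size());
    }
}
```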

Expected behavior and actual behavior

I'd expect my pipeline to run to completion. Instead, it deadlocks.

Steps to reproduce the problem

I have a pipeline that kicks off 176 tasks, then pairs the results and kicks off another 88. On 25.10.3 it deadlocks before submitting any of the 88. Unfortunately that pipeline is very complex and I don't have a minimal reproducible example, but the resources below should help.

Program output

I generated a stack dump with jstack:

thread_dump.txt

Admittedly, I used Claude to analyze it:

```
All 10 AWSBatch-executor threads are deadlocked waiting for each other:

AWSBatch-executor-1 through 10:
  ThrottlingExecutor$Recoverable.call()        ← Running a task submission
    → ParallelPollingMonitor$1.invoke()
    → submit0() → submit()
    → notifyTaskSubmit()
    → getTraceRecord()                          ← Called during submission!
    → getNumSpotInterruptions()
    → describeJob()
    → ClientProxyThrottler.invokeMethod()
    → doInvoke1()
    → FutureTask.get()                          ← BLOCKED waiting for result

The deadlock cycle:
1. Pool has 10 threads (likely availableProcessors * 5 on a 2-core machine)
2. All 10 threads are executing task submissions
3. During submission, getTraceRecord() is called (from Session.notifyTaskSubmit())
4. getTraceRecord() calls getNumSpotInterruptions() → describeJob() → submits to the same executor and blocks
5. No threads are available to execute the submitted describeJob tasks
6. Deadlock

The "Task monitor" thread is also blocked on checkIfRunning() → describeJob(), waiting for a thread that will never be free.
```

Regardless, this explanation matches the behaviour I saw, and reinstating the isCompleted() check in getNumSpotInterruptions() fixes the issue.
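The thread-starvation pattern described in the analysis can be reproduced in isolation. A minimal sketch (plain Java rather than Nextflow code; a 1-thread pool stands in for the 10-worker AWSBatch-executor pool, and a bounded wait replaces the indefinite FutureTask.get() purely so the demo terminates):

```java
import java.util.concurrent.*;

public class SelfSubmitDeadlock {
    public static void main(String[] args) throws Exception {
        // Single worker, standing in for the fully-occupied executor pool
        ExecutorService pool = Executors.newFixedThreadPool(1);

        Future<String> outer = pool.submit(() -> {
            // The worker itself submits follow-up work to the SAME pool
            // (analogous to describeJob() being invoked during submission)...
            Future<String> inner = pool.submit(() -> "describeJob result");
            // ...and blocks waiting for it. With every worker busy doing
            // exactly this, the inner task can never start.
            try {
                return inner.get(2, TimeUnit.SECONDS); // bounded, for the demo
            } catch (TimeoutException e) {
                return "DEADLOCK";
            }
        });

        System.out.println(outer.get()); // prints "DEADLOCK"
        pool.shutdownNow();
    }
}
```

With the real code's unbounded FutureTask.get(), the same cycle hangs forever instead of timing out.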

Environment

  • Nextflow version: 25.10.3
  • Java version: 21
  • Operating system: linux
  • Bash version: N/A

Critically, I was running on a 2-core AWS instance. I believe this limits the number of worker threads available and makes the deadlock easier to hit.
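If the pool-sizing rule guessed in the analysis (workers = availableProcessors * 5) is right, it is easy to check what a given machine would get; a quick probe, with the multiplier being an assumption from the stack-dump analysis rather than a confirmed Nextflow constant:

```java
public class PoolSize {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed sizing rule: cores * 5, i.e. 10 workers on a 2-core
        // instance — a pool small enough for 176 submissions to exhaust.
        System.out.println(cores * 5);
    }
}
```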

Additional context

Can we backport a fix that checks isCompleted() again? I'm not sure whether the Google executors or other executors have the same issue, or have fixed it differently.
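As I understand it, the guard being asked for looks roughly like this (a hypothetical sketch with stubbed-out methods, not the actual Nextflow implementation): the spot-interruption count is only fetched once the task has completed, so nothing on the submission path ever re-enters the throttled executor.

```java
public class SpotGuard {
    static boolean completed = false;   // stub for the task's isCompleted() state
    static int describeJobCalls = 0;

    // Stub standing in for the throttle-wrapped AWS describeJob call
    static int describeJob() {
        describeJobCalls++;
        return 2; // pretend the job recorded 2 spot interruptions
    }

    // Sketch of the reinstated guard: skip the AWS lookup entirely
    // unless the task has already completed.
    static int getNumSpotInterruptions() {
        if (!completed)
            return 0;       // submission path: no describeJob(), no deadlock
        return describeJob();
    }

    public static void main(String[] args) {
        System.out.println(getNumSpotInterruptions()); // during submission: 0
        completed = true;
        System.out.println(getNumSpotInterruptions()); // after completion: 2
        System.out.println(describeJobCalls);          // API was hit only once
    }
}
```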
