Fix issues in pod issue detection #4703

Merged
JamesMurkin merged 2 commits into master from fix_executor_platform_issue_detection
Feb 20, 2026

@JamesMurkin
Contributor

  • Removed the util.HasCurrentStateBeenReported(pod) check; issue detection now ignores all terminal pods
    • Other processes handle terminal pods, so those pods are left to them
    • The util.HasCurrentStateBeenReported(pod) check only added a race between detecting issues and reporting state
  • Pods stuck in a terminal state are no longer handled here IF the executor was the one that called delete
    • The cluster context called delete, so it handles this issue itself
    • The issue handler now tells the cluster context about the potential issue
      • This way the stuck pod still gets handled if the executor restarts
      • No events need to be sent about it, as the executor was already trying to delete the pod
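The decision order described in the list above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the `Pod` struct, the `Action` type and the `Decide` function are hypothetical names. It also assumes that a "stuck" pod is one whose deletion was requested but which still exists, that this is tracked separately from the pod phase, and that the terminal check runs first; the PR text does not spell out any of these.

```go
package main

// Pod is a hypothetical, trimmed-down view of a pod. The real executor
// works with Kubernetes pod objects, which are not reproduced here.
type Pod struct {
	Name              string
	Terminal          bool // phase is Succeeded or Failed
	StuckDeleting     bool // assumption: delete was requested, pod still exists
	DeletedByExecutor bool // assumption: the executor issued the delete
}

// Action names what issue detection does with a pod (hypothetical type).
type Action string

const (
	// Ignore: terminal pods are left to the other processes that handle them.
	Ignore Action = "ignore"
	// DeferToClusterContext: notify the cluster context and send no events,
	// because the executor was already trying to delete the pod.
	DeferToClusterContext Action = "defer-to-cluster-context"
	// HandleIssue: the issue handler deals with the stuck pod itself.
	HandleIssue Action = "handle-issue"
	// NoIssue: nothing was detected for this pod.
	NoIssue Action = "no-issue"
)

// Decide applies the checks in the order assumed above. Note that there is
// no "has the current state been reported" check: terminal pods are skipped
// unconditionally, which avoids racing with state reporting.
func Decide(p Pod) Action {
	switch {
	case p.Terminal:
		return Ignore
	case p.StuckDeleting && p.DeletedByExecutor:
		return DeferToClusterContext
	case p.StuckDeleting:
		return HandleIssue
	default:
		return NoIssue
	}
}
```

To try it, call `Decide` from a `main` function, for example with `Pod{StuckDeleting: true, DeletedByExecutor: true}`, and compare the result against the `Action` constants.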

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
@JamesMurkin JamesMurkin marked this pull request as ready for review February 20, 2026 14:13
@JamesMurkin JamesMurkin enabled auto-merge (squash) February 20, 2026 16:57
@JamesMurkin JamesMurkin merged commit 520ca98 into master Feb 20, 2026
15 checks passed
@JamesMurkin JamesMurkin deleted the fix_executor_platform_issue_detection branch February 20, 2026 17:10