Make deployments self-sufficient and add E2E restart test #750

Merged: prathamesh0 merged 4 commits into main from pj-so-078-self-sufficient-deployments on Apr 28, 2026

@prathamesh0

Collaborator
  • deploy create now copies each pod's commands.py into <deployment>/hooks/. call_stack_deploy_start loads the hooks from there, so deployment start and deployment restart no longer need the live stack source on disk to run the start() hook.
  • Only the start() hook is affected. init, setup, and create still load from the live source, because they run only at deploy create time, when the source is guaranteed to be present.
  • Multi-repo stacks produce hooks/commands_0.py, hooks/commands_1.py, …; call_stack_deploy_start loads them all in sorted order.
  • Adds tests/k8s-deploy/run-restart-test.sh, which covers the full single-repo restart cycle (v1 -> mutate working tree -> restart re-copies and re-executes v2) as well as the multi-repo file naming and multi-hook invocation. The script is wired into the existing K8s Deploy Test workflow.
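
The copy-then-load flow described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: copy_hooks and load_start_hooks are hypothetical names, the plain commands.py filename for single-repo stacks is an assumption, and the real call_stack_deploy_start may differ in how it names, discovers, and invokes hooks.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Callable, List


def copy_hooks(pod_sources: List[Path], deployment_dir: Path) -> List[Path]:
    """Copy each pod's commands.py into <deployment>/hooks/ (hypothetical helper)."""
    hooks_dir = deployment_dir / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, source in enumerate(pod_sources):
        # Assumption: a single-repo stack keeps the plain name, while a
        # multi-repo stack gets commands_0.py, commands_1.py, ...
        name = "commands.py" if len(pod_sources) == 1 else f"commands_{index}.py"
        target = hooks_dir / name
        shutil.copyfile(source, target)
        copied.append(target)
    return copied


def load_start_hooks(deployment_dir: Path) -> List[Callable]:
    """Load start() from every copied hook file, in sorted filename order."""
    hooks = []
    for path in sorted((deployment_dir / "hooks").glob("commands*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        start = getattr(module, "start", None)
        # A hook file without a start() function is simply skipped.
        if callable(start):
            hooks.append(start)
    return hooks
```

Because load_start_hooks reads only from the deployment directory, the sketch keeps working after the original pod sources are deleted, which is the property the PR is after.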

pranavjadhav007 and others added 4 commits April 27, 2026 19:37
Covers two scenarios on a single Kind cluster:
- Single-repo: deploy create copies commands.py into hooks/, deployment
  start runs it, mutating the stack-source working tree to v2 + deployment
  restart re-copies and re-executes v2.
- Multi-repo: stack with two pod repos produces hooks/commands_0.py +
  commands_1.py, deployment start invokes both pod start() hooks.

The test stages stack files into a temp git clone (bare + working) so
restart's git pull has a real upstream. busybox pods keep the harness
trivial. Phase 2 uses kubectl wait directly because deployment ps's
substring filter (deploy_k8s.py:1366) doesn't list multi-pod stacks.

Also tightens the _copy_hooks docstring to spell out that only
call_stack_deploy_start loads from the copied location.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rename the stack and script accordingly:

- test-restart-hook stack → test-restart
- test-restart-hook-multi stack → test-restart-multi
- run-restart-hook-test.sh → run-restart-test.sh
- start-hook-marker file → marker
- pod-repo dirs test-restart-hook-pod-{a,b} → test-restart-pod-{a,b}

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y test

dns_probe.py used 'list[str]' (PEP 585 generic alias for builtins),
which only parses on Python 3.9+. CI runs Python 3.8, so any caller
of 'deployment restart' (which lazy-imports dns_probe) crashed at
module load with 'TypeError: type object is not subscriptable'. Use
'List[str]' from typing to keep 3.8 compatibility, matching the rest
of the file's imports.
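
For illustration, the 3.8-compatible annotation style looks like the sketch below. normalize_hosts is a hypothetical function, not one taken from dns_probe.py; the point is only the List[str] import from typing in place of the built-in list[str], which Python 3.8 cannot subscript when the annotation is evaluated.

```python
from typing import List


def normalize_hosts(hosts: List[str]) -> List[str]:
    """Strip whitespace and drop empty entries (hypothetical example function)."""
    # On Python 3.8, writing list[str] in the signature above would raise a
    # TypeError as soon as the module is imported, because annotations are
    # evaluated at function definition time.
    return [host.strip() for host in hosts if host.strip()]
```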

run-deploy-test.sh previously ended with --skip-cluster-management,
leaving the Kind cluster running for the next CI step to inherit.
Switch the final stop to --perform-cluster-management so subsequent
tests (e.g. run-restart-test.sh) start from a clean host, and replace
the now-trivial namespace assertion with a real check that the kind
cluster is actually gone.
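
A check of this kind can be sketched as below, written in Python rather than the shell of the actual test script. It assumes the cluster names come from the standard output of kind get clusters, one per line; the helper names and the cluster name used in any call are hypothetical, and the assertion in run-deploy-test.sh may be implemented differently.

```python
import subprocess
from typing import List


def parse_cluster_names(output: str) -> List[str]:
    """Split the command output into non-empty cluster names, one per line."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def cluster_is_gone(name: str, output: str) -> bool:
    """Return True when the named cluster is absent from the listing."""
    return name not in parse_cluster_names(output)


def list_kind_clusters() -> str:
    """Run kind get clusters and return its standard output (requires kind on PATH)."""
    result = subprocess.run(
        ["kind", "get", "clusters"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the parsing separate from the subprocess call lets the check be exercised without a running cluster.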

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@prathamesh0 merged commit 7c65d39 into main on Apr 28, 2026
7 checks passed
@prathamesh0 deleted the pj-so-078-self-sufficient-deployments branch on April 28, 2026 11:58