test(e2e): move test cases to per-SDK tests/ folders#682
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR centralizes shared E2E test utilities, switches the E2E runner to explicitly invoke per-SDK ChangesE2E Test Suite Infrastructure Reorganization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
c7cc31a to
24b3efa
Compare
21 new e2e cases under sdks/python/tests/e2e/ that pin contracts the
existing local-FFI integration tests can't reach. All ship with
`xfail(strict=True)` for the 10 known production bugs they pin:
test_error_code_mapping.py 10 cases 4 xfails
test_quota_enforcement.py 5 cases 5 xfails (module-level)
test_runner_concurrency.py 6 cases 1 xfail
-- --
21 10
Each xfail's reason= string carries a `file:line` pointer to the bug
location and a short root-cause note, so a follow-up fix-PR can grep
for the right reason, drop the marker, and watch the test go green.
`strict=True` means an unexpected xpass becomes a CI failure — i.e.
the moment any of these bugs is silently fixed we'll notice.
xfail coverage map (which PR removes which marker):
test_exec_after_box_removed_is_typed_error
→ PR C (runner exec error mapping: IsNotFound → 404)
test_invalid_argument_zero_cpu_returns_400 )
test_invalid_argument_negative_memory_returns_400 ) needs API
test_oversized_cpu_returns_400 ) ValidationPipe
5x test_quota_enforcement.py ) + per-sandbox
) quota fix
) (separate
) follow-up)
test_execution_invalid_command_returns_422
→ needs Rust spawn-failed → ErrExecution mapping fix
(separate follow-up; not in C or D)
Split out of boxlite-ai#681 — this is **PR B of 4**:
A (boxlite-ai#682) reorganize — merged before this rebases cleanly
B (this) 21 new cases
C runner exec error mapping fix
D FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`POST /v1/{prefix}/boxes/{id}/exec` was leaking HTTP 500 when the
underlying box was already gone / stopped / in a non-runnable state
by the time the runner tried to spawn. Two related observable bugs:
exec on a removed box → 500 "spawn_failed: build failed"
exec on a stopped box → 500 (same shape)
exec on box mid-rebuild → 500 (same shape)
Root cause: `BoxliteExec` in `apps/runner/pkg/api/controllers/boxlite_exec.go`
called `execManager.Start(...)` and unconditionally wrapped any non-nil
error as 500. The SDK Start path already tunnels typed errors from the
Rust core (sdks/go/errors.go: IsNotFound / IsStopped / IsInvalidState)
when the box state changed under the request — they just weren't being
read.
Fix: extend `classifyExecError` to peek for those SDK typed errors and
return the canonical mapping from `src/shared/src/errors.rs:198-280`:
IsNotFound → 404 Not Found
IsStopped → 409 Conflict
IsInvalidState → 409 Conflict
(other / no SDK match) → 500 (unchanged)
Also routes the `Start()` error site through `classifyExecError`
instead of hard-coding 500.
Flips the xfail on `test_exec_after_box_removed_is_typed_error`
(sdks/python/tests/e2e/test_runner_concurrency.py) — that case now
xpasses, which under strict=True trips the marker and fails CI; the
fix here removes the marker so it lands green.
Other related xfails in test_error_code_mapping.py are NOT addressed
by this PR (they need DTO validation / quota / Rust spawn-failed
reclassification — separate follow-ups).
Split out of boxlite-ai#681 — this is **PR C of 4**:
A (boxlite-ai#682) reorganize
B (boxlite-ai#683) 21 new e2e cases (xfails pin bugs)
C (this) runner exec error mapping fix + flip 1 xfail
D FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Go SDK's `box.Exec` would occasionally return with stdout chunks
still in flight — the user's callback got `OnExit` (or the wait gRPC
reply) before the matching `OnStdout`/`OnStderr` chunks landed on
the queue. From the caller's perspective, the exec had finished but
the last few bytes of stdout were missing.
Root cause: `execution_wait` and `exit_pump` pushed their terminal
events as soon as the underlying process exited, racing the still-
draining stream pumps. The FFI queue is ordered, but the wait/exit
events were entering it ahead of late stdout chunks.
Fix: track a `streams_pending` count on `ExecutionHandle` instead of
the previous per-pump receiver list. Both `exit_pump` and the
`execution_wait` task now await `streams_pending → 0` (via a
`tokio::sync::Notify` notification + standard register-then-check
pattern to avoid post-notification miss) before pushing their
terminal events.
Deterministic ordering tests added inline in execution.rs cover:
- stdout-only exec: all chunks land before Exit
- mixed stdout/stderr: every chunk lands before Wait
- process exit before pump completion: terminal event waits
- pump completion before process exit: terminal event fires
immediately (no spurious wait)
Side effect on the existing e2e suite: `test_p0_6_exec_stdout_race`
in sdks/python/tests/e2e/ (the regression test for boxlite-ai#563) flips from
~90% loss against stock 0.9.5 to 0% loss against this PR. That test
is NOT in the PR-B xfail set (it was pre-existing), so no markers
need flipping.
Other PR-B xfails (DTO validation, quota, spawn-failed mapping) are
NOT addressed here — separate root causes, separate follow-ups.
Split out of boxlite-ai#681 — this is **PR D of 4**:
A (boxlite-ai#682) reorganize
B (boxlite-ai#683) 21 new e2e cases (xfails pin bugs)
C (boxlite-ai#684) runner exec error mapping fix + flip 1 xfail
D (this) FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR boxlite-ai#678 shipped e2e pytest cases under scripts/test/e2e/cases/, a fresh location that doesn't match the repo's existing per-SDK test layout (src/cli/tests/*.rs, sdks/python/tests/, sdks/node/tests/, sdks/c/tests/, sdks/go/*_test.go). This PR moves the cases to match. Driver scripts stay in scripts/test/e2e/ (cross-SDK infra); test files now live next to the SDK they exercise: scripts/test/e2e/cases/test_c_entry.py → sdks/c/tests/e2e/ scripts/test/e2e/cases/test_go_entry.py → sdks/go/tests/e2e/ scripts/test/e2e/cases/test_node_entry.py → sdks/node/tests/e2e/ scripts/test/e2e/cases/test_cli_entry.py → src/cli/tests/e2e/ scripts/test/e2e/cases/test_*.py (Python pytest) → sdks/python/tests/e2e/ scripts/test/e2e/cases/conftest.py → sdks/python/tests/e2e/ scripts/test/e2e/sdks/{c,go,node}/e2e_basic.{c,go,ts} → sdks/{c,go,node}/tests/e2e/ Stays in scripts/test/e2e/: bootstrap.sh, teardown.sh, run.sh, two_sided.sh, fixture_setup.py, pytest.ini, README.md. Supporting wiring (forced by the renames): - run.sh: pytest invoked with the 5 SDK e2e dirs as explicit args (testpaths in pytest.ini can't resolve outside its rootdir) - pytest.ini: comment explaining the testpaths omission - .github/workflows/e2e-stack.yml: paths filter expanded to all new e2e locations - README.md: layout map updated Two collisions the rename exposed (both fixed in this PR): 1. SDK unit-test pytest workflow auto-discovers sdks/python/tests/ and would import the new e2e/ subdir, whose conftest needs Python 3.11+ tomllib and a live API stack the unit-test workflow doesn't provide. Fix: norecursedirs = e2e in sdks/python/pytest.ini. The e2e runner passes the dir explicitly on the CLI which bypasses norecursedirs. 2. pytest's default --import-mode=prepend puts each test file's rootpath on sys.path. For sdks/python/tests/e2e/test_*.py the rootpath becomes sdks/python/, which then shadows the installed boxlite wheel with the source-tree stub package (sdks/python/boxlite/__init__.py — missing the compiled .so so Rust-bound symbols like BoxliteRestOptions are absent). Every fixture using boxlite.BoxliteRestOptions failed at setup with AttributeError. Fix: --import-mode=importlib in scripts/test/e2e/pytest.ini (modern mode, no sys.path injection). Tests that previously did `from conftest import drain` couldn't keep doing that under importlib mode (conftest no longer on sys.path) — extracted drain/collect_stream/ stdout_line_count from conftest into scripts/test/e2e/lib/e2e_helpers.py (alongside path_verification which already used the same lib/ + sys.path.insert pattern), and updated the 8 test files' imports. Verified locally on Ubuntu 22.04 + Python 3.11 + maturin-built boxlite wheel — make test:e2e against PR-A: 21 passed, 7 failed (the 7 are real production bugs the suite is designed to catch: PR-C fixes runner exec error mapping, PR-D fixes FFI drain race). Before this PR's collision fix: 0 passed, 32 fixture-setup errors. Originally part of boxlite-ai#681 which also bundled 21 new test cases + 2 fixes; this split-out covers reorganize only — the rest ships in follow-up PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
24b3efa to
b5e6f30
Compare
21 new e2e cases under sdks/python/tests/e2e/ that pin contracts the
existing local-FFI integration tests can't reach. All ship with
`xfail(strict=True)` for the 10 known production bugs they pin:
test_error_code_mapping.py 10 cases 4 xfails
test_quota_enforcement.py 5 cases 5 xfails (module-level)
test_runner_concurrency.py 6 cases 1 xfail
-- --
21 10
Each xfail's reason= string carries a `file:line` pointer to the bug
location and a short root-cause note, so a follow-up fix-PR can grep
for the right reason, drop the marker, and watch the test go green.
`strict=True` means an unexpected xpass becomes a CI failure — i.e.
the moment any of these bugs is silently fixed we'll notice.
xfail coverage map (which PR removes which marker):
test_exec_after_box_removed_is_typed_error
→ PR C (runner exec error mapping: IsNotFound → 404)
test_invalid_argument_zero_cpu_returns_400 )
test_invalid_argument_negative_memory_returns_400 ) needs API
test_oversized_cpu_returns_400 ) ValidationPipe
5x test_quota_enforcement.py ) + per-sandbox
) quota fix
) (separate
) follow-up)
test_execution_invalid_command_returns_422
→ needs Rust spawn-failed → ErrExecution mapping fix
(separate follow-up; not in C or D)
Split out of boxlite-ai#681 — this is **PR B of 4**:
A (boxlite-ai#682) reorganize — merged before this rebases cleanly
B (this) 21 new cases
C runner exec error mapping fix
D FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`POST /v1/{prefix}/boxes/{id}/exec` was leaking HTTP 500 when the
underlying box was already gone / stopped / in a non-runnable state
by the time the runner tried to spawn. Two related observable bugs:
exec on a removed box → 500 "spawn_failed: build failed"
exec on a stopped box → 500 (same shape)
exec on box mid-rebuild → 500 (same shape)
Root cause: `BoxliteExec` in `apps/runner/pkg/api/controllers/boxlite_exec.go`
called `execManager.Start(...)` and unconditionally wrapped any non-nil
error as 500. The SDK Start path already tunnels typed errors from the
Rust core (sdks/go/errors.go: IsNotFound / IsStopped / IsInvalidState)
when the box state changed under the request — they just weren't being
read.
Fix: extend `classifyExecError` to peek for those SDK typed errors and
return the canonical mapping from `src/shared/src/errors.rs:198-280`:
IsNotFound → 404 Not Found
IsStopped → 409 Conflict
IsInvalidState → 409 Conflict
(other / no SDK match) → 500 (unchanged)
Also routes the `Start()` error site through `classifyExecError`
instead of hard-coding 500.
Flips the xfail on `test_exec_after_box_removed_is_typed_error`
(sdks/python/tests/e2e/test_runner_concurrency.py) — that case now
xpasses, which under strict=True trips the marker and fails CI; the
fix here removes the marker so it lands green.
Other related xfails in test_error_code_mapping.py are NOT addressed
by this PR (they need DTO validation / quota / Rust spawn-failed
reclassification — separate follow-ups).
Split out of boxlite-ai#681 — this is **PR C of 4**:
A (boxlite-ai#682) reorganize
B (boxlite-ai#683) 21 new e2e cases (xfails pin bugs)
C (this) runner exec error mapping fix + flip 1 xfail
D FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Go SDK's `box.Exec` would occasionally return with stdout chunks
still in flight — the user's callback got `OnExit` (or the wait gRPC
reply) before the matching `OnStdout`/`OnStderr` chunks landed on
the queue. From the caller's perspective, the exec had finished but
the last few bytes of stdout were missing.
Root cause: `execution_wait` and `exit_pump` pushed their terminal
events as soon as the underlying process exited, racing the still-
draining stream pumps. The FFI queue is ordered, but the wait/exit
events were entering it ahead of late stdout chunks.
Fix: track a `streams_pending` count on `ExecutionHandle` instead of
the previous per-pump receiver list. Both `exit_pump` and the
`execution_wait` task now await `streams_pending → 0` (via a
`tokio::sync::Notify` notification + standard register-then-check
pattern to avoid post-notification miss) before pushing their
terminal events.
Deterministic ordering tests added inline in execution.rs cover:
- stdout-only exec: all chunks land before Exit
- mixed stdout/stderr: every chunk lands before Wait
- process exit before pump completion: terminal event waits
- pump completion before process exit: terminal event fires
immediately (no spurious wait)
Side effect on the existing e2e suite: `test_p0_6_exec_stdout_race`
in sdks/python/tests/e2e/ (the regression test for boxlite-ai#563) flips from
~90% loss against stock 0.9.5 to 0% loss against this PR. That test
is NOT in the PR-B xfail set (it was pre-existing), so no markers
need flipping.
Other PR-B xfails (DTO validation, quota, spawn-failed mapping) are
NOT addressed here — separate root causes, separate follow-ups.
Split out of boxlite-ai#681 — this is **PR D of 4**:
A (boxlite-ai#682) reorganize
B (boxlite-ai#683) 21 new e2e cases (xfails pin bugs)
C (boxlite-ai#684) runner exec error mapping fix + flip 1 xfail
D (this) FFI exec drain race fix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/test/e2e/lib/e2e_helpers.py (1)
20-39: ⚡ Quick winConsider adding parameter type annotations for better IDE support and type safety.
The helper functions are well-implemented and correct, but adding explicit type annotations for the
streamandexparameters would improve the developer experience when importing and using these utilities. This is especially valuable for a shared module that will be imported across multiple test files.📝 Suggested type annotations
+from typing import AsyncIterator, Any + -async def collect_stream(stream) -> str: +async def collect_stream(stream: AsyncIterator[bytes] | AsyncIterator[str] | None) -> str: + """Collect and decode an async stream into a single string.""" if stream is None: return "" chunks: list[str] = [] async for ch in stream: chunks.append( ch.decode("utf-8", "replace") if isinstance(ch, bytes) else str(ch) ) return "".join(chunks) -async def drain(ex) -> tuple[str, str]: +async def drain(ex: Any) -> tuple[str, str]: """Drain stdout + stderr concurrently — required for REST exec.""" out_t = asyncio.create_task(collect_stream(ex.stdout())) err_t = asyncio.create_task(collect_stream(ex.stderr())) return await asyncio.gather(out_t, err_t) def stdout_line_count(s: str) -> int: + """Count non-empty lines in a string.""" return len([ln for ln in s.splitlines() if ln])🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test/e2e/lib/e2e_helpers.py` around lines 20 - 39, Add explicit parameter type annotations to improve IDE/type-checker support: annotate collect_stream(stream) with an appropriate async-iterable type (e.g., stream: Optional[AsyncIterable[Union[bytes, str]]]) and annotate drain(ex) with the expected execution object type (e.g., ex: Any or a small Protocol with stdout()/stderr() -> AsyncIterable[Union[bytes, str]]), and update the drain signature to return Tuple[str, str] (or import Tuple) if not already recognized; also add any necessary typing imports (Optional, AsyncIterable, Union, Any, Tuple) at the top. Ensure you keep the existing return types and semantics of collect_stream and drain while only adding these parameter type hints.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@scripts/test/e2e/lib/e2e_helpers.py`:
- Around line 20-39: Add explicit parameter type annotations to improve
IDE/type-checker support: annotate collect_stream(stream) with an appropriate
async-iterable type (e.g., stream: Optional[AsyncIterable[Union[bytes, str]]])
and annotate drain(ex) with the expected execution object type (e.g., ex: Any or
a small Protocol with stdout()/stderr() -> AsyncIterable[Union[bytes, str]]),
and update the drain signature to return Tuple[str, str] (or import Tuple) if
not already recognized; also add any necessary typing imports (Optional,
AsyncIterable, Union, Any, Tuple) at the top. Ensure you keep the existing
return types and semantics of collect_stream and drain while only adding these
parameter type hints.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ffa6a850-42c0-4fe7-a972-75621dd42e1d
📒 Files selected for processing (27)
.github/workflows/e2e-stack.ymlmake/test.mkscripts/test/e2e/README.mdscripts/test/e2e/lib/e2e_helpers.pyscripts/test/e2e/pytest.iniscripts/test/e2e/run.shscripts/test/e2e/two_sided.shsdks/c/tests/e2e/e2e_basic.csdks/c/tests/e2e/test_c_entry.pysdks/go/tests/e2e/e2e_basic.gosdks/go/tests/e2e/test_go_entry.pysdks/node/tests/e2e/e2e_basic.tssdks/node/tests/e2e/test_node_entry.pysdks/python/pytest.inisdks/python/tests/e2e/conftest.pysdks/python/tests/e2e/test_box_management.pysdks/python/tests/e2e/test_errors.pysdks/python/tests/e2e/test_exec_options.pysdks/python/tests/e2e/test_exec_timeout.pysdks/python/tests/e2e/test_exec_unit_shape.pysdks/python/tests/e2e/test_execution_shutdown.pysdks/python/tests/e2e/test_lifecycle.pysdks/python/tests/e2e/test_p0_6_exec_stdout_race.pysdks/python/tests/e2e/test_path_verification.pysdks/python/tests/e2e/test_resize_tty.pysdks/python/tests/e2e/test_shutdown.pysrc/cli/tests/e2e/test_cli_entry.py
Add a file-level header comment to each of the two pytest.ini files so readers can tell at a glance which suite each one drives and where the sibling config lives, without having to follow run.sh / Makefile targets. - scripts/test/e2e/pytest.ini — cross-SDK e2e (all SDKs go through Python pytest entries; native code lives in subprocess'd SDK binaries). - sdks/python/pytest.ini — Python SDK only (unit + single-machine integration via local PyO3/FFI). Existing inline comments near `testpaths` / `norecursedirs` are kept as-is — they explain the mechanical reason for those specific settings, while the new header explains the file's overall scope. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…xlite-ai#682) Adds Python type hints to the two helper functions that were missing parameter annotations, and docstrings where they were missing too. Per CodeRabbit nitpick on PR boxlite-ai#682 — quality-of-life only, no behavior change. `from __future__ import annotations` already in the file means the `|` union syntax works on Python 3.10 too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
21 new e2e cases under scripts/test/e2e/cases/ that pin contracts the existing local-FFI integration tests can't reach. Cases shipped with xfail(strict=True) markers for the known production bugs they pin — a follow-up fix-PR flipping a bug closed will trip the strict marker into a CI failure, surfacing the fact that the xfail can now be dropped. Files added: scripts/test/e2e/cases/test_error_code_mapping.py scripts/test/e2e/cases/test_quota_enforcement.py scripts/test/e2e/cases/test_runner_concurrency.py Replaces an earlier split-out attempt that stacked on a reorganize PR (boxlite-ai#682). That structural reorg is now abandoned; new e2e cases land under scripts/test/e2e/cases/ matching boxlite-ai#678's existing layout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POST /v1/{prefix}/boxes/{id}/exec was leaking HTTP 500 when the
underlying box was already gone / stopped / in a non-runnable state
by the time the runner tried to spawn:
exec on a removed box -> 500 "spawn_failed: build failed"
exec on a stopped box -> 500 (same shape)
exec on box mid-rebuild -> 500 (same shape)
Root cause: BoxliteExec in boxlite_exec.go wraps the SDK error in a
generic 500 without consulting the SDK's typed error helpers
(IsNotFound, IsStopped, IsInvalidState). Other handlers in this file
(signal/resize/kill/status) already classify these; POST /exec was
the one that got missed.
Fix: extend classifyExecError to recognise the SDK's typed errors and
map them to 404 (NotFound), 409 (Stopped, InvalidState) per the
canonical mapping at src/shared/src/errors.rs.
Pin: test_exec_on_stopped_box_is_typed_error in boxlite-ai#678's e2e suite goes
from 500 -> 4xx after this change. Re-validation: stack on top of
boxlite-ai#683 (which adds test_runner_concurrency.py) for the
exec-after-soft-deleted-box variant.
Replaces an earlier split-out attempt that stacked on boxlite-ai#682's reorg
(now abandoned). Branch rebuilt against current main.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Go SDK's box.Exec would occasionally return with stdout chunks still in flight — the user's callback got OnExit (or the wait gRPC reply) before the matching OnStdout/OnStderr chunks landed on the queue. From the caller's perspective, the exec had finished but its stdout was silently truncated. Root cause: execution_wait spawned an independent terminal task that pushed the Wait event as soon as wait_on_clone returned — with no drain barrier — so the wait reply could land in the event queue ahead of still-flushing stream pumps. Fix: make exit_pump the sole owner of terminal-event dispatch. Both execution_wait and register_exit fan into it; exit_pump awaits all stream pump receivers before pushing the terminal event. Queue order becomes Stdout/Stderr* -> Exit -> Wait*, all from the same task. Pin: test_p0_6_exec_stdout_race in boxlite-ai#678's e2e suite goes from ~70% stdout-loss to 0%. Replaces an earlier split-out attempt that stacked on boxlite-ai#682's reorg (now abandoned). Branch rebuilt against current main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… pin known bugs) (#683) real e2e tests add-up ## What 21 new e2e cases under \`sdks/python/tests/e2e/\` pin contracts the existing local-FFI integration tests can't reach: | File | Cases | xfails | Topic | |---|---|---|---| | \`test_error_code_mapping.py\` | 10 | 4 | \`BoxliteError → HTTP\` mapping (\`src/shared/src/errors.rs:198-280\`) | | \`test_quota_enforcement.py\` | 5 | 5 | API create-boundary per-sandbox quota | | \`test_runner_concurrency.py\` | 6 | 1 | Cross-process state-machine races | All 10 xfails are \`strict=True\` — an unexpected xpass fails CI, so the moment any bug is silently fixed we notice. Each xfail's \`reason=\` carries a \`file:line\` pointer and short root-cause. ## xfail → fix PR map | xfail | Will flip in | |---|---| | \`test_exec_after_box_removed_is_typed_error\` | PR C (runner classifies \`IsNotFound → 404\`) | | \`test_invalid_argument_zero_cpu_returns_400\` | needs API \`ValidationPipe\` fix (separate follow-up, not in C/D) | | \`test_invalid_argument_negative_memory_returns_400\` | same | | \`test_oversized_cpu_returns_400\` | same + per-sandbox quota (separate follow-up) | | 5× \`test_quota_enforcement.py\` (module-level) | per-sandbox quota fix (separate follow-up) | | \`test_execution_invalid_command_returns_422\` | needs Rust \`spawn_failed → ErrExecution\` mapping (separate follow-up) | ## Stack | | what | this PR | |---|---|---| | A (#682) | reorganize only (rename + wiring) | ← prereq | | **B (this)** | **21 new e2e cases** | ← you are here | | C | runner exec error mapping fix + flip 1 xfail | depends on B | | D | FFI exec drain race fix (no PR-B xfails to flip) | depends on B | 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Tests** * Added end-to-end coverage validating API HTTP status ↔ error-code mappings, auth boundary behavior, unknown-resource and unregistered-image semantics, and state-transition error handling. * Added quota enforcement tests covering per-sandbox/org limits, boundary cases (including zero CPUs), and checks that quota rejections do not silently create resources. * Added concurrency/regression tests for parallel execs and creates, delete-during-exec behavior, repeated execs, and idempotent/typed stop semantics. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
move test files to different SDK dirs as previous
What
PR #678 shipped e2e pytest cases under
scripts/test/e2e/cases/, a fresh location that doesn't match the repo's existing per-SDK test layout. This PR moves them so each SDK's e2e tests live next to that SDK's other tests:scripts/test/e2e/cases/test_c_entry.pysdks/c/tests/e2e/scripts/test/e2e/cases/test_go_entry.pysdks/go/tests/e2e/scripts/test/e2e/cases/test_node_entry.pysdks/node/tests/e2e/scripts/test/e2e/cases/test_cli_entry.pysrc/cli/tests/e2e/scripts/test/e2e/cases/test_*.py(Python pytest)sdks/python/tests/e2e/scripts/test/e2e/cases/conftest.pysdks/python/tests/e2e/scripts/test/e2e/sdks/{c,go,node}/e2e_basic.{c,go,ts}sdks/{c,go,node}/tests/e2e/Stays put in
scripts/test/e2e/: bootstrap.sh, teardown.sh, run.sh, two_sided.sh, fixture_setup.py, pytest.ini, README.md (cross-SDK orchestration, not per-SDK).Supporting wiring (forced by the renames)
scripts/test/e2e/run.sh: pytest invoked with the 5 SDK e2e dirs as explicit args.testpathsinpytest.inican't resolve outside its rootdir.scripts/test/e2e/pytest.ini: comment explaining thetestpathsomission..github/workflows/e2e-stack.yml:paths:filter expanded to all new e2e locations (sdks/{c,go,node,python}/tests/e2e/**,src/cli/tests/e2e/**).scripts/test/e2e/README.md: layout map updated.Stack
xfail(strict=True))Splitting #681 into 4 because the original bundled rename + new tests + 2 source fixes into one commit — hard to review, hard to revert any single piece without unwinding the rest.
Test plan
Result against local stack with the full A→D stack applied:
make test:e2e— 43 passed, 10 xfailed in 37.05stest_drain_before_terminal*)Breakdown of the 43p/10xf (covers all four PRs in the stack):
test_error_code_mapping.pyBoxliteError → HTTPmapping (src/shared/src/errors.rs:198-280)test_quota_enforcement.pytest_runner_concurrency.pyxfail (bugs pinned, will xpass-strict once fixed):
cpus=0/memory_mib=-1silently accepted — DTO@Min+ValidationPipetransform interaction (apps/api/src/main.ts:66-67)exec /nonexistent→ 500 instead of 422 — Rustspawn_failedmisclassified asErrInternalcpus=999silently clamped — no@Max, no quota lookup increateFromSnapshot(apps/api/src/sandbox/services/sandbox.service.ts)exec-after-box-removed— 500 leak via runner's spawn-against-soft-deleted-box path (distinct from theStopped→500leak C closes)Known caveat
The pytest.ini-in-scripts/ + tests-in-sdks/ layout means
cd sdks/python/tests/e2e && pytest(without going throughrun.sh) won't read the pytest.ini (asyncio_mode, markers, etc.). Always invoke viamake test:e2eorscripts/test/e2e/run.sh. Cleaner would be to movepytest.inito repo root so its rootdir matches — left as follow-up.🤖 Generated with Claude Code
Summary by CodeRabbit