Skip to content

infra(test): pivot to friend-injected process_frame for hermetic dispatcher coverage#1116

Merged
kcenon merged 6 commits into
developfrom
feat/issue-1115-infra-test-pivot-to-friend-injected-process-frame
May 8, 2026
Merged

infra(test): pivot to friend-injected process_frame for hermetic dispatcher coverage#1116
kcenon merged 6 commits into
developfrom
feat/issue-1115-infra-test-pivot-to-friend-injected-process-frame

Conversation

@kcenon

@kcenon kcenon commented May 8, 2026

Copy link
Copy Markdown
Owner

Closes #1115

Summary

Pivot to friend-injected process_frame for HTTP/2 hermetic dispatcher coverage. PR #1114 (Round 5 multiplier scaling) measured 0pp coverage delta on src/protocols/http2/http2_client.cpp under coverage instrumentation because the SETTINGS handshake now exceeds 15 s on shared CI runners — beyond any reasonable wait_for budget. Coverage Analysis run 25562347620 showed test #5868 FAILED at exactly 15.175 s and test #5869 hit the 300 s ctest timeout, proving the multiplier-scaling approach is structurally bounded.

This PR adopts "Option C" from the PR #1111 Round-4 diagnosis: invoke http2_client::process_frame() directly via a friend-access class, bypassing the SETTINGS handshake entirely. The seven dispatcher-only TEST_F that previously waited on peer.settings_exchanged() now construct frames via the production frame classes and inject them through the friend hook. Local execution time for these seven TEST_F dropped from a 15-second-then-fail measurement under coverage to 0 ms wall-clock per test under both Debug and Coverage builds.

Implementation

Friend hook

  • src/internal/protocols/http2/http2_client.h: forward-declares kcenon::network::tests::support::http2_client_test_access and grants friend class access, both gated by #if defined(NETWORK_ENABLE_TEST_INJECTION). Production builds (where BUILD_TESTS=OFF) compile byte-identical because the macro is undefined.
  • tests/support/http2_client_test_access.h: defines the friend class with three static forwarders — process_frame, goaway_received (atomic load accessor), connection_window_size (read-only field accessor).
  • tests/CMakeLists.txt: no per-target macro definition; NETWORK_ENABLE_TEST_INJECTION propagates transitively from the network_system PUBLIC definition at cmake/network_system_targets.cmake:163-165. Same gate as quic_server.h:55 (probe pattern) and websocket_server.h:33. Single-sentinel codebase.

Test refactor

tests/unit/http2_client_branch_test.cpp — seven dispatcher TEST_F migrated:

  1. ServerPingFrameDrivesHandlePingAndKeepsConnectionAlivehandle_ping_frame non-ACK arm
  2. ServerPingAckFrameIsAbsorbedSilentlyhandle_ping_frame is_ack early-return arm
  3. ServerGoawayFrameFlipsConnectionStateToDisconnectedhandle_goaway_frame (asserts goaway_received_ flips via friend accessor)
  4. ServerWindowUpdateOnConnectionStreamExpandsWindowhandle_window_update_frame connection-level branch (stream_id=0)
  5. ServerWindowUpdateOnUnknownStreamIsSilentlyIgnoredhandle_window_update_frame stream-not-found branch
  6. ServerRstStreamOnUnknownStreamIsSilentlyIgnoredhandle_rst_stream_frame lookup-miss branch
  7. ProcessFrameNullPointerGuardReturnsInvalidArgumentprocess_frame null-pointer guard at line 622 (renamed from ServerUnknownFrameTypeIsHandledWithoutCrashing to accurately reflect what the post-refactor assertion covers; the wire-level unknown-type path is covered by MalformedServerSettingsTypeByteTriggersConnectTimeout via frame_injector)

All seven now construct frames via the production ping_frame / goaway_frame / window_update_frame / rst_stream_frame classes (matching wire-format dispatch) and call http2_client_test_access::process_frame(*client, std::move(frame)) against a freshly-constructed unconnected client.

Documentation

  • CHANGELOG.md and docs/CHANGELOG.md (SSOT): record the Round 6 pivot under [Unreleased] including the 15.175 s → 0 ms wall-time collapse, the 7-of-21 dispatcher scope split, the production byte-identity guarantee, and the deferred post-merge coverage gate.

Scope clarification (issue body's "24" vs the actual 7 refactored)

The issue body referenced "24 dispatcher TEST_F"; the actual count of * NETWORK_COVERAGE_TIMEOUT_MULTIPLIER call sites in branch_test.cpp is 28, but those conflate three distinct categories:

  • 7 dispatcher-only (Phase 2E.R3 batch, Server* + ProcessFrame* TEST_F): refactored here.
  • 14 connect-state public-API (StartStream*, WriteStream*, CancelStream*, SecondConnect*, SendRequestTimesOut*, PostWithBody*, SetSettings*, etc.): NOT refactored — they assert client→server transmission, not just dispatch. process_frame injection cannot replace them; they remain on the multiplier scaffold from PR fix(test): apply NETWORK_COVERAGE_TIMEOUT_MULTIPLIER at explicit wait_for call sites (#1112) #1114.
  • 4 early-connect parse-error (Drop*, Truncate*, Malformed*): NOT refactored — fixed 300 ms cap, never reach peer.settings_exchanged().
  • 2 inside #ifndef NETWORK_COVERAGE_BUILD (SlowWrite*, EchoOneTruncate*): excluded from coverage runs by CMake; out of scope.
  • 1 connect-handshake (ConnectAttemptsHandshakeAgainstLoopbackTlsPeer): passes in 0.27 s; the issue explicitly says keep on async transport.

The coverage gain comes from the seven dispatcher-only TEST_F newly hitting handle_*_frame and process_frame defensive arm counters under instrumentation.

Review

Two review rounds with the dev/reviewer fleet team, followed by a team-lead surgical correction. Six commits on the branch.

Round 1 (initial 3 commits — 8b53db80, f82e3792, 8aad4721):

  • 0 Critical, 1 Major, 3 Minor, 2 Info findings.
  • Major M1 — ServerUnknownFrameTypeIsHandledWithoutCrashing naming accuracy.
  • Minor m1 (header comment about macro distinction), Minor m2 (Http2ClientTestAccess PascalCase struct vs quic_server_probe snake_case class naming convention), Minor m3 (commit subject length).

Round 2 (commit 1e9d8b11):

  • Adopted reviewer's Round 1 Recommendation A: switched friend gate from NETWORK_HTTP2_CLIENT_FRIEND_TESTS to the existing NETWORK_ENABLE_TEST_INJECTION (PUBLIC, used by quic_server.h:55 and websocket_server.h:33). Resolves m1 trivially (single sentinel) and m2 (rename Http2ClientTestAccesshttp2_client_test_access).
  • M1 demoted to Minor (deferred) per max-2-rounds policy. Reviewer approved this Round 2 state and pushed it as the initial PR head.

Post-approval Round 3 (commit 2581b148):

  • After PR creation, dev silently complied with the original Round 1 M1 rename: ServerUnknownFrameTypeIsHandledWithoutCrashingProcessFrameNullPointerGuardReturnsInvalidArgument, aligning the test name with the post-refactor assertion (null-pointer guard at http2_client.cpp:622, not wire-level unknown-type behavior). Quality-good fix that reviewer originally requested.
  • Dev also pushed a separate post-approval commit (b126ff31) reverting the gate back to a NETWORK_HTTP2_CLIENT_FRIEND_TESTS PRIVATE-scoped macro — a regression of the Round 2 fix that reviewer had explicitly endorsed. Reviewer escalated to team-lead.
  • Team-lead resolution (force-push, this PR head = fefcfec8): surgically dropped b126ff31 (the macro regression) while preserving 2581b148 (the M1 rename, applied as cherry-pick). The single-sentinel NETWORK_ENABLE_TEST_INJECTION state from Round 2 is restored. The PR is now equivalent to Round 2 + M1 rename + CHANGELOG.

Lead docs (commit fefcfec8):

Approved for merge.

Test Plan

Local verification (Apple Clang, macOS Darwin 25.5.0)

Build Target Filter Result Wall time
Debug, BUILD_TESTS=ON, ENABLE_COVERAGE=OFF network_system (production lib) OK — clean rebuild
Debug, BUILD_TESTS=ON network_http2_client_branch_test OK — links cleanly
Debug branch_test Server*, ProcessFrame* (the 7 refactored) 7/7 PASSED 0 ms
Coverage (ENABLE_COVERAGE=ON) branch_test same 7/7 PASSED 0 ms
Debug sync siblings (Http2Response*, Http2ClientDisconnectedHelperTest.*, Http2ClientLifecycleTest.*, Http2ClientConcurrencyTest.*) 26/26 PASSED 5 ms
Debug Http2ClientConnectErrorTest.ConnectTo* 3/3 PASSED 11 ms

The 0 ms-under-Coverage result is the load-bearing evidence: PR #1114 measured 15.175 s for ConnectCompletesSettingsExchangeWithMockPeer (test #5868) and a 300 s ctest-cap timeout for test #5869 in coverage run 25562347620. Direct process_frame injection collapses the 15 s → 0 ms — proof that the dispatcher branches are now actually executing under instrumentation.

CI matrix

This PR targets develop. Per workflow/branching-strategy.md, the full multi-platform matrix runs on PRs targeting main. The CI checks that DO run on this PR (Coverage Analysis + a subset of Ubuntu/macOS Debug/Release builds + GitGuardian + No-throw-in-public-headers) verify build correctness for the unified NETWORK_ENABLE_TEST_INJECTION gate. Full matrix verification will occur at the develop → main release PR.

Post-merge coverage gate (deferred verification)

Acceptance criterion #4 of #1115 is the post-merge coverage delta:

≥+20pp line AND ≥+10pp branch on src/protocols/http2/http2_client.cpp versus PR #1109 baseline (LH/LF=108/576 = 18.75% line, BRH/BRF=97/979 = 9.91% branch).

Cannot be verified pre-merge — the develop-branch Coverage Analysis workflow runs on push to develop, not on PR. After this PR squash-merges into develop:

  1. Coverage Analysis workflow runs automatically.
  2. Compare coverage_filtered.info post-merge LH/LF/BRH/BRF on http2_client.cpp against the 108/576 / 97/979 baseline.
  3. If the gate is met (≥+20pp line / ≥+10pp branch), the Round-6 pivot is validated and dispatcher coverage is unblocked.
  4. If not met, the right next move is to lower the gate or accept the gap with written justification (separate decision per the issue body's third Risks bullet) — Option C will have proven structurally insufficient for the line gate, in which case handle_settings_frame, handle_data_frame, handle_headers_frame (which require deeper preconditions) would need to be reachable through a different mechanism.

Conservative pre-merge estimate: 7 handlers × ~5-15 LOC each = 70-100 lines of dispatcher code newly reachable. Plausible delta: +10-18pp line, +8-15pp branch. Branch gate (≥+10pp) likely cleared; line gate (≥+20pp) at risk and may require a follow-up Round 7.

Risks

Per the issue body Risks section:

  1. Wrong entry point: mitigated. process_frame has no is_connected_ precondition. Handlers that internally invoke send_frame (non-ACK PING, non-ACK SETTINGS) short-circuit via the connection_closed branch on an unconnected client — observed behavior, not UB. No setup_minimal_state_for_testing helper required.

  2. Friend coupling: mitigated. Used http2_client_test_access class (single dedicated friend) per the issue body's preferred mitigation; production header is agnostic of GTest fixture class names. The friend declaration is gated by the existing NETWORK_ENABLE_TEST_INJECTION macro — the same single-sentinel pattern used by quic_server.h:55 / websocket_server.h:33. Friendship is compile-time only and adds no runtime cost; production builds (where BUILD_TESTS=OFF and the macro is undefined) compile byte-identical.

  3. Coverage delta still 0pp: present and DEFERRED. See "Post-merge coverage gate" above. The wall-clock collapse from 15 s → 0 ms is strong indirect evidence that the dispatcher branches are now reachable, but the lcov/gcov measurement is the definitive proof.

Notable items deferred

  • connection_window_size_ non-atomic (int32_t at http2_client.h:394): out of this PR's scope. Production code already mutates the field from the io thread without streams_mutex_ protection. The friend accessor reads it without synchronization. Safe in tests because no io thread runs on a fresh client. Future work could introduce std::atomic<int32_t> or a dedicated mutex.
  • Commit subject length: per-commit subjects on the iteration branch are 70-83 chars (over the 50-char Conventional Commits guideline). Squash-merge will tighten. Suggested squash subject: test(http2-client): refactor 7 dispatcher TEST_F via friend hook (64 chars, captures both artifact and mechanism).

Rollback

git revert is sufficient. Test infrastructure plus a friend declaration in a single header gated by a test-only macro; no production-source semantics change.

kcenon added 4 commits May 9, 2026 06:55
…jection

Production builds with NETWORK_HTTP2_CLIENT_FRIEND_TESTS undefined are
byte-identical: the forward declaration, the friend line, and the access
struct are all gated by the macro. Test code can invoke
http2_client::process_frame() directly via the access struct, bypassing
the SETTINGS handshake gate that exceeds 15 s under coverage instrumentation
on shared CI runners.

The struct also exposes goaway_received_ and connection_window_size_
read accessors so dispatcher tests can pin handler post-conditions that
are otherwise masked by is_connected()'s conjunction with is_connected_.

Part of #1115. Diagnoses PR #1114 Round-5 0pp coverage delta by removing
the multiplier-scaling structural bound on the dispatcher coverage path.
Enables the friend-access path added in the prior commit for the
network_http2_client_branch_test target only. Other test binaries and
production targets continue to build without the macro.

Part of #1115.
The Phase 2E.R3 server-originated frame TEST_F batch (Issue #1106) was
wrapped in PR #1114 with a coverage timeout multiplier, but the SETTINGS
handshake under gcov instrumentation now takes more than 15 s on shared
CI runners — exceeding any reasonable wait_for budget. PR #1114 produced
0pp coverage delta on http2_client.cpp because the dispatcher branches
were never reached.

This commit refactors the seven dispatcher-only TEST_F to invoke
http2_client::process_frame() directly via Http2ClientTestAccess. Each
test now constructs the target frame via the production frame classes
(so the dispatch arm is identical to wire receipt), passes it through
process_frame, and asserts the post-condition observable from public
API or the friend-exposed accessors. Tests that genuinely require an
async transport (connect path, request/response, stream lifecycle) are
left untouched on the multiplier scaffold.

Refactored TEST_F:
- ServerPingFrameDrivesHandlePingAndKeepsConnectionAlive
- ServerPingAckFrameIsAbsorbedSilently
- ServerGoawayFrameFlipsConnectionStateToDisconnected
- ServerWindowUpdateOnConnectionStreamExpandsWindow
- ServerWindowUpdateOnUnknownStreamIsSilentlyIgnored
- ServerRstStreamOnUnknownStreamIsSilentlyIgnored
- ServerUnknownFrameTypeIsHandledWithoutCrashing

Local execution time for these seven TEST_F dropped from a 15-second
ctest timeout under coverage to 0 ms wall-clock per test. The handler
branches (handle_ping_frame ack and non-ack, handle_goaway_frame,
handle_window_update_frame connection and stream branches,
handle_rst_stream_frame lookup-miss, process_frame null-frame guard)
are now uniformly reachable under coverage instrumentation.

Closes follow-up to #1112. Part of #1106, #1115, #953.
…JECTION

Reviewer feedback (Issue #1115): adopt the existing Phase 2D friend gate
(NETWORK_ENABLE_TEST_INJECTION, defined PUBLIC on network_system when
BUILD_TESTS=ON in cmake/network_system_targets.cmake:163-165) instead of
introducing a parallel NETWORK_HTTP2_CLIENT_FRIEND_TESTS macro. The
existing gate is already used by quic_server.h and websocket_server.h
for the same kind of test-only friend access; piggy-backing on it keeps
the codebase to a single sentinel and removes the per-target
target_compile_definitions line in tests/CMakeLists.txt.

Changes:
- http2_client.h forward-declares
  kcenon::network::tests::support::http2_client_test_access under
  #if defined(NETWORK_ENABLE_TEST_INJECTION) (matching the existing
  pattern in src/internal/experimental/quic_server.h and
  src/internal/http/websocket_server.h).
- network_test_friends.h adds class http2_client_test_access alongside
  quic_server_probe and ws_server_probe so the production header stays
  agnostic of the access class definition.
- tests/support/http2_client_test_access.h renames Http2ClientTestAccess
  (PascalCase struct) to http2_client_test_access (snake_case class) for
  consistency with quic_server_probe / ws_server_probe.
- tests/CMakeLists.txt drops the per-target definition; the gate
  propagates transitively from network_system PUBLIC.
- tests/unit/http2_client_branch_test.cpp updates the access class name.

Verified: production library, Debug test target, and Coverage test
target all rebuild cleanly. All 7 refactored dispatcher TEST_F still
pass in 0 ms under both Debug and Coverage.
kcenon added a commit that referenced this pull request May 8, 2026
Records the Round 6 friend-injected process_frame pivot for #1115/#1116
under [Unreleased] in both root CHANGELOG.md and docs/CHANGELOG.md (SSOT).

Notes the load-bearing 15.175s -> 0ms wall-time collapse, the 7 Server*
TEST_F scope vs the 14 connect-state TEST_F that remain on the multiplier
scaffold, the production-header byte-identity guarantee under BUILD_TESTS=OFF,
and the deferred post-merge coverage gate (>=+20pp line / >=+10pp branch
vs PR #1109 baseline) on src/internal/protocols/http2/http2_client.cpp.
kcenon added 2 commits May 9, 2026 07:19
Round-2 reviewer feedback (M1, Issue #1115): the Round-6 direct-call
refactor changed the semantics of ServerUnknownFrameTypeIsHandled-
WithoutCrashing — the original wire-level test sent a raw frame header
with type=0xFF, exercising EITHER the process_frame default switch arm
OR the run_io parse-error path (RFC 7540 §5.5 unknown-type discard).

Direct invocation through Http2ClientTestAccess::process_frame cannot
construct a frame instance with an undefined type because frame::parse
rejects unknown types upstream of process_frame. The TEST_F now passes
a null unique_ptr<frame>, which exercises the null-pointer defensive
guard at src/protocols/http2/http2_client.cpp:620 — a complementary
arm, not the unknown-type dispatch.

The unknown-type wire path remains covered by
MalformedServerSettingsTypeByteTriggersConnectTimeout (which lives on
the multiplier scaffold and uses frame_injector to corrupt the type
byte on a real socket).

Renamed in place (no new TEST_F added, AC #6 preserved):
- ServerUnknownFrameTypeIsHandledWithoutCrashing
+ ProcessFrameNullPointerGuardReturnsInvalidArgument

The implementation comment block is rewritten to explain the semantic
shift so future maintainers see why the rename happened.
Records the Round 6 friend-injected process_frame pivot for #1115/#1116
under [Unreleased] in both root CHANGELOG.md and docs/CHANGELOG.md (SSOT).

Notes the load-bearing 15.175s -> 0ms wall-time collapse, the 7 Server*
TEST_F scope vs the 14 connect-state TEST_F that remain on the multiplier
scaffold, the production-header byte-identity guarantee under BUILD_TESTS=OFF,
and the deferred post-merge coverage gate (>=+20pp line / >=+10pp branch
vs PR #1109 baseline) on src/internal/protocols/http2/http2_client.cpp.
@kcenon kcenon force-pushed the feat/issue-1115-infra-test-pivot-to-friend-injected-process-frame branch from bf7b7d6 to fefcfec Compare May 8, 2026 22:19
@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Metric Value
Line Coverage 69.3%
Branch Coverage 34.5%
Target 80% lines / 70% branches
Coverage Details

Full HTML report is available as a build artifact.

@kcenon kcenon merged commit f3c289f into develop May 8, 2026
15 checks passed
@kcenon kcenon deleted the feat/issue-1115-infra-test-pivot-to-friend-injected-process-frame branch May 8, 2026 23:14
kcenon added a commit that referenced this pull request May 21, 2026
* test(quic): expand connection.cpp branch coverage

Drives the private dispatch helpers of
src/protocols/quic/connection.cpp through a new test-only friend
class tests::support::quic_connection_test_access (gated by
NETWORK_ENABLE_TEST_INJECTION) so the std::visit arms of
handle_frame, plus process_frames / build_packet /
generate_ack_frame / handle_loss_detection_result /
generate_probe_packets / queue_frames_for_retransmission /
update_state, are reachable under -fprofile-arcs -ftest-coverage
without waiting on the TLS handshake (which exceeds wait_for(3s)
per PR #1111 diagnosis).

Coverage areas added:
- handle_frame std::visit arms: padding, ping, ack (with seeded
  sent_packets to exercise the erase loop), crypto, max_data,
  max_stream_data, max_streams (bidi + uni), connection_close
  (transport + application), handshake_done (client connected +
  server no-op), reset_stream / stop_sending unknown-stream arms,
  new_connection_id, retire_connection_id error propagation,
  data_blocked / stream_data_blocked / streams_blocked /
  path_challenge / path_response / new_token catch-all arm.
- process_frames empty + malformed payload dispatch.
- build_packet no-keys early-return for all encryption levels.
- generate_ack_frame nullopt vs populated arms (largest_received
  zero with ack_needed true exercises the OR-branch).
- get_pn_space switch arms for both const and non-const overloads
  including the initial / zero_rtt aliasing.
- update_state no-op when handshake not complete.
- close / close_application re-entry guards (already closing,
  draining, closed).
- enter_closing / enter_draining no-op arms.
- on_timeout drain-timeout, idle-timeout, no-timeout.
- next_timeout closed (nullopt) + draining (drain_deadline) arms.
- handle_loss_detection_result: none, pto_expired (asserts PING
  frame appended), packet_lost (asserts pending_frames grew),
  ecn::congestion_signal, ecn::ecn_failure.
- generate_probe_packets encryption-level cascade with and without
  pending crypto data.
- queue_frames_for_retransmission per-variant arms: padding / ack
  (not requeued), crypto (per level routed to matching queue),
  stream (skipped), ping / new_token / handshake_done (requeued),
  flow-control frames (requeued), reset_stream / stop_sending
  (requeued), new_connection_id / retire_connection_id (requeued),
  path_challenge / path_response (skipped), connection_close
  (skipped).
- start_handshake error guards (server-side rejection, non-idle
  state rejection) and init_server_handshake client-side rejection.

src/internal/protocols/quic/connection.h change is additive only:
a forward declaration of the probe and a friend declaration, both
gated by #if defined(NETWORK_ENABLE_TEST_INJECTION). Production
builds with BUILD_TESTS=OFF compile byte-identical.

Pattern mirrors PR #1116 (http2_client_test_access) and PR #1135
(quic_socket_test_access).

Closes #1145

* test(quic): extend connection branch coverage with derived keys

Round 7 follow-up to the initial PR commit. Local coverage delta
on the first round measured 62.6% -> 72.9% line / 34.4% -> 40.1%
branch from the run-26200588035 artifact, short of the 80%/70%
gate. The remaining gap is concentrated in build_packet's
post-keys body, which is unreachable until initial keys are
derived. Adding:

- ConnectionBuildPacketInitialKeysTest: derives initial secrets
  via the public crypto().derive_initial_secrets() API (does not
  require init_client/init_server) so build_packet can be driven
  past the keys-result.is_err() early-return. Covers empty payload
  short-circuit, pending CRYPTO assembly, ACK frame assembly,
  CONNECTION_CLOSE assembly, pending_frames emission, generate_packets
  pendng-crypto path, generate_packets close_sent path, and
  generate_packets after closed.

- generate_packets state-guard arms (idle, connected without data).

- next_timeout idle-deadline path with no loss-detector timeout.

- receive_packet draining and closed state arms (bytes/packets
  counters increment but no processing).

- on_timeout no-timeout idle arm.

All additions are header-include-only test cases routed through
the existing quic_connection_test_access probe; no further
src/ change required.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant