Description
Problem Statement
When a network connection targets an endpoint with L7 policy rules, the sandbox proxy emits both a CONNECT (L4) log entry and one or more L7_REQUEST log entries for the same connection. The L4 CONNECT entry is always action=allow for L7-inspected connections, making it look like an independent policy decision when it's really just a tunnel lifecycle event. This creates confusion for operators — the same connection appears to have two policy decisions at different layers. The log message type should distinguish between a standalone L4 policy decision and a tunnel-open event that precedes L7 inspection.
Technical Context
The sandbox proxy handles CONNECT requests in a two-phase pipeline: first L4 evaluation (host:port + binary identity via OPA network_action rule), then optional L7 inspection (HTTP method/path via OPA allow_request rule). Both phases independently emit structured tracing::info! log lines, which are captured by LogPushLayer and pushed to the server via gRPC PushSandboxLogs. The TUI renders these as distinct log line types with separate field orderings. The L7 OPA rule is a strict superset of L4 — it re-evaluates endpoint_allowed AND binary_allowed before additionally checking request_allowed_for_endpoint.
A key constraint is that a single CONNECT tunnel can carry many HTTP requests (keep-alive), creating a 1:N relationship between L4 connections and L7 request logs. Suppressing the CONNECT log entirely would lose the "connection opened" lifecycle event, which has value as context for the L7 requests that follow.
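The two-phase flow and its 1:N relationship can be sketched as a small model. This is an illustration only: `LogLine`, `HttpRequest`, and `handle_tunnel` are hypothetical names, not the real types in `proxy.rs`, and the `audit` decision is left out for brevity.

```rust
/// One structured log line, reduced to the fields used in this issue.
/// Hypothetical type: the real proxy emits these via `tracing::info!`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LogLine {
    /// Emitted once per CONNECT tunnel (phase 1, L4).
    Connect { dst_host: String, dst_port: u16, action: &'static str },
    /// Emitted once per HTTP request inside the tunnel (phase 2, L7).
    L7Request { l7_action: String, l7_target: String, l7_decision: &'static str },
}

/// Hypothetical parsed HTTP request travelling through the tunnel.
pub struct HttpRequest {
    pub method: String,
    pub target: String,
}

/// Models the current behavior: phase 1 always logs one "CONNECT" entry,
/// then every request in an L7-inspected tunnel logs its own "L7_REQUEST".
pub fn handle_tunnel(
    dst_host: &str,
    dst_port: u16,
    l4_allowed: bool,
    l7_configured: bool,
    requests: &[HttpRequest],
    allow_request: impl Fn(&HttpRequest) -> bool,
) -> Vec<LogLine> {
    let mut logs = vec![LogLine::Connect {
        dst_host: dst_host.to_string(),
        dst_port,
        action: if l4_allowed { "allow" } else { "deny" },
    }];
    // A denied or L4-only connection never reaches phase 2.
    if !l4_allowed || !l7_configured {
        return logs;
    }
    // Keep-alive: one tunnel, many requests, one log line per request.
    for req in requests {
        logs.push(LogLine::L7Request {
            l7_action: req.method.clone(),
            l7_target: req.target.clone(),
            l7_decision: if allow_request(req) { "allow" } else { "deny" },
        });
    }
    logs
}
```

In this model a tunnel carrying N requests always yields N + 1 entries, and the first one is `action=allow` whenever any `L7_REQUEST` entry exists, which is the source of the confusion described above.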
Affected Components
| Component | Key Files | Role |
|---|---|---|
| Proxy | `crates/openshell-sandbox/src/proxy.rs` | L4 CONNECT evaluation + logging, L7 relay dispatch |
| L7 Relay | `crates/openshell-sandbox/src/l7/relay.rs` | L7 per-request evaluation + logging |
| OPA Engine | `crates/openshell-sandbox/src/opa.rs` | Policy evaluation for both L4 and L7 |
| Rego Policy | `crates/openshell-sandbox/data/sandbox-policy.rego` | Rule definitions (`network_action`, `allow_request`) |
| Log Push | `crates/openshell-sandbox/src/log_push.rs` | Captures tracing spans and pushes to server |
| Denial Aggregator | `crates/openshell-sandbox/src/denial_aggregator.rs` | Aggregates denial events for policy recommendation |
| TUI Logs | `crates/openshell-tui/src/ui/sandbox_logs.rs` | Renders L4 and L7 log lines with different field layouts |
Technical Investigation
Architecture Overview
The proxy's handle_tcp_connection function (proxy.rs) processes each CONNECT request through:
- L4 evaluation (`evaluate_opa_tcp()`, lines 340-347): resolves process identity via `/proc/net/tcp`, evaluates OPA `network_action` rule
- L4 logging (`"CONNECT"` `info!`, lines 385-400): always emitted, regardless of whether L7 follows
- L7 config query (`query_l7_config()`, line 501): checks if endpoint has L7 protocol config
- If L7 configured → `relay_with_inspection()` (lines 544/600)
- L7 per-request evaluation (`evaluate_l7_request()`, relay.rs line 114): evaluates OPA `allow_request` rule
- L7 logging (`"L7_REQUEST"` `info!`, relay.rs lines 123-133): emitted per HTTP request in the tunnel
The OPA Rego rules confirm L7 is a superset of L4:
- L4: `allow_network` checks `endpoint_allowed` + `binary_allowed` (rego lines 18-20)
- L7: `allow_request` checks `endpoint_allowed` + `binary_allowed` + `request_allowed_for_endpoint` (rego lines 160-173)
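The superset relation can be restated as plain boolean logic. The sketch below mirrors the two Rego rules in Rust purely for illustration; `PolicyInput` and the helper functions are hypothetical and treat each Rego predicate as an already-computed boolean.

```rust
/// Hypothetical flattened view of the three Rego predicates.
#[derive(Debug, Clone, Copy)]
pub struct PolicyInput {
    pub endpoint_allowed: bool,
    pub binary_allowed: bool,
    pub request_allowed_for_endpoint: bool,
}

/// L4 rule: endpoint and binary must both be allowed.
pub fn allow_network(input: PolicyInput) -> bool {
    input.endpoint_allowed && input.binary_allowed
}

/// L7 rule: the same two checks, plus the per-request check.
pub fn allow_request(input: PolicyInput) -> bool {
    input.endpoint_allowed && input.binary_allowed && input.request_allowed_for_endpoint
}

/// Exhaustively checks all 8 input combinations: an L7 allow never
/// occurs without a matching L4 allow.
pub fn l7_allow_implies_l4_allow() -> bool {
    for endpoint_allowed in [false, true] {
        for binary_allowed in [false, true] {
            for request_allowed_for_endpoint in [false, true] {
                let input = PolicyInput {
                    endpoint_allowed,
                    binary_allowed,
                    request_allowed_for_endpoint,
                };
                if allow_request(input) && !allow_network(input) {
                    return false;
                }
            }
        }
    }
    true
}
```

Because every L7 allow implies an L4 allow, the L4 `action=allow` entry adds no decision information for an L7-inspected connection.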
Code References
| Location | Description |
|---|---|
| `proxy.rs:385-400` | L4 `"CONNECT"` log emission: always fires, even for L7-inspected connections |
| `proxy.rs:340-347` | `evaluate_opa_tcp()` call for L4 decision |
| `proxy.rs:500-501` | `query_l7_config()`: determines if L7 inspection is needed |
| `proxy.rs:544,600` | `relay_with_inspection()` dispatch for L7 |
| `l7/relay.rs:114` | `evaluate_l7_request()` call for L7 decision |
| `l7/relay.rs:116-120` | L7 decision string mapping (allow/audit/deny) |
| `l7/relay.rs:123-133` | L7 `"L7_REQUEST"` log emission |
| `l7/relay.rs:19-34` | `L7EvalContext` struct: carries L4 context into L7 |
| `opa.rs:32-36` | `NetworkAction` enum (Allow/Deny) |
| `sandbox-policy.rego:149-154` | `network_action` L4 rule |
| `sandbox-policy.rego:160-173` | `allow_request` L7 rule (superset of L4) |
| `denial_aggregator.rs:20-37` | `DenialEvent` struct (note: the L7 relay does NOT emit these) |
| `sandbox_logs.rs:321-348` | TUI field orderings for CONNECT vs L7 log types |
Current Behavior
For an L7-configured endpoint (e.g., api.github.com:443 with REST rules):
INFO CONNECT action=allow dst_host=api.github.com dst_port=443 policy=github_api ...
INFO L7_REQUEST l7_decision=allow l7_action=GET l7_target=/repos/org/repo dst_host=api.github.com ...
INFO L7_REQUEST l7_decision=deny l7_action=DELETE l7_target=/repos/org/repo dst_host=api.github.com ...
The CONNECT action=allow entry looks like a policy decision but is misleading — it will always be allow for any connection that reaches L7. The real policy decisions are the L7_REQUEST entries. Meanwhile, L4-only endpoints correctly use CONNECT as their sole policy decision.
What Would Need to Change
Core change: Differentiate the log message type based on whether L7 inspection follows:
- `CONNECT`: L4-only endpoint. This is the standalone policy decision. No L7 follows.
- `CONNECT_L7`: L7-configured endpoint. This is a tunnel lifecycle event (connection opened), not a policy decision. The `L7_REQUEST` entries that follow within the tunnel are the actual policy decisions.
Implementation:
- Defer log emission: The L7 config query (`query_l7_config()`) happens after the current CONNECT log at `proxy.rs:385-400`. Move the log emission to after the L7 config check, or query L7 config earlier, so we know which message type to emit.
- Change the message string: When L7 config is present, emit `"CONNECT_L7"` instead of `"CONNECT"`. All existing fields remain the same.
- TUI rendering: Add a `CONNECT_L7_FIELD_ORDER` to `sandbox_logs.rs` (or reuse the existing `CONNECT_FIELD_ORDER`) so the TUI renders these correctly. The TUI could also visually distinguish tunnel lifecycle events from policy decisions.
- Secondary: Enrich L7 logs with process identity: The `L7EvalContext` already carries `binary_path`, `ancestors`, and `cmdline_paths` from L4. Adding these to the `L7_REQUEST` log fields ensures the policy decision logs are self-contained.
- Secondary: Denial aggregator gap: The L7 relay does not emit `DenialEvent`s to the denial aggregator. The proto defines `l7_deny` and `l7_audit` stages, and test data references them (`mechanistic_mapper.rs:581`), but the relay code never sends them. Consider fixing in the same pass.
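The first two steps reduce to choosing the message string once both the L4 decision and the L7 config lookup are known. A minimal sketch, assuming the lookup yields an `Option`: `L7Config` and `connect_message` are hypothetical names, and the choice to keep `"CONNECT"` for denied connections is an assumption based on the proposal applying only to allowed CONNECTs that proceed to L7.

```rust
/// Mirrors the `NetworkAction` enum (Allow/Deny) referenced at `opa.rs:32-36`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkAction {
    Allow,
    Deny,
}

/// Hypothetical stand-in for whatever `query_l7_config()` returns.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct L7Config {
    pub protocol: String,
}

/// Picks the log message type after both the L4 decision and the
/// L7 config lookup are available.
pub fn connect_message(action: NetworkAction, l7_config: Option<&L7Config>) -> &'static str {
    match (action, l7_config) {
        // Allowed and L7 inspection follows: tunnel lifecycle event.
        (NetworkAction::Allow, Some(_)) => "CONNECT_L7",
        // L4-only endpoint, or denied at L4 (assumption): standalone decision.
        _ => "CONNECT",
    }
}
```

Keeping the selection in one small function makes it easy to call from wherever the deferred log emission ends up, and easy to unit test without a running proxy.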
Desired Behavior
INFO CONNECT_L7 action=allow dst_host=api.github.com dst_port=443 policy=github_api ...
INFO L7_REQUEST l7_decision=allow l7_action=GET l7_target=/repos/org/repo dst_host=api.github.com ...
INFO L7_REQUEST l7_decision=deny l7_action=DELETE l7_target=/repos/org/repo dst_host=api.github.com ...
For L4-only endpoints, behavior is unchanged:
INFO CONNECT action=allow dst_host=example.com dst_port=443 policy=default ...
Log consumers can trivially distinguish:
- `CONNECT` = standalone L4 policy decision
- `CONNECT_L7` = tunnel lifecycle event (context for L7_REQUEST entries)
- `L7_REQUEST` = L7 policy decision (the authoritative decision for this endpoint)
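On the consumer side, the distinction could look like the sketch below. `EntryKind`, `classify`, `classify_line`, and `is_connection_event` are hypothetical helpers, and `classify_line` assumes the rendered form `LEVEL MESSAGE key=value ...` used in the examples above.

```rust
/// What a log entry means to a consumer, per the three message types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    /// "CONNECT": standalone L4 policy decision.
    L4PolicyDecision,
    /// "CONNECT_L7": tunnel opened, context for later L7_REQUEST entries.
    TunnelLifecycle,
    /// "L7_REQUEST": authoritative per-request decision.
    L7PolicyDecision,
    /// Anything else.
    Other,
}

/// Classifies an entry by its message string.
pub fn classify(message: &str) -> EntryKind {
    match message {
        "CONNECT" => EntryKind::L4PolicyDecision,
        "CONNECT_L7" => EntryKind::TunnelLifecycle,
        "L7_REQUEST" => EntryKind::L7PolicyDecision,
        _ => EntryKind::Other,
    }
}

/// Classifies a rendered line, assuming the message is the second
/// whitespace-separated token (after the level).
pub fn classify_line(line: &str) -> EntryKind {
    line.split_whitespace()
        .nth(1)
        .map_or(EntryKind::Other, classify)
}

/// Filter for tooling that wants every opened connection, regardless
/// of whether L7 inspection follows.
pub fn is_connection_event(message: &str) -> bool {
    matches!(message, "CONNECT" | "CONNECT_L7")
}
```

Tooling that previously matched only `"CONNECT"` would switch to something like `is_connection_event` to keep seeing all connections.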
Patterns to Follow
- Log field naming follows the existing `snake_case` convention with `dst_host`, `dst_port`, `policy`, etc.
- The `DenialEvent` pattern with `denial_stage` discriminator is the established way to categorize denial types.
- The TUI `*_FIELD_ORDER` arrays define display priority, so new fields should follow the existing ordering convention.
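A rough sketch of how the field-order pattern might extend to `CONNECT_L7` by reusing the existing ordering. The array contents below are placeholders built from the fields shown in this issue, not the real values in `sandbox_logs.rs`; `L7_REQUEST_FIELD_ORDER`, `field_order_for`, and `order_fields` are hypothetical names.

```rust
/// Placeholder contents: the real array lives in `sandbox_logs.rs`.
pub const CONNECT_FIELD_ORDER: &[&str] = &["action", "dst_host", "dst_port", "policy"];

/// Placeholder contents and hypothetical name for the L7 ordering.
pub const L7_REQUEST_FIELD_ORDER: &[&str] =
    &["l7_decision", "l7_action", "l7_target", "dst_host"];

/// Selects the display order for a message type. `CONNECT_L7` carries the
/// same fields as `CONNECT`, so this sketch reuses the same ordering.
pub fn field_order_for(message: &str) -> Option<&'static [&'static str]> {
    match message {
        "CONNECT" | "CONNECT_L7" => Some(CONNECT_FIELD_ORDER),
        "L7_REQUEST" => Some(L7_REQUEST_FIELD_ORDER),
        _ => None,
    }
}

/// Returns the fields sorted by display priority. Fields not listed in
/// `order` keep their relative order and are placed last (stable sort).
pub fn order_fields<'a>(
    fields: &[(&'a str, &'a str)],
    order: &[&str],
) -> Vec<(&'a str, &'a str)> {
    let mut sorted = fields.to_vec();
    sorted.sort_by_key(|(key, _)| {
        order.iter().position(|k| k == key).unwrap_or(order.len())
    });
    sorted
}
```

Reusing the ordering keeps the change small. A dedicated `CONNECT_L7_FIELD_ORDER` would only be needed if the TUI should lay out lifecycle events differently.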
Proposed Approach
Distinguish L4-only connections from L7-inspected tunnels at the log message level. When the proxy determines that an allowed CONNECT will proceed to L7 inspection, emit "CONNECT_L7" instead of "CONNECT". This preserves the tunnel lifecycle event (no logs are suppressed, no 1:N relationship is broken) while making it clear that CONNECT_L7 is context, not a policy decision. The L7_REQUEST entries remain the authoritative policy decisions for L7 endpoints. As secondary improvements, enrich L7_REQUEST logs with process identity fields and wire up DenialEvent emission from the L7 relay.
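For the `DenialEvent` wiring, one possible shape is a mapping from the L7 decision to an optional event. This is a sketch under assumptions: the `DenialEvent` here is reduced to three fields and is not the real struct in `denial_aggregator.rs`, `L7Decision` and `denial_event_for` are hypothetical names, and only the `l7_deny` and `l7_audit` stage names come from the proto as described above.

```rust
/// The three L7 decision strings mapped in `l7/relay.rs` (allow/audit/deny).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum L7Decision {
    Allow,
    Audit,
    Deny,
}

/// Hypothetical, reduced event; the real struct has more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DenialEvent {
    pub denial_stage: &'static str,
    pub dst_host: String,
    pub dst_port: u16,
}

/// Returns the event the relay would send to the aggregator, if any.
/// Allowed requests produce no event.
pub fn denial_event_for(
    decision: L7Decision,
    dst_host: &str,
    dst_port: u16,
) -> Option<DenialEvent> {
    let denial_stage = match decision {
        L7Decision::Allow => return None,
        L7Decision::Audit => "l7_audit",
        L7Decision::Deny => "l7_deny",
    };
    Some(DenialEvent {
        denial_stage,
        dst_host: dst_host.to_string(),
        dst_port,
    })
}
```

Returning an `Option` lets the relay call this once per request and forward the event only when there is something to report.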
Scope Assessment
- Complexity: Low
- Confidence: High — minimal code change, clear semantics
- Estimated files to change: 3-4 (`proxy.rs`, `sandbox_logs.rs`, optionally `l7/relay.rs` and `denial_aggregator.rs` for secondary improvements)
- Issue type: `refactor`
Risks & Open Questions
- Log consumer migration: Any tooling that filters on `message == "CONNECT"` will need to also match `"CONNECT_L7"` if it wants all connections. This is a minor but breaking change to log format.
- Denial aggregator gap: The L7 relay not emitting `DenialEvent`s is a pre-existing issue. Should this be fixed in the same pass or tracked separately?
- Process identity in L7 logs (secondary): Adding `binary`, `ancestors`, `cmdline` fields to `L7_REQUEST` makes each L7 log self-contained but increases log volume per line. Worth doing?
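To make the log-volume trade-off concrete, the sketch below renders an `L7_REQUEST` line with and without the identity fields. The `L7EvalContext` here is a reduced, hypothetical version: the field names come from this issue, but the `PathBuf` types, the comma-joined encoding, and `l7_request_line` itself are assumptions.

```rust
use std::path::PathBuf;

/// Reduced, hypothetical view of the context carried from L4 into L7.
pub struct L7EvalContext {
    pub binary_path: PathBuf,
    pub ancestors: Vec<PathBuf>,
    pub cmdline_paths: Vec<PathBuf>,
}

/// Joins paths with commas for a single key=value field (assumed encoding).
fn join_paths(paths: &[PathBuf]) -> String {
    paths
        .iter()
        .map(|p| p.display().to_string())
        .collect::<Vec<_>>()
        .join(",")
}

/// Renders an L7_REQUEST line, optionally enriched with process identity.
pub fn l7_request_line(
    ctx: &L7EvalContext,
    decision: &str,
    method: &str,
    target: &str,
    with_identity: bool,
) -> String {
    let mut line = format!(
        "L7_REQUEST l7_decision={} l7_action={} l7_target={}",
        decision, method, target
    );
    if with_identity {
        // Extra bytes per request: this is the cost being weighed.
        line.push_str(&format!(
            " binary={} ancestors={} cmdline={}",
            ctx.binary_path.display(),
            join_paths(&ctx.ancestors),
            join_paths(&ctx.cmdline_paths)
        ));
    }
    line
}
```

The added cost grows with the depth of the ancestor chain and repeats on every request in a keep-alive tunnel, so measuring typical line sizes before enabling it by default may be worthwhile.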
Test Considerations
- Unit test: L4-only endpoint emits `"CONNECT"` message
- Unit test: L7-configured endpoint emits `"CONNECT_L7"` message
- Integration test: `L7_REQUEST` entries still appear correctly after the change
- TUI test: verify rendering handles both `CONNECT` and `CONNECT_L7` log types
- If denial aggregator changes are included: verify `DenialEvent` emission from L7 relay
- Existing test patterns in `crates/openshell-sandbox/src/` should be followed
Created by spike investigation. Use build-from-issue to plan and implement.