Pipeline inspector hook one-pager #7025
Conversation
📝 Walkthrough

Adds a design document for a Function Pipeline Inspector: an InspectingFunctionRunner wrapper that emits sanitized RunFunctionRequest/Response JSON plus StepMeta over gRPC to a sidecar PipelineInspectorService, with fail‑open behavior, credential redaction, config/Helm notes, metrics, and a phased rollout plan.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Composition as Composition
    participant IFR as InspectingFunctionRunner
    participant FR as Underlying FunctionRunner
    participant Inspector as PipelineInspector Sidecar
    Composition->>IFR: RunFunction(request)
    IFR->>IFR: Build StepMeta (trace_id, step_index, iteration, composition id, timestamp)
    IFR->>IFR: Sanitize request JSON (strip creds, redact secrets)
    IFR->>Inspector: EmitRequest(JSON payload, StepMeta) [fire-and-forget, 100ms timeout]
    IFR->>FR: RunFunction(request)
    FR-->>IFR: response
    IFR->>IFR: Sanitize response JSON
    IFR->>Inspector: EmitResponse(JSON payload, error?, StepMeta) [fire-and-forget, 100ms timeout]
    IFR-->>Composition: response
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Would you like the diagram expanded to show optional OpenTelemetry span correlation and sidecar health/fail‑open branching?

🚥 Pre-merge checks: ✅ 4 passed
jbw976
left a comment
This looks great @phisco! I like how it will finally provide a foundation for function pipeline debugging, and I like how it's using a proven pattern we've already rolled out with change logs.
A few comments for your consideration 🙇♂️
> ### Security: Credential Stripping
>
> The `RunFunctionRequest` includes a `credentials` field that may contain sensitive data (secrets, connection details). **This field must be cleared before emission**:
another possible source of sensitive data to strip would be the observed connection details of the resources. Those get passed in the request too, and we should consider stripping them as well.
Looks like you've also added stripping out any Secret composed resources, which makes sense given that is the recommended path for exposing XR connection details in v2.
However, I was referring to the connection details that are included for every observed resource, e.g. the connection_details field in the protobuf. That should get stripped too to be consistent, right?
```protobuf
service PipelineInspectorService {
  // EmitRequest receives the function request before execution.
  // This is a fire-and-forget call; errors do not affect pipeline execution.
  rpc EmitRequest(EmitRequestRequest) returns (EmitRequestResponse);
```
This will be using unary RPC right? https://grpc.io/docs/what-is-grpc/core-concepts/#unary-rpc
I think the EmitRequest() call in the InspectingFunctionRunner would technically be blocking then - the client gRPC call will block until the server sends a response or the context times out.
This likely will not cause any problems (haven't heard of it causing any problems in changelogs that are in the middle of the managed reconciler), because it is just over a local unix socket and responses should have very minimal latency. But it may still be good to acknowledge that this isn't technically non-blocking.
we could consider using a pretty short context/timeout then, just for the gRPC calls, so they can't ever have too much of a slowdown effect on the overall function pipeline.
do i have that right about the blocking nature of unary gRPC? what do you think about a shorter context for them?
Good catch! I initially thought of handling them from a separate goroutine, but I ended up deciding not to. Just continuing on error could be enough indeed, I'll update the doc accordingly.
yep that makes sense to continue on error. i did want to point out, just to be sure you've seen it, the approach taken for change logs that uses a separate shorter context instead of the outer parent context:
https://github.com/crossplane/crossplane-runtime/blob/main/pkg/reconciler/managed/changelogger.go#L114-L117
Perhaps this is what you were referring to later in the doc with this?
> The default emit timeout is 100ms. If the sidecar doesn't respond within this time, the emit fails and pipeline execution continues.
Yes, that's what I meant 👍
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@design/one-pager-pipeline-inspector.md`:
- Around line 234-236: Update the wording for clarity: replace the phrase “On
top of that” with a shorter alternative like “Additionally” or “Also”, and fix
the sentence fragment at the later mention of RunFunctionRequest by adding a
subject (e.g., “It can be used…” or “This field can be used…”) so the sentence
about clearing composed Secret resource `data` reads as a complete sentence;
ensure references to `RunFunctionRequest`, its `credentials` and connection
fields, and composed Secret `data` are preserved and unchanged except for the
clarified wording.
- Around line 9-22: Reflow the long Markdown paragraphs in the "Crossplane
Compositions are defined as pipelines" section and the "Current State" section
(including the bullet points for `crossplane render`, `crossplane beta trace`,
and "Function outputs (logs, events, conditions)") to wrap lines at
approximately 100 columns; ensure sentences are broken between words (not
mid-word), preserve existing punctuation, lists, and code spans, and apply the
same wrapping style to the other long paragraphs noted in the review so the file
reads consistently.
🧹 Nitpick comments (1)
design/one-pager-pipeline-inspector.md (1)
234-249: Ensure secret data is stripped from both observed and desired resources.
`RunFunctionRequest` can carry desired resources too, which may include Secret `data` or connection details. The doc and sample code only mention observed resources; please clarify that desired resources are sanitized as well (or explain why not).
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@design/one-pager-pipeline-inspector.md`:
- Around line 408-416: The text currently claims splitting RPCs keeps messages
within the default 4MB gRPC limit but omits that JSON encoding can inflate
payload size; update the paragraph referencing RunFunctionRequest,
RunFunctionResponse and the EmitRequest/EmitResponse examples to explicitly note
that JSON marshaling may increase message size versus protobuf, and recommend
that operators either increase the sidecar flag (--max-recv-msg-size) or enable
compression on the sidecar/clients as mitigation; adjust the “Sidecar
configuration” sentence to mention both bumping --max-recv-msg-size and optional
compression as recommended defenses.
- Around line 394-395: The documentation line for `composite_resource_uid` is a
fragment; update the description to a full sentence by adding an explicit
subject (e.g., change "Can be used to group all function calls for the same
resource." to "It can be used to group all function calls for the same
resource."), modifying the `composite_resource_uid` bullet in the doc to improve
readability.
- Around line 303-317: Update SocketPipelineInspector.EmitRequest to sanitize
the desired state as well as observed: after cloning RunFunctionRequest and
clearing Credentials and observed resources' ConnectionDetails/Secret.data, also
clear sanitizedReq.GetDesired().GetComposite().GetResource().ConnectionDetails
and iterate sanitizedReq.GetDesired().GetResources() to set each
Resource.ConnectionDetails = nil and, if the resource Kind/Type indicates a
Secret, clear its data field. Ensure the documentation text mentions sanitizing
both observed and desired resources and Secrets before emission.
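The sanitization asked for above could look roughly like this. The `resource` type and its fields are simplified stand-ins for the proto types, and the Secret check is illustrative; applying the same function to both the observed and desired maps keeps the two sides consistent:

```go
package main

import "fmt"

// resource is a simplified stand-in for a composed-resource entry in the
// function protos; field names here are illustrative.
type resource struct {
	Object            map[string]any
	ConnectionDetails map[string][]byte
}

// sanitize clears connection details on every resource and drops the payload
// of any composed Secret before emission.
func sanitize(resources map[string]*resource) {
	for _, r := range resources {
		r.ConnectionDetails = nil
		if kind, _ := r.Object["kind"].(string); kind == "Secret" {
			delete(r.Object, "data")
			delete(r.Object, "stringData")
		}
	}
}

func main() {
	desired := map[string]*resource{
		"creds": {
			Object:            map[string]any{"kind": "Secret", "data": map[string]any{"password": "aHVudGVyMg=="}},
			ConnectionDetails: map[string][]byte{"endpoint": []byte("db.example.org")},
		},
	}
	sanitize(desired) // in the real runner: sanitize(observed); sanitize(desired)
	fmt.Println(desired["creds"].Object["data"], desired["creds"].ConnectionDetails)
}
```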
Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
Force-pushed 4f3adf4 to 9d6e440
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@design/one-pager-pipeline-inspector.md`:
- Around line 107-127: When step.BuildMetadata(ctx, name, req) returns an error
in Runner.RunFunction, ensure we preserve the fail‑open behavior by populating a
safe, non‑nil StepMeta rather than leaving meta nil: after the error path set
meta to a minimal partial step.StepMeta (e.g., with FunctionName/name and any
safe defaults or empty values) and keep logging the BuildMetadata error; then
call inspector.EmitRequest and inspector.EmitResponse with that safe meta (or
alternatively make EmitRequest/EmitResponse tolerate nil by adding guards),
ensuring no nil deref occurs while preserving the current proceed-on-error
semantics.
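A fail-open fallback along the lines the prompt above describes could be sketched as follows; `StepMeta` and `buildMetadata` are simplified stand-ins for the real types, and `buildMetadata` is rigged to fail so the fallback path is visible:

```go
package main

import (
	"errors"
	"fmt"
)

// StepMeta is an illustrative stand-in for the metadata attached to each emit.
type StepMeta struct {
	FunctionName string
	TraceID      string
}

// buildMetadata stands in for step.BuildMetadata; it always fails here so the
// example exercises the fallback.
func buildMetadata(name string) (*StepMeta, error) {
	return nil, errors.New("no composite in request")
}

// metadataOrFallback preserves fail-open semantics: on error it logs and
// returns a minimal, non-nil StepMeta instead of nil, so later emits can't
// dereference a nil pointer.
func metadataOrFallback(name string) *StepMeta {
	meta, err := buildMetadata(name)
	if err != nil {
		fmt.Println("cannot build step metadata:", err) // stand-in for a debug log
		meta = &StepMeta{FunctionName: name}            // safe partial metadata
	}
	return meta
}

func main() {
	m := metadataOrFallback("function-patch-and-transform")
	fmt.Println(m.FunctionName)
}
```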
♻️ Duplicate comments (5)
design/one-pager-pipeline-inspector.md (5)
9-22: Wrap long Markdown lines to ~100 columns.

A few paragraphs and bullets here exceed the 100‑column guideline; could we reflow them for readability? Thank you! As per coding guidelines, please wrap Markdown lines at ~100 columns unless a link would be less readable.

303-317: Sanitize desired resources (and Secret data) before emission.

The snippet strips observed resources only; desired resources can also carry connection details and Secret data. Could we extend the sanitization text and example to cover desired state as well? Thanks!

408-416: Note JSON expansion vs the 4MB gRPC limit.

Even with split RPCs, JSON marshaling can inflate payloads beyond the default 4MB limit. Could we mention this and suggest max‑recv‑msg‑size or compression as mitigation? Appreciate it!

303-306: Minor wording polish for clarity.

Could we replace “On top of that” with a tighter phrasing (e.g., “Also” or “Additionally”)? Thanks!

394-395: Complete the sentence for `composite_resource_uid`.

Line 394 reads as a fragment; consider adding a subject (e.g., “It can be used to group…”). Thanks!
Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
Force-pushed 9d6e440 to 39d74c8
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@design/one-pager-pipeline-inspector.md`:
- Around line 285-287: The `iteration` field is ambiguous; pick a single
semantics and update the doc comment and any related descriptions to match:
either make `iteration` a reconciliation-wide monotonically increasing counter
(shared by all steps in the same reconciliation) or make it a per-step retry
counter (unique to each step). Update the comment on the `iteration` field (and
any mentions elsewhere in the document) to clearly state which choice was made,
e.g., "Monotonically increasing reconciliation-wide counter..." or "Per-step
retry counter, reset for each step...", and ensure all examples and references
to iteration semantics (shared vs per-step) consistently use that definition.
- Around line 302-307: Update the "Security: Credential Stripping" section to
explicitly require sanitization of the pipeline RunFunctionRequest.context in
addition to clearing RunFunctionRequest.credentials and Secret.data: define the
chosen strategy (allowlist, denylist, or opt‑in emission) and the exact behavior
(which keys are emitted, which are redacted, and default policy), state who
controls the policy and how to opt in/out, and add a brief example of
sanitization rules so implementers know to filter context keys before emitting
RunFunctionRequest.
🧹 Nitpick comments (1)
design/one-pager-pipeline-inspector.md (1)
79-82: Would a cache‑hit indicator help downstream debugging?

Since the wrapper runs before the cached runner, a cache hit will still emit req/resp without an actual function call. Would it be useful to include a `cache_hit` (or `response_source`) field in `StepMeta` to help consumers distinguish cached vs executed steps?
> ### Security: Credential Stripping
>
> The `RunFunctionRequest` includes `credentials` and connection details fields that may contain sensitive data, **this field must be cleared before emission**.
>
> On top of that, if any composed resource is a Secret, its `data` field must also be cleared before emission.
Consider redacting/controlling pipeline context data.
Functions can write arbitrary data into the pipeline context, which is included in RunFunctionRequest. If that data can be sensitive, emitting it verbatim risks leaking secrets/PII. Should we define a sanitization strategy (e.g., allowlist, denylist, or opt‑in emission for context keys) in this section? Thanks for considering.
✏️ Suggested doc addition

```diff
-The `RunFunctionRequest` includes `credentials` and connection details fields that may contain sensitive data, **this field must be cleared before emission**.
+The `RunFunctionRequest` includes `credentials` and connection details fields that may contain sensitive data, **this field must be cleared before emission**.
+Additionally, the pipeline `context` can contain arbitrary data written by functions; consider a
+clear policy for sanitizing or selectively emitting context keys to avoid leaking secrets/PII.
```

🧰 Tools
🪛 LanguageTool
[style] ~305-~305: ‘On top of that’ might be wordy. Consider a shorter alternative.
Context: ...eld must be cleared before emission**. On top of that, if any composed resource is a Secret, ...
(EN_WORDINESS_PREMIUM_ON_TOP_OF_THAT)
🤖 Prompt for AI Agents
In `@design/one-pager-pipeline-inspector.md` around lines 302 - 307, Update the
"Security: Credential Stripping" section to explicitly require sanitization of
the pipeline RunFunctionRequest.context in addition to clearing
RunFunctionRequest.credentials and Secret.data: define the chosen strategy
(allowlist, denylist, or opt‑in emission) and the exact behavior (which keys are
emitted, which are redacted, and default policy), state who controls the policy
and how to opt in/out, and add a brief example of sanitization rules so
implementers know to filter context keys before emitting RunFunctionRequest.
Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@design/one-pager-pipeline-inspector.md`:
- Around line 347-349: The design currently emits the pipeline "context" field
as-is which creates a security risk; update the document to treat context as
potentially sensitive by (1) proposing an allowlist approach for safe context
keys and an explicit API/configuration flag to control context emission, (2)
requiring that the sidecar or pipeline implementation sanitize or redact context
entries (esp. user-written values) before emitting composed resources, and (3)
adding clear guidance for authors about what may never be written to context
(secrets, tokens, PII) and documenting default behavior and override mechanism;
reference the "context" field, composed resource handling for Secrets (clear
`data`), and the sidecar/pipeline emission path in the text so reviewers can
find and apply these changes.
- Around line 204-217: The sanitizer must detect Kubernetes Secrets by checking
the extracted metadata: if meta.CompositeResourceKind (from BuildMetadata)
equals "Secret" (compare case-insensitively and optionally verify
apiVersion/group indicates core/v1), then treat it as sensitive; implement the
sanitizer branch to "drop data" by clearing the Secret payload fields (set the
Secret's data and stringData fields to an empty map or nil) and avoid returning
any secret values (you can also remove binaryData if present), and
record/annotate that data was stripped so callers know content was redacted.
Ensure the check references meta.CompositeResourceKind and
meta.CompositeResourceAPIVersion and the sanitizer function that currently
contains the Secret placeholders so the logic replaces the // ... with these
explicit checks and field clears.
🧹 Nitpick comments (2)
design/one-pager-pipeline-inspector.md (2)
403-406: Consider expanding observability metrics for the inspector itself.

Currently only one error counter is defined. For operational visibility, would it be helpful to add:

- Success metrics: `function_run_function_pipeline_inspector_emits_total` (to see if it's working at all)
- Latency metrics: a histogram of emit duration to detect when sidecars are slow
- Sidecar health: a gauge indicating whether the sidecar connection is healthy
These would help operators distinguish between "feature disabled," "sidecar down," and "sidecar slow" scenarios. Since this is an observability feature, having good observability of the feature itself seems valuable!
What do you think?
479-513: Consider adding operational guidance for sidecar lifecycle.

The Helm configuration looks good! One thing I'm curious about: what happens if the sidecar crashes or isn't ready when Crossplane starts emitting data?
The fail-open behavior handles runtime failures well (continues on timeout), but should the design document mention:
- Whether the sidecar needs readiness/liveness probes
- How Crossplane handles initial connection failures (retry logic?)
- Graceful behavior during sidecar restarts
This might be worth a brief note in the "Fail-Open Behavior" section or here, to help operators understand the operational characteristics. Not a blocker, just wondering if we should set expectations!
```go
	// Extract XR metadata from the request
	xr := req.GetObserved().GetComposite().GetResource()
	if xr != nil {
		meta.CompositeResourceAPIVersion = getStringField(xr, "apiVersion")
		meta.CompositeResourceKind = getStringField(xr, "kind")
		if metadata := getStructField(xr, "metadata"); metadata != nil {
			meta.CompositeResourceName = getStringField(metadata, "name")
			meta.CompositeResourceNamespace = getStringField(metadata, "namespace")
			meta.CompositeResourceUID = getStringField(metadata, "uid")
		}
	}

	return &meta, nil
}
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
# Search for existing Secret kind detection patterns in the codebase
rg -nP --type=go -C3 '\bkind.*[Ss]ecret|[Ss]ecret.*kind' | head -100

# Also search for how resources are typically checked for their kind/type in Go
rg -nP --type=go -C2 'GetKind\(\)|\.Kind' | head -80

# Read the specific lines in the design document to understand the context
cat -n design/one-pager-pipeline-inspector.md | sed -n '200,220p'

# Read the sanitization section mentioned in the review
cat -n design/one-pager-pipeline-inspector.md | sed -n '360,380p'

# Read the BuildMetadata section completely to understand what it extracts
cat -n design/one-pager-pipeline-inspector.md | sed -n '150,230p'
```
Clarify the Secret detection mechanism in the sanitization section.
Good catch that BuildMetadata extracts the CompositeResourceKind at line 208! However, the sanitization logic at lines 362-363 and 371-372 is shown as placeholders (// ...) for detecting and clearing Secret data. Could you briefly explain how the sanitizer will identify which resources are Secrets (e.g., by checking kind == "Secret")? Also, clarifying what "drop data" means for Secrets would be helpful—does it mean clearing the entire data field in the Kubernetes Secret structure, or specific sub-fields?
🤖 Prompt for AI Agents
In `@design/one-pager-pipeline-inspector.md` around lines 204 - 217, The sanitizer
must detect Kubernetes Secrets by checking the extracted metadata: if
meta.CompositeResourceKind (from BuildMetadata) equals "Secret" (compare
case-insensitively and optionally verify apiVersion/group indicates core/v1),
then treat it as sensitive; implement the sanitizer branch to "drop data" by
clearing the Secret payload fields (set the Secret's data and stringData fields
to an empty map or nil) and avoid returning any secret values (you can also
remove binaryData if present), and record/annotate that data was stripped so
callers know content was redacted. Ensure the check references
meta.CompositeResourceKind and meta.CompositeResourceAPIVersion and the
sanitizer function that currently contains the Secret placeholders so the logic
replaces the // ... with these explicit checks and field clears.
the implementation will address that.
@phisco, understood! It makes sense to work out those implementation details when coding the actual sanitizer. The design document establishes the intent to redact Secret data, and the specifics of detection and field handling can be refined during development.
Thanks for clarifying! 👍
🧠 Learnings used
Learnt from: CR
Repo: crossplane/crossplane PR: 0
File: contributing/README.md:0-0
Timestamp: 2025-12-02T14:41:54.485Z
Learning: Review code for functional correctness, proper documentation and intent explanation, adherence to Coding Style guidelines, idiomatic alignment with existing Crossplane APIs, sufficient test coverage, DCO sign-off, and passing CI jobs
Learnt from: CR
Repo: crossplane/crossplane PR: 0
File: contributing/README.md:0-0
Timestamp: 2025-12-02T14:41:54.485Z
Learning: Discuss changes in a GitHub issue before starting implementation; new features can only be merged during the active development period of a Crossplane release cycle
> Additionally, if any composed resource (observed or desired) is a Secret, its `data` field must
> also be cleared before emission. The `context` field is passed through as-is, since functions
> already have access to it and it typically contains non-sensitive configuration data.
Reconsider the security boundary for the context field.
The current design passes the pipeline `context` field through as-is with the reasoning that "functions already have access to it and it typically contains non-sensitive configuration data." However, I'm concerned about this for a few reasons:

- Different trust boundaries: Functions run as part of the pipeline with certain security constraints, while the sidecar is a separate component that may have different access patterns or be implemented by third parties.
- User-defined data: Functions can write arbitrary data to the context, and we can't guarantee it's non-sensitive. For example, a custom function might cache intermediate computed values, tokens, or other data.
- "Typically" isn't "always": The phrase "typically contains non-sensitive configuration data" acknowledges there could be exceptions.
Could you clarify the decision here? Options might include:
- Define an allowlist of context keys that are safe to emit
- Add a configuration option to control context emission
- Document that implementations should sanitize context
- Provide guidance on what should/shouldn't be written to context
Past reviews flagged this concern; I want to make sure we've thought through the security implications. Thanks!
Based on learnings, security review is critical for Crossplane.
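One of the options listed above (an allowlist of context keys) could be sketched as follows; the key names in the example are only illustrative, and the redaction marker is an arbitrary choice:

```go
package main

import "fmt"

// redactContext keeps context keys the operator marked safe verbatim and
// replaces everything else with a redaction marker, so emitted payloads never
// carry unreviewed user-written context values.
func redactContext(pipelineCtx map[string]any, allowed map[string]bool) map[string]any {
	out := make(map[string]any, len(pipelineCtx))
	for k, v := range pipelineCtx {
		if allowed[k] {
			out[k] = v
		} else {
			out[k] = "[REDACTED]"
		}
	}
	return out
}

func main() {
	ctx := map[string]any{
		"example.org/environment":  map[string]any{"region": "eu-west-1"},
		"example.org/cached-token": "s3cr3t",
	}
	allowed := map[string]bool{"example.org/environment": true}
	fmt.Println(redactContext(ctx, allowed)["example.org/cached-token"])
}
```

A denylist or a config flag disabling context emission entirely would be structurally similar; the choice is policy, not mechanism.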
🤖 Prompt for AI Agents
In `@design/one-pager-pipeline-inspector.md` around lines 347 - 349, The design
currently emits the pipeline "context" field as-is which creates a security
risk; update the document to treat context as potentially sensitive by (1)
proposing an allowlist approach for safe context keys and an explicit
API/configuration flag to control context emission, (2) requiring that the
sidecar or pipeline implementation sanitize or redact context entries (esp.
user-written values) before emitting composed resources, and (3) adding clear
guidance for authors about what may never be written to context (secrets,
tokens, PII) and documenting default behavior and override mechanism; reference
the "context" field, composed resource handling for Secrets (clear `data`), and
the sidecar/pipeline emission path in the text so reviewers can find and apply
these changes.
That's a fair concern, but there is little we can do here unfortunately. I think that's where we should draw the boundary between what we need to solve on the Crossplane side and what downstream implementations should solve.
@phisco, that makes sense—I appreciate the clarification on where the boundary lies. Following the pattern of minimal upstream hooks with downstream implementations handling the specifics is architecturally consistent with other Crossplane designs like change logs.
Since the responsibility for sanitizing the context field falls to downstream implementations, would it make sense to add a note in the design document highlighting this? Something like:
> Note on Context Security: The pipeline `context` field is emitted as-is. Downstream sidecar implementations are responsible for sanitizing or filtering context entries based on their security requirements and trust boundaries. Platform teams should review what data functions write to the context and configure their inspector implementations accordingly.
This would help ensure downstream implementers are aware of the security considerations when building their sidecars. But if you feel the current documentation is sufficient, I'm good with the decision as-is. Thanks for thinking through the design boundary with me! 🙏
✏️ Learnings added
Learnt from: phisco
Repo: crossplane/crossplane PR: 7025
File: design/one-pager-pipeline-inspector.md:347-349
Timestamp: 2026-01-22T17:27:31.137Z
Learning: In the Pipeline Inspector design for Crossplane, the pipeline `context` field is intentionally emitted as-is to downstream implementations. The security boundary is drawn such that Crossplane provides the minimal hook mechanism, and downstream sidecar implementations are responsible for sanitizing or filtering the context based on their security requirements.
@haarchri suggested that it'd be useful to also know whether the response was cached. Unfortunately, at a quick look it doesn't look so easy: the context only flows down, not up, so we can't leverage that directly to pass the information up from the cached runner to the inspecting wrapping runner. So I'd prefer to proceed with this and then add that in the future.
```go
// ...

// Inject step metadata into context
stepCtx := step.ContextWithStepMeta(ctx, traceID, stepIndex, compositionName, 0)
```
The iteration index starts from zero; it's increased if unsatisfied requirements are received and another call is needed.
dropped from the signature, given that it's always going to be initialized to 0 there
```go
}

// Generate a unique span_id for this function invocation
meta.SpanID = uuid.NewString()
```
Worth using ulid or some other ordered UID for span and trace IDs?
They have no such guarantee in OTEL, and although I agree it would be nice, I don't want to give more guarantees than required. It should be completely opaque for now.
```protobuf
// Name of the Composition defining this pipeline.
string composition_name = 6;

// UID of the composite resource being reconciled.
string composite_resource_uid = 7;

// Name of the composite resource being reconciled.
string composite_resource_name = 8;

// Namespace of the composite resource (empty for cluster-scoped resources).
string composite_resource_namespace = 9;

// API version of the composite resource (e.g., "example.org/v1").
string composite_resource_api_version = 10;

// Kind of the composite resource (e.g., "XDatabase").
string composite_resource_kind = 11;
```
What if it's an operation function? Worth having distinct sub-messages for composition and operation?
```protobuf
// The original function request as JSON bytes (with credentials stripped for security).
// This allows consumers to parse the request without needing the proto schema.
bytes request = 1;
```
The tradeoff of JSON bytes is more space on the wire and potentially on disk, right? Is expecting consumers to have the protobuf schema unreasonable?
I don't think they'd need the same schema as Crossplane. Though to be fair if they didn't they might miss fields that were actually in the request (which JSON wouldn't).
This is going to the sidecar, so the wire size is less of an issue. Then if the sidecar wants to convert it back to a more storage/network optimized encoding for its own purposes, it can. I think it makes sense to use plain json here and have as little assumptions as possible for sidecars.
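The "plain JSON, minimal assumptions for sidecars" point can be illustrated with stdlib JSON. In the real design the runner would presumably produce the bytes with protojson from the proto request; here a plain map stands in so the example stays dependency-free, and the sidecar-side decoding is the part that matters:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emitAsJSON stands in for marshaling the (sanitized) request; a map replaces
// the real proto message for illustration.
func emitAsJSON(req map[string]any) ([]byte, error) {
	return json.Marshal(req)
}

// sidecarDecode shows that a consumer can parse the payload generically,
// without the proto schema; fields it doesn't know about survive as map
// entries rather than being silently dropped.
func sidecarDecode(payload []byte) (map[string]any, error) {
	var out map[string]any
	err := json.Unmarshal(payload, &out)
	return out, err
}

func main() {
	payload, _ := emitAsJSON(map[string]any{
		"meta": map[string]any{"tag": "v1"},
		"observed": map[string]any{
			"composite": map[string]any{"resource": map[string]any{"kind": "XDatabase"}},
		},
	})
	decoded, _ := sidecarDecode(payload)
	fmt.Println(decoded["meta"].(map[string]any)["tag"]) // v1
}
```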
> ### gRPC Message Size Limits
>
> By default, each `EmitRequest` and `EmitResponse` call stays within the 4MB gRPC limit. However, users who have increased the `--max-recv-msg-size` on their functions may have larger payloads.
Does it? It's more likely to be within the limit, but I imagine a request very close to the limit + the step meta could push it over.
Description of your changes

This PR adds the Pipeline Inspector design document, which proposes a mechanism to capture `RunFunctionRequest` and `RunFunctionResponse` data for debugging and observability of function pipeline executions.

Fixes #

I have:

- Run `earthly +reviewable` to ensure this PR is ready for review. (design doc only)
- Added or updated unit tests. (design doc only)
- Added or updated e2e tests. (design doc only)
- Linked a PR or a docs tracking issue to document this change. (design doc only)
- Added `backport release-x.y` labels to auto-backport this PR. (design doc only)
- Followed the API promotion workflow if this PR introduces, removes, or promotes an API. (design doc only)

Need help with this checklist? See the cheat sheet.