Skip to content

Caplin req/resp: stream multi-chunk responses instead of materializing the full body in requestPeer #21616

@yperbasis

Description

@yperbasis

Follow-up from #21562 (raised in review by @domiwei).

Background

#21562 made cl/sentinel/httpreqresp stream the peer response through a per-topic size cap instead of buffering it in the handler, so memory is now bounded by the per-request byte budget. That fixes the original unbounded-buffering OOM.

Remaining improvement

The full response is still materialized once, in service.requestPeer:

data, err := io.ReadAll(resp.Body)
...
return &sentinelproto.ResponseData{Data: data, ...}

For a large by_range request the budget is large (~ requested-count x per-chunk), so this is still a big single allocation; and ResponseData.Data is a single gRPC bytes field, so the whole response also crosses the sentinel->cl/rpc boundary as one blob before parseResponseData decodes it.

Proposal

Decode req/resp chunks incrementally rather than buffering the whole response:

  • keep max_response_bytes as a wire-budget guard, enforced by the counting reader already in place;
  • decode chunks incrementally from the io.Reader, rejecting as soon as the byte budget / chunk count / per-chunk size is exceeded;
  • avoid the second full copy in requestPeer — this likely requires the sentinel SendRequest/SendPeerRequest gRPC to become server-streaming, since ResponseData.Data is one bytes field today.

Peak memory would then be ~ one chunk + the decoded object, the right shape for multi-chunk req/resp.

Note

Not a live OOM — #21562 already bounds the allocation by the per-request budget. This is memory-shape hardening, hence a regular enhancement.

Metadata

Metadata

Assignees

Labels

CaplinCaplin: Consensus Layer, Beacon APINetworking

Type

No type
No fields configured for issues without a type.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions