Merged
Replace stack-allocated 16 KB buffers in pump() with pooled 4 KB slices. Stack-allocated arrays force goroutine stacks to grow to 32 KB and never shrink. Pooled heap buffers keep stacks at 2-4 KB. 4 KB is safe because the TLS layer handles record reassembly internally — smaller relay chunks do not increase syscalls. Replace the ctx.Done() cleanup goroutine with context.AfterFunc, which avoids a dedicated goroutine during the relay lifetime.
Replace stack-allocated 16 KB buffer in start() with a pooled slice, reducing the goroutine stack from 32 KB to 2-4 KB. Merge Clock's timer goroutine directly into the start() loop, eliminating one goroutine and one channel per connection. The semantics are preserved: timer fires, data is processed, timer resets. Backpressure works identically since the timer is not reset until the current iteration completes. Remove clock.go and clock_test.go — Clock behavior is covered by the existing conn_test.go integration tests.
Replace the ctx.Done() goroutine in ServeConn with context.AfterFunc. This eliminates a goroutine that was alive for the entire connection duration, saving ~2 KB of stack per connection. The AfterFunc callback only spawns a goroutine when cancellation actually occurs.
Summary
- Move relay buffers to `sync.Pool` and reduce from 16 KB to 4 KB: stack-allocated arrays forced goroutine stacks to 32 KB (never shrink); pooled heap buffers keep stacks at 2–4 KB
- Merge the Clock timer goroutine into `start()`: eliminates one goroutine + one channel per connection
- Use `context.AfterFunc` in relay and proxy: saves ~4 KB per connection (two fewer goroutines)

Per-connection impact
Production measurement (Amsterdam)
Old binary: 27 228 KB RSS @ ~45 connections → ~160 KB/conn
New binary: 25 664 KB RSS @ 61 connections → ~93 KB/conn
Observed reduction: ~42% per connection.
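The per-connection figures above do not equal total RSS divided by connection count. They appear consistent with first subtracting a fixed process baseline of roughly 20 000 KB. That baseline is an inference from the arithmetic, not something the PR states. Under that assumption, the calculation can be sketched as:

```go
package main

import "fmt"

// perConn returns the RSS attributable to each connection, in KB, after
// subtracting a fixed process baseline.
func perConn(rssKB, baselineKB float64, conns int) float64 {
	return (rssKB - baselineKB) / float64(conns)
}

func main() {
	// Assumption: ~20 000 KB baseline, inferred from the reported
	// figures rather than taken from the PR.
	const baselineKB = 20000

	oldKB := perConn(27228, baselineKB, 45)
	newKB := perConn(25664, baselineKB, 61)
	reduction := 100 * (1 - newKB/oldKB)

	fmt.Printf("old: %.1f KB/conn\n", oldKB)
	fmt.Printf("new: %.1f KB/conn\n", newKB)
	fmt.Printf("reduction: %.1f%%\n", reduction)
}
```

With this baseline the results land close to the reported ~160 KB/conn, ~93 KB/conn, and ~42%. A different baseline would shift all three numbers.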
Safety
- `context.AfterFunc` is semantically equivalent to the replaced goroutines
- Tests pass under `-race`

Test plan
- `go vet ./...`
- `go test ./...`: all packages pass
- `go test -race ./mtglib/internal/relay/ ./mtglib/internal/doppel/ ./mtglib/`: no races
- `GOOS=linux GOARCH=amd64`