
mem: Optimize buffer object re-use #8784

Merged
arjan-bal merged 7 commits into grpc:master from arjan-bal:better-recycle-buffer-objects
Jan 9, 2026

@arjan-bal arjan-bal commented Dec 22, 2025

Contributor

Splitting a buffer fetches a new buffer object from a sync.Pool. Both objects share one ref count, and a buffer object is returned to the pool only when that shared count falls to 0. As a result, only one of the two buffer objects is returned to the pool for re-use. The "leaked" buffer objects may cause noticeable allocations when buffers are split frequently. I noticed this when attempting to remove a buffer copy by replacing the bufio.Reader.
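
To illustrate the problem, here is a minimal sketch of the shared-count behavior described above. It is not the grpc-go code: oldBuffer, countingPool, and their methods are hypothetical names, and a plain counter stands in for sync.Pool so the outcome is deterministic.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// countingPool is a hypothetical stand-in for sync.Pool. It only counts how
// many buffer objects were handed out and how many came back.
type countingPool struct{ gets, puts int }

// oldBuffer models the layout before this PR: both halves of a split point at
// the same reference count.
type oldBuffer struct {
	data []byte
	refs *atomic.Int32
	pool *countingPool
}

func newOldBuffer(data []byte, pool *countingPool) *oldBuffer {
	pool.gets++
	refs := new(atomic.Int32)
	refs.Store(1)
	return &oldBuffer{data: data, refs: refs, pool: pool}
}

// split takes a second object from the pool and shares the counter with it.
func (b *oldBuffer) split(n int) (left, right *oldBuffer) {
	b.pool.gets++
	b.refs.Add(1)
	right = &oldBuffer{data: b.data[n:], refs: b.refs, pool: b.pool}
	b.data = b.data[:n]
	return b, right
}

// free returns the object to the pool only when the shared count hits zero,
// so whichever half is freed first is never returned.
func (b *oldBuffer) free() {
	if b.refs.Add(-1) == 0 {
		b.pool.puts++
	}
}

// leakedAfterSplit reports how many objects were taken but never returned.
func leakedAfterSplit() int {
	pool := &countingPool{}
	left, right := newOldBuffer(make([]byte, 8), pool).split(4)
	left.free()
	right.free()
	return pool.gets - pool.puts
}

func main() {
	fmt.Println("objects not returned:", leakedAfterSplit()) // prints "objects not returned: 1"
}
```

Each split in this sketch costs one object that the pool never sees again, which would match the 1 alloc/op reported for the split case on master.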

Solution

This PR introduces a root-owner model for the underlying *[]byte within buffer objects. The root object manages the slice's lifecycle, returning it to the pool only when its reference count reaches zero.

When a buffer is split, the new buffer is treated as a child, incrementing the ref counts for both itself and the root. Once a child’s ref count hits zero, it returns itself to the pool and decrements the root’s count.

Additionally, this PR removes the sync.Pool used for *atomic.Int32 by embedding atomic.Int32 as a value field within the buffer struct. By eliminating the second pool and the associated pointer indirection, we reduce allocation overhead and improve cache locality during buffer lifecycle events.
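
The root-owner model can be sketched roughly as follows. This is a simplified illustration under assumptions, not the code in mem/buffers.go: the buffer type, its fields, and the counters struct (which replaces the real pools) are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counters is a hypothetical stand-in for the pools involved: it counts
// returned buffer objects and returned backing slices.
type counters struct{ objectsReturned, slicesReturned int }

// buffer is an illustrative type, not the grpc-go struct.
type buffer struct {
	data []byte
	refs atomic.Int32 // held by value, so no second pool is needed
	root *buffer      // a root buffer points at itself
	c    *counters
}

func newBuffer(data []byte, c *counters) *buffer {
	b := &buffer{data: data, c: c}
	b.root = b
	b.refs.Store(1)
	return b
}

// split keeps data[:n] in b and returns a child for data[n:]. The child has
// its own count and holds one reference on the root.
func (b *buffer) split(n int) (left, right *buffer) {
	right = &buffer{data: b.data[n:], root: b.root, c: b.c}
	right.refs.Store(1)
	b.root.refs.Add(1)
	b.data = b.data[:n]
	return b, right
}

func (b *buffer) free() {
	if b.refs.Add(-1) != 0 {
		return
	}
	root := b.root
	b.root = nil
	if root == b {
		b.c.slicesReturned++ // only the root releases the backing slice
	} else {
		root.free() // a child drops its reference on the root
	}
	b.c.objectsReturned++ // every object goes back once its own count is zero
}

// splitAndFree splits one buffer, frees both halves, and reports the counts.
func splitAndFree() counters {
	c := &counters{}
	left, right := newBuffer(make([]byte, 8), c).split(4)
	left.free()
	right.free()
	return *c
}

func main() {
	c := splitAndFree()
	fmt.Println(c.objectsReturned, c.slicesReturned) // prints "2 1"
}
```

In this sketch both buffer objects are returned and the backing slice is released exactly once, regardless of which half is freed first.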

Benchmarks

A micro-benchmark showing the buffer object leak:

func BenchmarkSplit(b *testing.B) {
	pool := mem.DefaultBufferPool()
	b.Run("split", func(b *testing.B) {
		for b.Loop() {
			size := 1 << 15 // 32 KB
			slice := pool.Get(size)
			buf := mem.NewBuffer(slice, pool)
			left, right := mem.SplitUnsafe(buf, size/2)
			left.Free()
			right.Free()
		}
	})
	b.Run("no-split", func(b *testing.B) {
		for b.Loop() {
			size := 1 << 15 // 32 KB
			slice := pool.Get(size)
			buf := mem.NewBuffer(slice, pool)
			buf.Free()
		}
	})
}

Results on master (old.txt) vs. this PR (new.txt):

goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
                  │   old.txt   │               new.txt               │
                  │   sec/op    │   sec/op     vs base                │
Split/split-48      418.2n ± 0%   263.9n ± 1%  -36.89% (p=0.000 n=10)
Split/no-split-48   221.1n ± 1%   208.5n ± 0%   -5.70% (p=0.000 n=10)
geomean             304.1n        234.6n       -22.86%

                  │   old.txt    │                 new.txt                 │
                  │     B/op     │    B/op     vs base                     │
Split/split-48      64.00 ± 0%      0.00 ± 0%  -100.00% (p=0.000 n=10)
Split/no-split-48   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                        ²               ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                  │   old.txt    │                 new.txt                 │
                  │  allocs/op   │ allocs/op   vs base                     │
Split/split-48      1.000 ± 0%     0.000 ± 0%  -100.00% (p=0.000 n=10)
Split/no-split-48   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                        ²               ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

The effect on local gRPC benchmarks is negligible since the SplitUnsafe function isn't called very frequently.

$ go run benchmark/benchresult/main.go unary-before unary-after
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_120-reqSize_16000B-respSize_16000B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_simple-sharedWriteBuffer_false
               Title       Before        After Percentage
            TotalOps      2985694      3024364     1.30%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     74784.94     74784.99     0.00%
           Allocs/op       133.67       133.89     0.00%
             ReqT/op 6369480533.33 6451976533.33     1.30%
            RespT/op 6369480533.33 6451976533.33     1.30%
            50th-Lat   2.410033ms    2.40116ms    -0.37%
            90th-Lat   3.145118ms   3.081771ms    -2.01%
            99th-Lat   3.563055ms   3.629663ms     1.87%
             Avg-Lat   2.410529ms   2.379513ms    -1.29%
           GoVersion     go1.24.8     go1.24.8
         GrpcVersion   1.78.0-dev   1.78.0-dev

RELEASE NOTES:

  • mem: Improve pooling of buffer objects when using SplitUnsafe.

@arjan-bal arjan-bal added this to the 1.79 Release milestone Dec 22, 2025
@arjan-bal arjan-bal added the "Type: Performance" and "Area: Transport" labels Dec 22, 2025
codecov Bot commented Dec 22, 2025

Codecov Report

❌ Patch coverage is 94.44444% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 83.43%. Comparing base (81a00ce) to head (45d246d).
⚠️ Report is 33 commits behind head on master.

Files with missing lines Patch % Lines
mem/buffers.go 94.44% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8784      +/-   ##
==========================================
+ Coverage   83.22%   83.43%   +0.21%     
==========================================
  Files         418      417       -1     
  Lines       32385    32952     +567     
==========================================
+ Hits        26952    27494     +542     
- Misses       4050     4064      +14     
- Partials     1383     1394      +11     
Files with missing lines Coverage Δ
mem/buffers.go 88.17% <94.44%> (+1.30%) ⬆️

... and 51 files with indirect coverage changes

@arjan-bal arjan-bal force-pushed the better-recycle-buffer-objects branch from 45c3231 to e1a28ac Compare December 22, 2025 10:11
@arjan-bal arjan-bal requested review from dfawley and easwars December 22, 2025 10:38
@arjan-bal arjan-bal assigned easwars, dfawley and arjan-bal and unassigned easwars and dfawley Dec 22, 2025
@arjan-bal arjan-bal force-pushed the better-recycle-buffer-objects branch 2 times, most recently from 7619ed9 to c33b2ae Compare December 23, 2025 09:42
@arjan-bal arjan-bal force-pushed the better-recycle-buffer-objects branch from c33b2ae to 3331987 Compare December 23, 2025 09:56
@arjan-bal arjan-bal changed the title from "mem: Correctly recycle buffer objects after SplitUnsafe" to "mem: Optimize buffer re-use" Dec 23, 2025
@arjan-bal arjan-bal changed the title from "mem: Optimize buffer re-use" to "mem: Optimize buffer object re-use" Dec 23, 2025
@arjan-bal arjan-bal assigned easwars and dfawley and unassigned arjan-bal Dec 26, 2025
Comment thread mem/buffers.go Outdated
Comment on lines +76 to +79
// initialized enables sanity checks without the overhead of atomic
// operations. This field is not safe for concurrent access and is used in a
// best-effort manner for assertion purposes only. It does not play a role
// in the concurrent logic of reference counting.

Contributor

Couple of things here:

  • The Buffer interface documentation does state that a buffer is not safe for concurrent access. Given that, do we need this to be mentioned here?
  • Do you have an idea of how much overhead the atomic operation of checking if the ref count is zero causes? The reason I'm asking is because this new field (and the checks associated with it) are sprinkled across multiple methods and I'm wondering if the code complexity (and the maintenance costs) are worth it?

Contributor

I'm also a little confused about this line from the docstring:

// Note that a Buffer is not safe for concurrent access and instead each
// goroutine should use its own reference to the data, which can be acquired via
// a call to Ref().

A call to Ref simply increments the reference count. It does not return a new reference to the existing buffer that can be used concurrently. Do we ever use buffers concurrently?

Also, why did we earlier have a pointer to an atomic and not store the atomic by value?

Contributor Author

The Buffer interface documentation states that a buffer is not safe for concurrent access. Given that, do we need to explicitly mention this here?

A call to Ref simply increments the reference count. It does not return a new reference to the existing buffer that can be used concurrently. Do we ever use buffers concurrently?

In the initial design, buf.Ref() likely returned a new object intended to be transferred to a separate goroutine:

ref := buf.Ref()
go func() {
  // use ref here
}()
buf.Free()

However, in the merged implementation, Ref does not return a new object. So, the usage pattern becomes:

buf.Ref()
go func() {
  // use buf here
}()
buf.Free()

Technically, this implies buf is being accessed concurrently. However, the specific pattern that is unsafe is attempting to reference buf in a new goroutine without incrementing the count first:

go func() {
  // Unsafe: Race condition with buf.Free() below
  ref := buf.Ref() 
}()
buf.Free()

Source: #8209 (comment)

Yes, we do follow the safe pattern above by pushing data frame buffers into an unbounded channel to be consumed by another goroutine.
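
As a rough illustration of the safe hand-off, the sketch below takes the consumer's reference before the buffer is sent over a channel. refCounted is a hypothetical stand-in and not the mem.Buffer API; it only tracks a count and whether the final Free has happened.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// refCounted is a hypothetical stand-in for a reference-counted buffer.
type refCounted struct {
	refs  atomic.Int32
	freed atomic.Bool
}

func newRefCounted() *refCounted {
	r := &refCounted{}
	r.refs.Store(1)
	return r
}

// Ref adds a reference. It panics if the count had already dropped to zero.
func (r *refCounted) Ref() {
	if r.refs.Add(1) <= 1 {
		panic("Ref called on a freed buffer")
	}
}

// Free drops a reference and marks the buffer freed on the last one.
func (r *refCounted) Free() {
	if r.refs.Add(-1) == 0 {
		r.freed.Store(true)
	}
}

// handOff passes a buffer to another goroutine through a channel. It reports
// whether the consumer saw a live buffer and whether the buffer was freed
// once both sides were done.
func handOff() (aliveInConsumer, freedAtEnd bool) {
	buf := newRefCounted()
	frames := make(chan *refCounted, 1)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		b := <-frames
		aliveInConsumer = !b.freed.Load()
		b.Free() // release the consumer's reference
	}()
	buf.Ref() // take the consumer's reference before handing off
	frames <- buf
	buf.Free() // release the producer's reference
	wg.Wait()
	return aliveInConsumer, buf.freed.Load()
}

func main() {
	alive, freed := handOff()
	fmt.Println(alive, freed) // prints "true true"
}
```

Because Ref runs on the producer side before the send, the producer's Free can never be the last one while the consumer still holds the buffer.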

Contributor Author

Do you have an idea of how much overhead the atomic operation of checking if the ref count is zero causes? The reason I'm asking is because this new field (and the checks associated with it) are sprinkled across multiple methods and I'm wondering if the code complexity (and the maintenance costs) are worth it?

Earlier there was a check, if b.refs == nil, which is not possible with a non-pointer field. The initialized field takes its place so that the same sanity checks remain in place and covered by tests.

There are some methods, such as Ref and Free, which perform atomic operations anyway, so we can check the return value for validation. However, for methods like ReadOnlyData that don't perform atomic operations, the overhead of adding one is significant. According to Gemini, an atomic operation is roughly 10x-15x slower than a similar non-atomic operation under low contention, and the difference becomes orders of magnitude larger under high contention.

Contributor Author

Also, why did we earlier have a pointer to an atomic and not store the atomic by value?

Previously, the new buffer created by SplitUnsafe pointed to the same atomic.Int32 as the original buffer, which required the field to be a pointer. Now, the new object maintains its own ref count and stores a pointer to the original buffer instead. Therefore, the reference count (atomic.Int32) no longer needs to be a pointer.

Contributor

Thank you for the information. That helps.

I would still like to see whether the initialized field actually brings any significant performance improvement. The if b.refs == nil check could also be replaced with an if b.refs.Load() == 0 check if there is no significant performance impact.

Contributor Author

I ran the following microbenchmark for measuring the impact of introducing an atomic load operation in ReadOnlyData:

func BenchmarkSplit(b *testing.B) {
	pool := mem.DefaultBufferPool()
	size := 1 << 15 // 32 KB
	slice := pool.Get(size)
	buf := mem.NewBuffer(slice, pool)
	b.Run("read-only-data", func(b *testing.B) {
		for b.Loop() {
			_ = buf.ReadOnlyData()
		}
	})
	buf.Free()
}

Here are the results:

goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
                        │ non-atomic.txt │          atomic.txt           │             master.txt             │
                        │     sec/op     │   sec/op     vs base          │   sec/op     vs base               │
Split/read-only-data-48      2.005n ± 1%   2.020n ± 2%  ~ (p=0.137 n=10)   2.124n ± 1%  +5.94% (p=0.000 n=10)

                        │ non-atomic.txt │           atomic.txt           │           master.txt           │
                        │      B/op      │    B/op     vs base            │    B/op     vs base            │
Split/read-only-data-48       0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

                        │ non-atomic.txt │           atomic.txt           │           master.txt           │
                        │   allocs/op    │ allocs/op   vs base            │ allocs/op   vs base            │
Split/read-only-data-48       0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

The atomic version is within noise of the non-atomic version (2.020 ns vs. 2.005 ns, about 0.7% slower, p=0.137). Both are faster than the master branch. I've updated the code to use the atomic for the sanity checks too.

Contributor Author

I realized that we can use the rootBuf pointer to check whether the buffer has been initialized in the read methods. This avoids the atomic operation, since the rootBuf field is set during initialization and unset before the buffer is sent back to the pool. This brings the performance to the same level as the non-atomic version.

goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
                        │ non-atomic.txt │          rootbuf.txt          │
                        │     sec/op     │   sec/op     vs base          │
Split/read-only-data-48      2.005n ± 1%   2.014n ± 0%  ~ (p=0.305 n=10)

                        │ non-atomic.txt │          rootbuf.txt           │
                        │      B/op      │    B/op     vs base            │
Split/read-only-data-48       0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

                        │ non-atomic.txt │          rootbuf.txt           │
                        │   allocs/op    │ allocs/op   vs base            │
Split/read-only-data-48       0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
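
A minimal sketch of the rootBuf-based check, assuming a simplified buffer type (the struct, its fields, and the panic message are illustrative, not copied from mem/buffers.go):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// buffer is an illustrative type, not the grpc-go struct.
type buffer struct {
	data    []byte
	refs    atomic.Int32
	rootBuf *buffer // non-nil only between initialization and the final free
}

func newBuffer(data []byte) *buffer {
	b := &buffer{data: data}
	b.rootBuf = b
	b.refs.Store(1)
	return b
}

// readOnlyData checks a plain pointer instead of loading the atomic count.
func (b *buffer) readOnlyData() []byte {
	if b.rootBuf == nil {
		panic("cannot read freed buffer")
	}
	return b.data
}

// free clears rootBuf on the last reference, which is the point where the
// object would go back to a pool.
func (b *buffer) free() {
	if b.refs.Add(-1) == 0 {
		b.data = nil
		b.rootBuf = nil
	}
}

// panics reports whether f panicked.
func panics(f func()) (did bool) {
	defer func() { did = recover() != nil }()
	f()
	return false
}

func main() {
	b := newBuffer([]byte("abc"))
	fmt.Println(len(b.readOnlyData())) // prints "3"
	b.free()
	fmt.Println(panics(func() { b.readOnlyData() })) // prints "true"
}
```

Like the earlier initialized field, the nil check is best-effort: it catches use after free on the same goroutine without adding an atomic load to the read path.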

@easwars easwars assigned arjan-bal and unassigned easwars and dfawley Jan 6, 2026
@arjan-bal arjan-bal removed their assignment Jan 6, 2026

@easwars easwars left a comment

Contributor

The only thing I need convincing about is the use of the initialized field, as opposed to directly checking the reference count with an atomic read of the value. Otherwise LGTM.

@easwars easwars assigned arjan-bal and unassigned easwars Jan 7, 2026
@arjan-bal arjan-bal assigned easwars and unassigned arjan-bal Jan 8, 2026
@easwars easwars assigned arjan-bal and unassigned easwars Jan 8, 2026
@arjan-bal arjan-bal merged commit d675ef5 into grpc:master Jan 9, 2026
14 checks passed
@arjan-bal arjan-bal deleted the better-recycle-buffer-objects branch January 9, 2026 06:27
mbissa pushed a commit to mbissa/grpc-go that referenced this pull request Feb 16, 2026
nschloe pushed a commit to live-clones/forgejo that referenced this pull request May 28, 2026
…jo) (#12794)

This PR contains the following updates:

| Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|
| [google.golang.org/grpc](https://github.com/grpc/grpc-go) | `v1.75.0` → `v1.79.3` | ![age](https://developer.mend.io/api/mc/badges/age/go/google.golang.org%2fgrpc/v1.79.3?slim=true) | ![confidence](https://developer.mend.io/api/mc/badges/confidence/go/google.golang.org%2fgrpc/v1.75.0/v1.79.3?slim=true) |

---

### gRPC-Go has an authorization bypass via missing leading slash in :path
[CVE-2026-33186](https://nvd.nist.gov/vuln/detail/CVE-2026-33186) / [GHSA-p77j-4mvh-x3m3](GHSA-p77j-4mvh-x3m3) / [GO-2026-4762](https://pkg.go.dev/vuln/GO-2026-4762)

<details>
<summary>More information</summary>

#### Details
##### Impact
_What kind of vulnerability is it? Who is impacted?_

It is an **Authorization Bypass** resulting from **Improper Input Validation** of the HTTP/2 `:path` pseudo-header.

The gRPC-Go server was too lenient in its routing logic, accepting requests where the `:path` omitted the mandatory leading slash (e.g., `Service/Method` instead of `/Service/Method`). While the server successfully routed these requests to the correct handler, authorization interceptors (including the official `grpc/authz` package) evaluated the raw, non-canonical path string. Consequently, "deny" rules defined using canonical paths (starting with `/`) failed to match the incoming request, allowing it to bypass the policy if a fallback "allow" rule was present.

**Who is impacted?**
This affects gRPC-Go servers that meet both of the following criteria:
1. They use path-based authorization interceptors, such as the official RBAC implementation in `google.golang.org/grpc/authz` or custom interceptors relying on `info.FullMethod` or `grpc.Method(ctx)`.
2. Their security policy contains specific "deny" rules for canonical paths but allows other requests by default (a fallback "allow" rule).

The vulnerability is exploitable by an attacker who can send raw HTTP/2 frames with malformed `:path` headers directly to the gRPC server.

##### Patches
_Has the problem been patched? What versions should users upgrade to?_

Yes, the issue has been patched. The fix ensures that any request with a `:path` that does not start with a leading slash is immediately rejected with a `codes.Unimplemented` error, preventing it from reaching authorization interceptors or handlers with a non-canonical path string.

Users should upgrade to the following versions (or newer):
* **v1.79.3**
* The latest **master** branch.

It is recommended that all users employing path-based authorization (especially `grpc/authz`) upgrade as soon as the patch is available in a tagged release.

##### Workarounds
_Is there a way for users to fix or remediate the vulnerability without upgrading?_

While upgrading is the most secure and recommended path, users can mitigate the vulnerability using one of the following methods:

##### 1. Use a Validating Interceptor (Recommended Mitigation)
Add an "outermost" interceptor to your server that validates the path before any other authorization logic runs:

```go
func pathValidationInterceptor(ctx context.Context, req any, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
    if info.FullMethod == "" || info.FullMethod[0] != '/' {
        return nil, status.Errorf(codes.Unimplemented, "malformed method name")
    }
    return handler(ctx, req)
}

// Ensure this is the FIRST interceptor in your chain
s := grpc.NewServer(
    grpc.ChainUnaryInterceptor(pathValidationInterceptor, authzInterceptor),
)
```

##### 2. Infrastructure-Level Normalization
If your gRPC server is behind a reverse proxy or load balancer (such as Envoy, NGINX, or an L7 Cloud Load Balancer), ensure it is configured to enforce strict HTTP/2 compliance for pseudo-headers and reject or normalize requests where the `:path` header does not start with a leading slash.

##### 3. Policy Hardening
Switch to a "default deny" posture in your authorization policies (explicitly listing all allowed paths and denying everything else) to reduce the risk of bypasses via malformed inputs.
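
As a rough sketch of what "default deny" means here, the check below rejects anything that is not explicitly listed. The `allowed` map, the method names in it, and the `authorize` helper are hypothetical and not part of `grpc/authz`.

```go
package main

import "fmt"

// allowed is a hypothetical allow list keyed by canonical full method name.
var allowed = map[string]bool{
	"/pkg.Service/Get": true,
}

// authorize implements default deny: anything not listed is rejected,
// including non-canonical spellings of a listed method.
func authorize(fullMethod string) bool {
	return allowed[fullMethod]
}

func main() {
	fmt.Println(authorize("/pkg.Service/Get"))   // prints "true"
	fmt.Println(authorize("pkg.Service/Get"))    // prints "false"
	fmt.Println(authorize("/pkg.Service/Admin")) // prints "false"
}
```

With this posture a path missing its leading slash simply fails to match any entry, so it is denied rather than falling through to an "allow" rule.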

#### Severity
- CVSS Score: 9.1 / 10 (Critical)
- Vector String: `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N`

#### References
- [https://github.com/grpc/grpc-go/security/advisories/GHSA-p77j-4mvh-x3m3](https://github.com/grpc/grpc-go/security/advisories/GHSA-p77j-4mvh-x3m3)
- [https://nvd.nist.gov/vuln/detail/CVE-2026-33186](https://nvd.nist.gov/vuln/detail/CVE-2026-33186)
- [https://github.com/grpc/grpc-go](https://github.com/grpc/grpc-go)

This data is provided by [OSV](https://osv.dev/vulnerability/GHSA-p77j-4mvh-x3m3) and the [GitHub Advisory Database](https://github.com/github/advisory-database) ([CC-BY 4.0](https://github.com/github/advisory-database/blob/main/LICENSE.md)).
</details>

---

### Authorization bypass in gRPC-Go via missing leading slash in :path in google.golang.org/grpc
[CVE-2026-33186](https://nvd.nist.gov/vuln/detail/CVE-2026-33186) / [GHSA-p77j-4mvh-x3m3](GHSA-p77j-4mvh-x3m3) / [GO-2026-4762](https://pkg.go.dev/vuln/GO-2026-4762)

<details>
<summary>More information</summary>

#### Details
Authorization bypass in gRPC-Go via missing leading slash in :path in google.golang.org/grpc

#### Severity
Unknown

#### References
- [https://github.com/grpc/grpc-go/security/advisories/GHSA-p77j-4mvh-x3m3](https://github.com/grpc/grpc-go/security/advisories/GHSA-p77j-4mvh-x3m3)

This data is provided by [OSV](https://osv.dev/vulnerability/GO-2026-4762) and the [Go Vulnerability Database](https://github.com/golang/vulndb) ([CC-BY 4.0](https://github.com/golang/vulndb#license)).
</details>

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.79.3`](https://github.com/grpc/grpc-go/releases/tag/v1.79.3): Release 1.79.3

[Compare Source](grpc/grpc-go@v1.79.2...v1.79.3)

### Security

- server: fix an authorization bypass where malformed :path headers (missing the leading slash) could bypass path-based restricted "deny" rules in interceptors like `grpc/authz`. Any request with a non-canonical path is now immediately rejected with an `Unimplemented` error. ([#8981](grpc/grpc-go#8981))

### [`v1.79.2`](https://github.com/grpc/grpc-go/releases/tag/v1.79.2): Release 1.79.2

[Compare Source](grpc/grpc-go@v1.79.1...v1.79.2)

### Bug Fixes

- stats: Prevent redundant error logging in health/ORCA producers by skipping stats/tracing processing when no stats handler is configured. ([#8874](grpc/grpc-go#8874))

### [`v1.79.1`](https://github.com/grpc/grpc-go/releases/tag/v1.79.1): Release 1.79.1

[Compare Source](grpc/grpc-go@v1.79.0...v1.79.1)

### Bug Fixes

- grpc: Remove the `-dev` suffix from the User-Agent header. ([#8902](grpc/grpc-go#8902))

### [`v1.79.0`](https://github.com/grpc/grpc-go/releases/tag/v1.79.0): Release 1.79.0

[Compare Source](grpc/grpc-go@v1.78.0...v1.79.0)

### API Changes

- mem: Add experimental API `SetDefaultBufferPool` to change the default buffer pool. ([#8806](grpc/grpc-go#8806))
  - Special Thanks: [@vanja-p](https://github.com/vanja-p)
- experimental/stats: Update `MetricsRecorder` to require embedding the new `UnimplementedMetricsRecorder` (a no-op struct) in all implementations for forward compatibility. ([#8780](grpc/grpc-go#8780))

### Behavior Changes

- balancer/weightedtarget: Remove handling of `Addresses` and only handle `Endpoints` in resolver updates. ([#8841](grpc/grpc-go#8841))

### New Features

- experimental/stats: Add support for asynchronous gauge metrics through the new `AsyncMetricReporter` and `RegisterAsyncReporter` APIs. ([#8780](grpc/grpc-go#8780))
- pickfirst: Add support for weighted random shuffling of endpoints, as described in [gRFC A113](grpc/proposal#535).
  - This is enabled by default, and can be turned off using the environment variable `GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING`. ([#8864](grpc/grpc-go#8864))
- xds: Implement `:authority` rewriting, as specified in [gRFC A81](https://github.com/grpc/proposal/blob/master/A81-xds-authority-rewriting.md). ([#8779](grpc/grpc-go#8779))
- balancer/randomsubsetting: Implement the `random_subsetting` LB policy, as specified in [gRFC A68](https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md). ([#8650](grpc/grpc-go#8650))
  - Special Thanks: [@marek-szews](https://github.com/marek-szews)

### Bug Fixes

- credentials/tls: Fix a bug where the port was not stripped from the authority override before validation. ([#8726](grpc/grpc-go#8726))
  - Special Thanks: [@Atul1710](https://github.com/Atul1710)
- xds/priority: Fix a bug causing delayed failover to lower-priority clusters when a higher-priority cluster is stuck in `CONNECTING` state. ([#8813](grpc/grpc-go#8813))
- health: Fix a bug where health checks failed for clients using legacy compression options (`WithDecompressor` or `RPCDecompressor`). ([#8765](grpc/grpc-go#8765))
  - Special Thanks: [@sanki92](https://github.com/sanki92)
- transport: Fix an issue where the HTTP/2 server could skip header size checks when terminating a stream early. ([#8769](grpc/grpc-go#8769))
  - Special Thanks: [@joybestourous](https://github.com/joybestourous)
- server: Propagate status detail headers, if available, when terminating a stream during request header processing. ([#8754](grpc/grpc-go#8754))
  - Special Thanks: [@joybestourous](https://github.com/joybestourous)

### Performance Improvements

- credentials/alts: Optimize read buffer alignment to reduce copies. ([#&#8203;8791](grpc/grpc-go#8791))
- mem: Optimize pooling and creation of `buffer` objects. ([#&#8203;8784](grpc/grpc-go#8784))
- transport: Reduce slice re-allocations by reserving slice capacity. ([#&#8203;8797](grpc/grpc-go#8797))
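
As a rough illustration of the last item, the general Go idiom of reserving slice capacity up front looks like the sketch below. This is not the code from the transport change; `joinWithReserve` is a hypothetical helper that only shows why a known final size lets `append` avoid growing the backing array.

```go
package main

import "fmt"

// joinWithReserve is a hypothetical helper, not part of grpc-go.
// It computes the final length first and reserves that capacity,
// so the appends below never re-allocate the backing array.
func joinWithReserve(parts [][]byte) []byte {
	total := 0
	for _, p := range parts {
		total += len(p)
	}
	out := make([]byte, 0, total) // reserve capacity once
	for _, p := range parts {
		out = append(out, p...)
	}
	return out
}

func main() {
	out := joinWithReserve([][]byte{[]byte("grpc"), []byte("-"), []byte("go")})
	fmt.Println(string(out), len(out), cap(out))
}
```

Without the reserved capacity, each `append` that outgrows the current backing array would allocate a larger one and copy the existing bytes over.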

### [`v1.78.0`](https://github.com/grpc/grpc-go/releases/tag/v1.78.0): Release 1.78.0

[Compare Source](grpc/grpc-go@v1.77.0...v1.78.0)

### Behavior Changes

- client: Align URL validation with Go 1.26+ to now reject target URLs with unbracketed colons in the hostname. ([#&#8203;8716](grpc/grpc-go#8716))
  - Special Thanks: [@&#8203;neild](https://github.com/neild)
- transport/client: Return status code `Unknown` on malformed `grpc-status`. ([#&#8203;8735](grpc/grpc-go#8735))
- xds/resolver:
  - Drop previous route resources and report an error when no matching virtual host is found.
  - Only log LDS/RDS configuration errors following a successful update and retain the last valid resource to prevent transient failures. ([#&#8203;8711](grpc/grpc-go#8711))

### New Features

- stats/otel: Add backend service label to weighted round robin metrics as part of A89. ([#&#8203;8737](grpc/grpc-go#8737))
- stats/otel: Add subchannel metrics (without the disconnection reason) to eventually replace the pickfirst metrics. ([#&#8203;8738](grpc/grpc-go#8738))
- client: Wait for all pending goroutines to complete when closing a graceful switch balancer. ([#&#8203;8746](grpc/grpc-go#8746))
  - Special Thanks: [@&#8203;twz123](https://github.com/twz123)
- client: Add `experimental.AcceptCompressors` so callers can restrict the `grpc-accept-encoding` header advertised for a call. ([#&#8203;8718](grpc/grpc-go#8718))
  - Special Thanks: [@&#8203;iblancasa](https://github.com/iblancasa)

### Bug Fixes

- xds: Fix a bug in `StringMatcher` where regexes would match incorrectly when `ignore_case` is set to true. ([#&#8203;8723](grpc/grpc-go#8723))
- client:
  - Change connectivity state to CONNECTING when creating the name resolver (as part of exiting IDLE).
  - Change connectivity state to TRANSIENT\_FAILURE if name resolver creation fails (as part of exiting IDLE).
  - Change connectivity state to IDLE after idle timeout expires even when current state is TRANSIENT\_FAILURE.
  - Fix a bug that resulted in `OnFinish` call option not being invoked for RPCs where stream creation failed. ([#&#8203;8710](grpc/grpc-go#8710))
- xdsclient: Fix a race in the xdsClient that could lead to resource-not-found errors. ([#&#8203;8627](grpc/grpc-go#8627))

### Performance Improvements

- mem: Round up to nearest 4KiB for pool allocations larger than 1MiB. ([#&#8203;8705](grpc/grpc-go#8705))
  - Special Thanks: [@&#8203;cjc25](https://github.com/cjc25)
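
The rounding described in the `mem` item can be sketched as follows. This is a minimal sketch, not the grpc-go implementation: `roundUpLarge`, `pageSize`, and `largeThreshold` are hypothetical names, and returning smaller sizes unchanged is an assumption made only to keep the example short.

```go
package main

import "fmt"

const (
	pageSize       = 4 << 10 // 4 KiB granularity, from the release note
	largeThreshold = 1 << 20 // 1 MiB threshold, from the release note
)

// roundUpLarge is a hypothetical helper, not the grpc-go code.
// Sizes above largeThreshold are rounded up to the next multiple of
// pageSize; smaller sizes are returned unchanged (an assumption here).
func roundUpLarge(size int) int {
	if size <= largeThreshold {
		return size
	}
	// pageSize is a power of two, so clearing the low bits after adding
	// pageSize-1 rounds up to the next multiple.
	return (size + pageSize - 1) &^ (pageSize - 1)
}

func main() {
	for _, n := range []int{512, 1 << 20, 1<<20 + 1, 1<<20 + 4096} {
		fmt.Printf("%d -> %d\n", n, roundUpLarge(n))
	}
}
```

Rounding large requests to a fixed granularity means that requests of slightly different sizes map to the same capacity, which makes a returned buffer more likely to fit a later request.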

### [`v1.77.0`](https://github.com/grpc/grpc-go/releases/tag/v1.77.0): Release 1.77.0

[Compare Source](grpc/grpc-go@v1.76.0...v1.77.0)

### API Changes

- mem: Replace the `Reader` interface with a struct for better performance and maintainability. ([#&#8203;8669](grpc/grpc-go#8669))

### Behavior Changes

- balancer/pickfirst: Remove support for the old `pick_first` LB policy via the environment variable `GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST=false`. The new `pick_first` has been the default since `v1.71.0`. ([#&#8203;8672](grpc/grpc-go#8672))

### Bug Fixes

- xdsclient: Fix a race condition in the ADS stream implementation that could result in `resource-not-found` errors, causing the gRPC client channel to move to `TransientFailure`. ([#&#8203;8605](grpc/grpc-go#8605))
- client: Ignore HTTP status header for gRPC streams. ([#&#8203;8548](grpc/grpc-go#8548))
- client: Set a read deadline when closing a transport to prevent it from blocking indefinitely on a broken connection. ([#&#8203;8534](grpc/grpc-go#8534))
  - Special Thanks: [@&#8203;jgold2-stripe](https://github.com/jgold2-stripe)
- client: Fix a bug where default port 443 was not automatically added to addresses without a specified port when sent to a proxy.
  - Setting environment variable `GRPC_EXPERIMENTAL_ENABLE_DEFAULT_PORT_FOR_PROXY_TARGET=false` disables this change; please file a bug if any problems are encountered as we will remove this option soon. ([#&#8203;8613](grpc/grpc-go#8613))
- balancer/pickfirst: Fix a bug where duplicate addresses were not being ignored as intended. ([#&#8203;8611](grpc/grpc-go#8611))
- server: Fix a bug that caused overcounting of channelz metrics for successful and failed streams. ([#&#8203;8573](grpc/grpc-go#8573))
  - Special Thanks: [@&#8203;hugehoo](https://github.com/hugehoo)
- balancer/pickfirst: When configured, shuffle addresses in resolver updates that lack endpoints. Since gRPC automatically adds endpoints to resolver updates, this bug only affects custom LB policies that delegate to `pick_first` but don't set endpoints. ([#&#8203;8610](grpc/grpc-go#8610))
- mem: Clear large buffers before re-using. ([#&#8203;8670](grpc/grpc-go#8670))

### Performance Improvements

- transport: Reduce heap allocations to reduce time spent in garbage collection. ([#&#8203;8624](grpc/grpc-go#8624), [#&#8203;8630](grpc/grpc-go#8630), [#&#8203;8639](grpc/grpc-go#8639), [#&#8203;8668](grpc/grpc-go#8668))
- transport: Avoid copies when reading and writing Data frames. ([#&#8203;8657](grpc/grpc-go#8657), [#&#8203;8667](grpc/grpc-go#8667))
- mem: Avoid clearing newly allocated buffers. ([#&#8203;8670](grpc/grpc-go#8670))

### New Features

- outlierdetection: Add metrics specified in [gRFC A91](https://github.com/grpc/proposal/blob/master/A91-outlier-detection-metrics.md). ([#&#8203;8644](grpc/grpc-go#8644))
  - Special Thanks: [@&#8203;davinci26](https://github.com/davinci26), [@&#8203;PardhuKonakanchi](https://github.com/PardhuKonakanchi)
- stats/opentelemetry: Add support for optional label `grpc.lb.backend_service` in per-call metrics. ([#&#8203;8637](grpc/grpc-go#8637))
- xds: Add support for JWT Call Credentials as specified in [gRFC A97](https://github.com/grpc/proposal/blob/master/A97-xds-jwt-call-creds.md). Set environment variable `GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS=true` to enable this feature. ([#&#8203;8536](grpc/grpc-go#8536))
  - Special Thanks: [@&#8203;dimpavloff](https://github.com/dimpavloff)
- experimental/stats: Add support for up/down counters. ([#&#8203;8581](grpc/grpc-go#8581))

### [`v1.76.0`](https://github.com/grpc/grpc-go/releases/tag/v1.76.0): Release 1.76.0

[Compare Source](grpc/grpc-go@v1.75.1...v1.76.0)

### Dependencies

- Minimum supported Go version is now 1.24 ([#&#8203;8509](grpc/grpc-go#8509))
  - Special Thanks: [@&#8203;kevinGC](https://github.com/kevinGC)

### Bug Fixes

- client: Return status `INTERNAL` when a server sends zero response messages for a unary or client-streaming RPC. ([#&#8203;8523](grpc/grpc-go#8523))
- client: Fail RPCs with status `INTERNAL` instead of `UNKNOWN` upon receiving HTTP headers with status 1xx and the `END_STREAM` flag set. ([#&#8203;8518](grpc/grpc-go#8518))
  - Special Thanks: [@&#8203;vinothkumarr227](https://github.com/vinothkumarr227)
- pick\_first: Fix race condition that could cause pick\_first to get stuck in `IDLE` state on backend address change. ([#&#8203;8615](grpc/grpc-go#8615))

### New Features

- credentials: Add `credentials/jwt` package providing file-based JWT PerRPCCredentials (A97). ([#&#8203;8431](grpc/grpc-go#8431))
  - Special Thanks: [@&#8203;dimpavloff](https://github.com/dimpavloff)

### Performance Improvements

- client: Improve HTTP/2 header size estimate to reduce re-allocations. ([#&#8203;8547](grpc/grpc-go#8547))
- encoding/proto: Avoid redundant message size calculation when marshaling. ([#&#8203;8569](grpc/grpc-go#8569))
  - Special Thanks: [@&#8203;rs-unity](https://github.com/rs-unity)

### [`v1.75.1`](https://github.com/grpc/grpc-go/releases/tag/v1.75.1): Release 1.75.1

[Compare Source](grpc/grpc-go@v1.75.0...v1.75.1)

### Bug Fixes

- transport: Fix a data race while copying headers for stats handlers in the std lib http2 server transport. ([#&#8203;8519](grpc/grpc-go#8519))
- xdsclient:
  - Fix a data race caused while reporting load to LRS. ([#&#8203;8483](grpc/grpc-go#8483))
  - Fix regression preventing empty node IDs when creating an LRS client. ([#&#8203;8483](grpc/grpc-go#8483))
- server: Fix a regression preventing streams from being cancelled or timed out when blocked on flow control. ([#&#8203;8528](grpc/grpc-go#8528))

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - ""
- Automerge
  - Between 12:00 AM and 03:59 AM (`* 0-3 * * *`)

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/12794
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.codeberg.org>

Labels

- Area: Transport. Includes HTTP/2 client/server and HTTP server handler transports and advanced transport features.
- Type: Performance. Performance improvements (CPU, network, memory, etc.).
