Skip to content

perf(ast): optimize Set.Intersect and Set.Diff#8167

Merged
sspaink merged 2 commits intoopen-policy-agent:mainfrom
thevilledev:perf/optimise-set-diff-intersect
Jan 6, 2026
Merged

perf(ast): optimize Set.Intersect and Set.Diff#8167
sspaink merged 2 commits intoopen-policy-agent:mainfrom
thevilledev:perf/optimise-set-diff-intersect

Conversation

@thevilledev
Copy link
Copy Markdown
Contributor

@thevilledev thevilledev commented Jan 1, 2026

Why the changes in this PR are needed?

Currently the Set.Intersect and Set.Diff methods have two inefficiencies:

  1. Operations do not require ordered iteration, yet the keys are sorted.
  2. Terms are collected to an intermediate slice, causing a re-hash and re-insert once the set is generated.

This PR optimises both, showing 10-20% less memory usage in these operations.

What are the changes in this PR?

Build result sets directly using newset() and insert() instead of collecting terms into an intermediate slice and eventually passing to NewSet(). This eliminates:

  • an intermediate slice allocation
  • redundant hash computations when NewSet() re-inserts all terms
  • unnecessary sorting via sortedKeys() since membership checks don't require order

Notes to assist PR review:

When users access the result via Iter(), String(), Foreach(), etc., these methods call sortedKeys() which sorts in-place on first access. The old code sorted during construction and on access.

Benchmarked with:

go test -bench="BenchmarkSetIntersection" -benchmem -run=^$ -count=10 ./v1/ast/...

Benchstat output:

cpu: Apple M1 Pro
                                    │ master.out  │               fix.out               │
                                    │   sec/op    │    sec/op     vs base               │
SetIntersection/5-8                   341.7n ± 2%   347.6n ±  6%       ~ (p=0.645 n=10)
SetIntersection/50-8                  2.731µ ± 3%   2.662µ ± 10%       ~ (p=0.353 n=10)
SetIntersection/500-8                 27.30µ ± 2%   26.42µ ±  3%  -3.21% (p=0.011 n=10)
SetIntersection/5000-8                433.0µ ± 8%   422.6µ ±  1%  -2.40% (p=0.007 n=10)
SetIntersectionDifferentSize/4-8      279.7n ± 1%   289.6n ± 23%  +3.54% (p=0.024 n=10)
SetIntersectionDifferentSize/50-8     327.0n ± 7%   313.5n ± 12%       ~ (p=0.063 n=10)
SetIntersectionDifferentSize/500-8    326.7n ± 1%   312.6n ±  8%  -4.32% (p=0.022 n=10)
SetIntersectionDifferentSize/5000-8   352.2n ± 1%   333.5n ±  1%  -5.32% (p=0.000 n=10)
geomean                               1.812µ        1.773µ        -2.11%
                                    │  master.out  │                fix.out                 │
                                    │     B/op     │     B/op      vs base                  │
SetIntersection/5-8                     352.0 ± 0%     304.0 ± 0%  -13.64% (p=0.000 n=10)
SetIntersection/50-8                  2.086Ki ± 0%   1.680Ki ± 0%  -19.48% (p=0.000 n=10)
SetIntersection/500-8                 26.15Ki ± 0%   22.15Ki ± 0%  -15.30% (p=0.000 n=10)
SetIntersection/5000-8                224.4Ki ± 0%   184.4Ki ± 0%  -17.82% (p=0.000 n=10)
SetIntersectionDifferentSize/4-8        288.0 ± 0%     288.0 ± 0%        ~ (p=1.000 n=10) ¹
SetIntersectionDifferentSize/50-8       336.0 ± 0%     304.0 ± 0%   -9.52% (p=0.000 n=10)
SetIntersectionDifferentSize/500-8      336.0 ± 0%     304.0 ± 0%   -9.52% (p=0.000 n=10)
SetIntersectionDifferentSize/5000-8     336.0 ± 0%     304.0 ± 0%   -9.52% (p=0.000 n=10)
geomean                               1.595Ki        1.403Ki       -12.03%
¹ all samples are equal
                                    │ master.out │               fix.out                │
                                    │ allocs/op  │ allocs/op   vs base                  │
SetIntersection/5-8                   5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
SetIntersection/50-8                  7.000 ± 0%   6.000 ± 0%  -14.29% (p=0.000 n=10)
SetIntersection/500-8                 7.000 ± 0%   6.000 ± 0%  -14.29% (p=0.000 n=10)
SetIntersection/5000-8                21.00 ± 0%   20.00 ± 0%   -4.76% (p=0.000 n=10)
SetIntersectionDifferentSize/4-8      4.000 ± 0%   4.000 ± 0%        ~ (p=1.000 n=10) ¹
SetIntersectionDifferentSize/50-8     5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
SetIntersectionDifferentSize/500-8    5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
SetIntersectionDifferentSize/5000-8   5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
geomean                               6.328        5.413       -14.46%
¹ all samples are equal

Further comments:

The Union() method has a similar issue, but I'll introduce a proposal for it in a separate PR.

Build result sets directly using newset() and insert() instead of
collecting terms into an intermediate slice and eventually passing
to NewSet(). This eliminates:

- an intermediate slice allocation
- redundant hash computations when NewSet() re-inserts all terms
- unnecessary sorting via sortedKeys() since membership checks
  don't require order

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
@netlify
Copy link
Copy Markdown

netlify bot commented Jan 1, 2026

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit ec77880
🔍 Latest deploy log https://app.netlify.com/projects/openpolicyagent/deploys/695d4c82681ca50008cea0e0
😎 Deploy Preview https://deploy-preview-8167--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@thevilledev thevilledev marked this pull request as ready for review January 1, 2026 11:53
Copy link
Copy Markdown
Member

@anderseknert anderseknert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks, Ville 👍

@sspaink sspaink merged commit 3d49f45 into open-policy-agent:main Jan 6, 2026
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants