Switch UUIDs to UUIDv7 #4666
Do you have any information on where querying these UUIDs would be relevant and where changing them would benefit performance? As far as I can see, these UUIDs are not stored anywhere; they are only used for OTEL traces and event IDs (which needed something unique). The other uses are in tests, which just needed a sample value. I think v7 is even slower (although only very marginally):

	// Save as a _test.go file and run with `go test -bench=.`.
	// `for range b.N` needs Go 1.22 or newer.
	package main

	import (
		"testing"

		"github.com/google/uuid"
	)

	// BenchmarkUUIDv4 measures generating a random (v4) UUID string.
	func BenchmarkUUIDv4(b *testing.B) {
		b.ReportAllocs()
		for range b.N {
			uuid.NewString()
		}
	}

	// BenchmarkUUIDv7 measures generating a time-ordered (v7) UUID string.
	func BenchmarkUUIDv7(b *testing.B) {
		b.ReportAllocs()
		for range b.N {
			uuid.Must(uuid.NewV7()).String()
		}
	}
@thaJeztah Good point. The main thing I'm looking for is UUIDv7s for event IDs, because we store these in our Postgres DB after processing incoming webhooks. If you check out the links I shared above, you'll see that insertion performance with Postgres is 30% better with UUIDv7s than with UUIDv4s (i.e. more or less as fast as bigints). The only reason I replaced all occurrences with the v7 logic was for consistency. Another option would be to leave everything else as is and only use v7s for event IDs, since they're probably the only thing people store in their own backends.
Oops sorry for the delay, I missed your last comment!
Thanks, with that context, this change makes a lot more sense to me! Sorry if my earlier comment came across badly; I was mostly trying to avoid code changes for "theoretical cases". I've run into those in various projects, so context matters! Based on this, I think the change looks reasonable. Some quick digging: if my information is correct, UUIDv7 has less entropy (62-74 bits vs 122 bits). I'm not sure if that matters here, but I'm mentioning it in case it's a concern.
It's probably fine to keep the changes as-is, even if not strictly needed for all. The only concern I had was the utility package (but mostly from a perspective that this project used to have a cc @milosgajdos - in case you have thoughts
No worries at all! Totally understand.
Yeah, there's a bit less entropy because of the timestamp data, but they're still considered unique: the chance of a collision is vanishingly small. The aim with v7 was to solve the index performance issues while maintaining uniqueness. Here's a case study that outlines the real-world benefits. It's a couple of years old but a good read: https://buildkite.com/resources/blog/goodbye-integers-hello-uuids/
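For reference, the entropy figures quoted in this thread can be reproduced with simple arithmetic. The sketch below is only an illustration: the bit widths follow the UUIDv7 layout in RFC 9562, the 12-bit counter case is an assumption about how an implementation may spend those bits, and `v4RandomBits`, `v7RandomBits`, and `collisionProbability` are hypothetical helper names, not part of any library.

```go
package main

import (
	"fmt"
	"math"
)

const (
	totalBits     = 128
	versionBits   = 4  // fixed version nibble
	variantBits   = 2  // fixed variant bits
	timestampBits = 48 // UUIDv7 only: Unix time in milliseconds
	counterBits   = 12 // assumption: rand_a used as a counter or sub-ms clock
)

// v4RandomBits returns the number of random bits in a UUIDv4.
func v4RandomBits() int {
	return totalBits - versionBits - variantBits
}

// v7RandomBits returns the number of random bits in a UUIDv7.
// With useCounter set, the 12 rand_a bits are treated as non-random.
func v7RandomBits(useCounter bool) int {
	bits := totalBits - versionBits - variantBits - timestampBits
	if useCounter {
		bits -= counterBits
	}
	return bits
}

// collisionProbability approximates the birthday bound n^2 / 2^(bits+1)
// for n IDs that share the same timestamp. Only meaningful for small results.
func collisionProbability(n float64, bits int) float64 {
	return n * n / math.Pow(2, float64(bits+1))
}

func main() {
	fmt.Println("v4 random bits:", v4RandomBits())
	fmt.Println("v7 random bits:", v7RandomBits(false))
	fmt.Println("v7 random bits with counter:", v7RandomBits(true))
	// Hypothetical load: one million IDs generated in the same millisecond.
	fmt.Printf("approx. collision probability: %.2e\n",
		collisionProbability(1e6, v7RandomBits(false)))
}
```

Under these assumptions, even a million IDs created within a single millisecond collide with a probability on the order of 1e-11, which is in line with the "vanishingly small" claim above.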
milosgajdos
left a comment
I'm fine with these changes. LGTM. Thanks
PTAL @thaJeztah
Actually, @binaryfire, would you mind squashing the commits, please?
Ah, yes; LGTM after it's squashed
Signed-off-by: Raj Siva-Rajah <raj@zapzap.cloud>
Commits have been squashed.
@thaJeztah PTAL
thaJeztah
left a comment
thanks for the nudge
LGTM, thank you!

This PR switches UUIDs to UUIDv7. UUIDv7s are time-ordered, which makes them more efficient to store and query in indexed database columns.
Here are some UUIDv4 vs UUIDv7 benchmarks with Postgres:
There's no downside to switching: these are still valid UUIDs, so the change is fully backwards compatible. Being able to store things like events with UUIDv7 IDs will be beneficial to everyone.
Also, this makes it easier to change the UUID version in the future.
Closes: #4665