Skip to content

runtime: make random number generation produce 64 random bits#1799

Closed
akonradi wants to merge 2 commits intoenvoyproxy:masterfrom
akonradi:rand-bugfix
Closed

runtime: make random number generation produce 64 random bits#1799
akonradi wants to merge 2 commits intoenvoyproxy:masterfrom
akonradi:rand-bugfix

Conversation

@akonradi
Copy link
Copy Markdown
Contributor

@akonradi akonradi commented Oct 3, 2017

Fixes #1798

The random number generator uses the std::ranlux48 generator, which
produces 48-bit random numbers. This breaks the hashing HTTP load
balancer, which assumes that all 64 bits are random; without this patch,
it routes all requests to one upstream host.

Signed-off-by: Alex Konradi <akonradi@google.com>
uint64_t random() override { return threadLocalGenerator()(); }
uint64_t random() override {
auto& gen = threadLocalGenerator();
return (gen() << 48) | gen();
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 Oct 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not free and I'm not hugely in favor of calling this twice every time we want a random number. Can you explain the underlying issue more?

Copy link
Copy Markdown
Contributor Author

@akonradi akonradi Oct 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hashing load balancer distributes its N hosts over 1024 buckets, then for each request, selects the nearest bucket to the hash value for the request. If no hash policy matches, the "hash" is randomly generated. A randomly-generated hash, though, has 0's as the top 16 bits. Since the "nearest" bucket is the one that has the smallest value greater than the request's has, the result is that the lowest-valued bucket is always selected, so only one host is ever routed to.

I agree that this is not free, and is probably not the best way to fix this particular degenerate case. It is, however, consistent with the declared API. I don't know whether there are other places where we rely on having 64 random bits, though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If perf is a concern what do you think of splitting/renaming? If we had a random48() and random64() it'd solve the perf problem and increase clarity

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming is an option. Not sure if it's worth doing a bulk rename through the entire code base, but could add a new random64() function or something.

Another option is to switch the underlying pseudo random generator to one that spits out 64 bits. I'm not super familiar with the algorithms but we could look into this.

Another option is to special case the ring hash LB and just swap the returned bytes (just flip each 32-bit quad). That requires knowledge of the underlying algorithm but is then localized to the actual problem site.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to select std::ranlux48 in the first place? Or was this arbitrary?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did some research and this was recommended as a good balance of randomness and speed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of what is done elsewhere, we should either rename/add or change the documentation for random() because it's misleading.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup agreed. Can you maybe look and see if there is a different algorithm in std:: that we can use to get 64-bits. Also, there might be something in boringssl. cc @PiotrSikora

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added generic RandomGeneratorImpl::random_bytes() as part work on UUID, but it was later simplified and inlined into RandomGeneratorImpl::uuid().

We might want to revisit that code (if it's not slower than ranlux48) instead of rolling our own pseudo-randomness.

There are no checks in Envoy that test for statistical randomness but
this doesn't break any of the functionality tests.

Signed-off-by: Alex Konradi <akonradi@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems reasonable to me but all of these parameters basically mean nothing to me without doing a bunch of research. Is there anyone out there that is qualified to review quickly?

@mattklein123
Copy link
Copy Markdown
Member

Not hearing anyone excited about reviewing this. Will take a look in a bit. :)

@mattklein123
Copy link
Copy Markdown
Member

@rshriram is going to inquire with some math people at IBM research.

@alyssawilk
Copy link
Copy Markdown
Contributor

alyssawilk commented Oct 5, 2017 via email

@mattklein123
Copy link
Copy Markdown
Member

@akonradi I think you meant to close this one. :)

@akonradi akonradi deleted the rand-bugfix branch October 9, 2017 17:06
mathetake pushed a commit that referenced this pull request Mar 3, 2026
**Description**
This PR adds more struct definitions and their corresponding JSONMarshal
and JSONUnmarshal methods according to the Anthropic Messages API. These
are metrics that are provided in the latest Anthropic messages API.
Currently, these metrics/metadata are not being tracked for tracing
purposes. Eventually, these metrics should be used in openinference
tracing as seen in

https://github.com/envoyproxy/ai-gateway/blob/fefb039dc230c2da481254d88b27ed034bcf45a4/internal/tracing/openinference/anthropic/messages.go#L154

It does not introduce any tracing as that should be opened in a new
issue. This means that another issue should be opened to include
tracking these metrics.

**Related Issues/PRs (if applicable)**

Fixes #1799 

**Special notes for reviewers (if applicable)**
Claude code was used to write most of the code here, but I manually
checked all of code that it generated. One thing that I noticed is that
some of the previous API schema for Anthropic implemented in this
repository were moved to beta status by Anthropic. This is something
that the maintainers should discuss.

1.
https://platform.claude.com/docs/en/api/messages#body-messages-content
2. https://platform.claude.com/docs/en/api/messages/create
3.
https://platform.claude.com/docs/en/api/beta/messages#body-messages-content
4. https://platform.claude.com/docs/en/api/beta/messages/create

---------

Signed-off-by: Chang Min <changminbark@gmail.com>
Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>
Co-authored-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants