Feat/improve success rate of new connections#3508
Feat/improve success rate of new connections#3508cyningsun wants to merge 6 commits intoredis:ndyakov/CAE-1088-resp3-notification-handlersfrom
Conversation
…l are likely done in async flow
|
@cyningsun thank you for opening this, will check it with @htemelski-redis and write back. |
| err = ctx.Err() | ||
| return nil, err |
There was a problem hiding this comment.
Shouldn't it call p.freeTurn here as well? Just took a brief look over the PR, haven't done detailed review yet.
There was a problem hiding this comment.
After starting the go-routine, freeTurn() is delegated to this new go-routine
There was a problem hiding this comment.
Ok, understood, but I am wondering how this will be different from the current approach, since the turn will be available only when the connection is either created or fails?
There was a problem hiding this comment.
Yes, you're correct that turn remains the same: Only becomes available after a connection creation attempt concludes (either successfully or with a definitive failure).
There was a problem hiding this comment.
Ok, I think now I understand the idea.
If for whatever reason we fail after the async dialer has started, we will error out to the command, but the potential opened tcp connection will try to be stored back in the pool (if there is place). Can't we optimize this a bit more. Two things we don't know but can help us here:
- Are the other dialers executing at the moment? If so, why don't we provide the initialized connection to someone else that is currently waiting for its dialer to complete?
- Is there place in the pool? This doesn't need to be exactly correct, but if there isn't place in the pool when initially starting the dialer, we can assume there won't be when we complete it, so we can fail early and don't wait for the tcp connection to be created.
WDYT?
There was a problem hiding this comment.
- Regarding your first point (connection sharing): I had similar thoughts about implementing a connection sharing or dial coalescing mechanism to prevent duplicate work. It's a fantastic idea. However, one concern that came to my mind by the time I started this PR is about fairness and potential starvation. If we put the successful connection into a channel for all waiting dialers to consume, the lack of priority in Go channels might mean that the earliest request isn't necessarily the first to get the connection. To address this fairness concern, I believe we could implement using a queue instead of a simple channel. This would ensure that waiters are served in a first-come-first-served manner, preventing any starvation issues. I will implement this approach in the next iteration of the PR.
- Regarding your second point (early failure): In current pool design, the number of turns equals the PoolSize, and acquiring a turn is a prerequisite for starting a dialer. This means that by the time a dialer is started, we've already secured a place for the future connection in the pool. So, logically, a successfully created connection should always have a slot to be stored. I'm concerned I might be missing something here — could you please help me understand if my reasoning is correct or if there's a gap in my understanding of the current implementation?
There was a problem hiding this comment.
@cyningsun to the second point:
In this PR and in the case that is implemented here you are correct, if we have a turn, there should be a slot in the pool to keep the connection. However, if the MaxIdleConns is set the connection won't be kept in the pool.
go-redis/internal/pool/pool.go
Lines 643 to 665 in cb3af08
What we can do, to not waste this connection, is to keep it for some time without checking the
MaxIdleConns on put, but rather cleanup the idles in checkMinIdle (which should be triggered on popIdle?
Overall, if we have reached creation of a new connection it should be the case that there are no usable idles at this time.
There was a problem hiding this comment.
@ndyakov Ah, thank you for the clarification! I will work on integrating this strategy into the implementation.
|
@cyningsun sorry, this was automatically closed when I merged the PR it was based on. Feel free to open it against master. |
|
@ndyakov Got it, thanks for the update! I'll work on getting a new version prepared and open a PR against master when it's ready. |
This is a draft PR to resolve #3497, based on #3418 as requested by @ndyakov. It continues the discussion from #3502