Retry follow task when remote connection queue full #55314
dnhatn merged 4 commits into elastic:master
Pinging @elastic/es-distributed (:Distributed/CCR)
// this setting is intentionally not registered, it is only used in tests
public static final Setting<Integer> REMOTE_MAX_CONNECTION_QUEUE_SIZE =
    Setting.intSetting("cluster.remote.max_connection_queue_size", 100, Setting.Property.NodeScope);
I don't think there was a lot of thought to the connection listener limit. If there is a strong reason to increase it past 100 we could probably do that. Also, does this name make sense? We only allow a single connection round at a time. Should the name be cluster.remote.max_pending_connection_listeners?
> Should the name be cluster.remote.max_pending_connection_listeners?
++. I renamed it in f9c807f.
> I don't think there was a lot of thought to the connection listener limit. If there is a strong reason to increase it past 100 we could probably do that.
Yeah, I think we chose this value quite arbitrarily. I think it's fine to increase this value as we should not have many concurrent remote searches, and CCR will retry on this error anyway. I've increased this to 1000. WDYT?
@tbrooks8 Thanks for reviewing.
If more than 100 shard-follow tasks are trying to connect to the remote cluster, then some of them will abort with "connect listener queue is full". This is because we retry on ESRejectedExecutionException, but not on RejectedExecutionException.
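The failure mode described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Elasticsearch implementation: `FollowRetrySketch`, `PendingListeners`, `shouldRetry`, and `runWithRetry` are hypothetical names, and the only real API used is `java.util.concurrent.RejectedExecutionException`. The sketch shows a bounded listener queue that rejects new entries once it is full, and a caller that treats that rejection as retryable instead of aborting.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical sketch; none of these names come from the Elasticsearch code base.
final class FollowRetrySketch {

    /** Bounded queue of listeners waiting for a connection round to finish. */
    static final class PendingListeners {
        private final int maxPending;
        private final Queue<Runnable> queue = new ArrayDeque<>();

        PendingListeners(int maxPending) {
            this.maxPending = maxPending;
        }

        /** Rejects the listener when the queue has reached its limit. */
        void add(Runnable listener) {
            if (queue.size() >= maxPending) {
                throw new RejectedExecutionException("connect listener queue is full");
            }
            queue.add(listener);
        }

        /** Simulates a connection round completing: runs and removes every queued listener. */
        void drain() {
            Runnable listener;
            while ((listener = queue.poll()) != null) {
                listener.run();
            }
        }
    }

    /** A full listener queue is a transient condition, so it is treated as retryable. */
    static boolean shouldRetry(Throwable e) {
        return e instanceof RejectedExecutionException;
    }

    /**
     * Runs the task, retrying on retryable failures up to maxAttempts times.
     * betweenAttempts stands in for whatever wait or backoff a real caller would use.
     * Returns the number of attempts that were needed.
     */
    static int runWithRetry(Runnable task, int maxAttempts, Runnable betweenAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                task.run();
                return attempt;
            } catch (RuntimeException e) {
                if (!shouldRetry(e) || attempt >= maxAttempts) {
                    throw e; // non-retryable failure, or retries exhausted
                }
                betweenAttempts.run();
            }
        }
    }

    public static void main(String[] args) {
        PendingListeners pending = new PendingListeners(2);
        pending.add(() -> {});
        pending.add(() -> {}); // the queue is now full

        // The first attempt is rejected; draining frees space, so the second succeeds.
        int attempts = runWithRetry(() -> pending.add(() -> {}), 3, pending::drain);
        System.out.println("attempts needed: " + attempts);
    }
}
```

In this sketch, a caller whose retry check did not recognize `RejectedExecutionException` would rethrow on the first attempt, which mirrors the abort described in the PR description.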
Backport of #55314