replication: add remote peer connection timeout #8380

Merged
locker merged 1 commit into tarantool:master from sergepetrenko:gh-7294-long-replica-connect-fix on Mar 2, 2023
@sergepetrenko
Collaborator

We use coio_connect() to connect the replica to a remote peer. It has no timeout: it issues a non-blocking connect() to the peer and then waits indefinitely for the socket to become writable.

When the remote peer changes its IP address, connect() might keep trying the old address for about 2 minutes (given the default tcp_syn_retries value of 6).
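For reference, the ~2 minute figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes the Linux behaviour of a 1 second initial SYN retransmission timeout that doubles after every unanswered SYN. The helper name `syn_give_up_seconds` is made up for illustration and is not part of Tarantool.

```c
#include <assert.h>

/*
 * Approximate time (in seconds) before connect() gives up when no
 * SYN is ever answered.
 * Assumption: the initial retransmission timeout is 1 second and it
 * doubles after each unanswered SYN (typical for Linux).
 */
static unsigned
syn_give_up_seconds(unsigned syn_retries)
{
	/* Keep the running sum within a 32-bit unsigned. */
	assert(syn_retries < 31);
	unsigned total = 0;
	unsigned rto = 1; /* assumed initial timeout, seconds */
	/* One wait after the initial SYN plus one after each retry. */
	for (unsigned i = 0; i <= syn_retries; i++) {
		total += rto;
		rto *= 2;
	}
	return total;
}
```

With tcp_syn_retries = 6, `syn_give_up_seconds(6)` adds up 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 seconds, which is where the "~ 2 minutes" comes from.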

This blocks the replica from reconnecting to the updated address, which is pretty inconvenient.

Let's use coio_connect_timeout() instead, with replication_disconnect_timeout() as the timeout, like everywhere else in master-replica communication.

Closes #7294

NO_DOC=bugfix

@coveralls

Coverage Status

Coverage: 85.558% (+0.02%) from 85.541% when pulling cd29e98 on sergepetrenko:gh-7294-long-replica-connect-fix into a6fae42 on tarantool:master.

@locker locker removed their assignment Mar 1, 2023
Collaborator

@Gerold103 Gerold103 left a comment

Thanks for the patch!

@locker locker assigned locker and unassigned Gerold103 Mar 2, 2023
@locker locker added the full-ci Enables all tests for a pull request label Mar 2, 2023
@locker locker merged commit 0486a48 into tarantool:master Mar 2, 2023
@locker
Member

locker commented Mar 2, 2023

Cherry-picked to 2.11.

@sergepetrenko sergepetrenko deleted the gh-7294-long-replica-connect-fix branch March 6, 2023 10:26
mkostoevr added a commit to mkostoevr/tarantool that referenced this pull request Oct 27, 2025
It was originally used by the applier, but since commit 0486a48
("replication: add remote peer connection timeout") the function is
only used in the coio test. Let's drop it completely and rename
`coio_connect_timeout` to `coio_connect`.

Follows-up tarantool#8380

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
mkostoevr added a commit to mkostoevr/tarantool that referenced this pull request Nov 11, 2025
It was originally used by the applier, but since commit 0486a48
("replication: add remote peer connection timeout") the function is
only used in the coio test. Let's drop it completely and rename
`coio_connect_timeout` to `coio_connect`.

Follows-up tarantool#8380

NO_DOC=refactoring
NO_CHANGELOG=refactoring
locker pushed a commit that referenced this pull request Nov 18, 2025
It was originally used by the applier, but since commit 0486a48
("replication: add remote peer connection timeout") the function is
only used in the coio test. Let's drop it completely and rename
`coio_connect_timeout` to `coio_connect`.

Follows-up #8380

NO_DOC=refactoring
NO_CHANGELOG=refactoring
Labels

full-ci Enables all tests for a pull request

Development

Successfully merging this pull request may close these issues.

Long reconnection to the replica that changed the IP address after reboot
