Description
Describe the bug
I'm not sure if this is the right place to report this issue, because it looks like a problem with Redis clients. However, the same issue is present in every client I've checked (Lettuce, Redisson, Jedis, go-redis).
After a sudden connection loss, Redis clients are unable to detect the network problem and keep listening for Pub/Sub messages on a broken TCP connection for hours, which makes Pub/Sub unusable.
To reproduce
- Start Redis on Host A.
- Subscribe to a Pub/Sub channel from Host B using one of the Redis clients.
- Block all traffic to the Redis server on Host A using iptables or another tool.
- The Redis client does not discover that the connection is lost.
- Now restart Redis on Host A and restore network traffic.
- The Redis client keeps listening on a connection that no longer exists on the server side.
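To rule out client-library behavior, the subscribe step can also be reproduced with a bare socket. The following is a minimal sketch, not taken from any of the clients above: `encode_command` and `subscribe_and_block` are hypothetical helper names, and the host, port and channel are placeholders. It sends a RESP-encoded SUBSCRIBE and then blocks in `recv` with no timeout and no keepalive, which is the situation in which a silently dropped connection goes unnoticed.

```python
import socket


def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode() if isinstance(part, str) else part
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def subscribe_and_block(host, port, channel):
    """Subscribe to one channel and print raw replies until the peer closes.

    Hypothetical helper: no read timeout and no keepalive are set, so if the
    network path is cut without a FIN/RST, recv() simply never returns.
    """
    sock = socket.create_connection((host, port))
    sock.sendall(encode_command("SUBSCRIBE", channel))
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # orderly close by the server
            break
        print(chunk)
    sock.close()
```

Calling `subscribe_and_block("host-a", 6379, "news")` from Host B (placeholder values) and then blocking traffic on Host A should leave the call waiting in `recv`, even after Redis is restarted.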
I've managed to reproduce this behavior using three different Java clients and go-redis. Ticket for Lettuce with more details: redis/lettuce#1428
Expected behavior
Redis clients subscribed to Pub/Sub channels should be able to detect a broken network connection and reconnect when necessary.
Additional information
An undocumented workaround for this issue is to tune the TCP keepalive parameters on the client's host: SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT.
This is similar to what the redis-cli client does at the application layer:
Line 908 in 1c71038:
anetKeepAlive(NULL, context->fd, REDIS_CLI_KEEPALIVE_INTERVAL);
Line 95 in efb6495:
int anetKeepAlive(char *err, int fd, int interval)
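For comparison, here is a rough sketch of what setting these options per socket could look like in a client, without touching system-wide settings. It is written in Python for brevity and is not the redis-cli implementation: `enable_keepalive` is a hypothetical helper, and the probe interval and count are illustrative assumptions. The `TCP_KEEP*` constants are platform-specific, so the sketch skips any that the platform does not expose.

```python
import socket


def enable_keepalive(sock, interval=15):
    """Turn on TCP keepalive probes for a single socket (hypothetical helper).

    interval: seconds of idle time before the first probe (assumed value).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below exist on Linux; other platforms may lack some of them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, interval)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Assumed choice: probe at one third of the idle interval, at least 1 s.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, max(1, interval // 3))
    if hasattr(socket, "TCP_KEEPCNT"):
        # Assumed choice: give up after 3 unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)
```

With these illustrative values on Linux, a dead peer would be reported after roughly 15 s of idle time plus three probes 5 s apart, at which point a blocked read fails and the client can reconnect.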
Is there any other way to make Pub/Sub subscriptions reliable without changing OS parameters?
Shouldn't all Redis clients set these socket parameters at the application layer, as redis-cli does?
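One possible application-level alternative, offered only as a sketch and not as something any of the clients above is known to implement this way, is a heartbeat on the subscription connection: send PING periodically (Redis accepts PING on a connection in subscribed state) and treat the connection as dead if nothing at all has been received for some time. `SubscriptionWatchdog`, its method names and its thresholds are hypothetical; the class only decides what to do, and the caller is assumed to perform the actual I/O.

```python
import time


class SubscriptionWatchdog:
    """Decide when to ping and when to give up on a Pub/Sub connection."""

    def __init__(self, ping_interval=10.0, dead_after=30.0, clock=time.monotonic):
        self.ping_interval = ping_interval  # seconds between PINGs (assumed)
        self.dead_after = dead_after        # silence that means "broken" (assumed)
        self.clock = clock
        now = clock()
        self.last_received = now
        self.last_ping = now

    def on_data(self):
        """Call whenever any bytes arrive (messages or PING replies)."""
        self.last_received = self.clock()

    def next_action(self):
        """Return 'reconnect', 'ping' or 'wait' for the caller to act on."""
        now = self.clock()
        if now - self.last_received >= self.dead_after:
            return "reconnect"
        if now - self.last_ping >= self.ping_interval:
            self.last_ping = now
            return "ping"
        return "wait"
```

A client loop would call `next_action()` on a timer, write a PING when it returns `"ping"`, and tear down and re-subscribe when it returns `"reconnect"`. Unlike TCP keepalive, this needs no socket options, at the cost of extra traffic on every subscription.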