-
Notifications
You must be signed in to change notification settings - Fork 4k
SharedResourceHolder should roughly handle exceptions during close #6002
Description
@xCASx reported a bug where gRPC would get "hung" after a point, which included the following stack trace:
Jul 22, 2019 7:37:47 PM io.grpc.internal.LogExceptionRunnable run
SEVERE: Exception while executing runnable io.grpc.internal.SharedResourceHolder$2@73cfe40e
io.grpc.netty.shaded.io.netty.channel.ChannelException: eventfd_write() failed: Bad file descriptor
at io.grpc.netty.shaded.io.netty.channel.epoll.Native.eventFdWrite(Native Method)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.wakeup(EpollEventLoop.java:167)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:603)
at io.grpc.netty.shaded.io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:163)
at io.grpc.netty.shaded.io.grpc.netty.Utils$DefaultEventLoopGroupResource.close(Utils.java:346)
at io.grpc.netty.shaded.io.grpc.netty.Utils$DefaultEventLoopGroupResource.close(Utils.java:318)
at io.grpc.internal.SharedResourceHolder$2.run(SharedResourceHolder.java:145)
at io.grpc.internal.LogExceptionRunnable.run(LogExceptionRunnable.java:43)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
If we look in SharedResourceHolder, if resource.close(instance) fails throws instances.remove(resource) is not run. While we shouldn't encourage exceptions to be thrown during close, we would like it to be able to recover eventually. In this case, any future resource fetches will get the partially-closed resource, which will immediately fail.
grpc-java/core/src/main/java/io/grpc/internal/SharedResourceHolder.java
Lines 145 to 146 in d7b9438
| resource.close(instance); | |
| instances.remove(resource); |
@xCASx, a workaround would be to keep the client objects alive for as long as possible. We generally encourage that for performance, but as long as one client object is alive we won't attempt to shut down this executor.