Extremely many client threads are created though actives is set #1932
Description
Dubbo version: 2.5.3
I have encountered a problem where my client sometimes runs out of memory (OOM) because too many threads are created. From the crash dump logs, I can see over 10,000 DubboClientHandler threads in my consumer, and they are in a blocked state.
Here is one piece of the log:
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
0x00007fbe84c99800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1582" daemon [_thread_new, id=4697, stack(0x00007fb2b3274000,0x00007fb2b3375000)]
0x00007fbe84c97800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1581" daemon [_thread_new, id=4696, stack(0x00007fb2b3375000,0x00007fb2b3476000)]
0x00007fbe84c95800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1580" daemon [_thread_blocked, id=4695, stack(0x00007fb2b3476000,0x00007fb2b3577000)]
0x00007fbe84c93800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1579" daemon [_thread_blocked, id=4694, stack(0x00007fb2b3577000,0x00007fb2b3678000)]
=>0x00007fbe84c91800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1578" daemon [_thread_new, id=4693, stack(0x00007fb2b3678000,0x00007fb2b3779000)]
0x00007fbe84c8f800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1577" daemon [_thread_blocked, id=4692, stack(0x00007fb2b3779000,0x00007fb2b387a000)]
0x00007fbe84c8d800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1576" daemon [_thread_blocked, id=4691, stack(0x00007fb2b387a000,0x00007fb2b397b000)]
0x00007fbe84c8b800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1575" daemon [_thread_blocked, id=4690, stack(0x00007fb2b397b000,0x00007fb2b3a7c000)]
0x00007fbe84c89800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1574" daemon [_thread_blocked, id=4689, stack(0x00007fb2b3a7c000,0x00007fb2b3b7d000)]
0x00007fbe84c87000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1573" daemon [_thread_blocked, id=4688, stack(0x00007fb2b3b7d000,0x00007fb2b3c7e000)]
0x00007fbe84c85000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1572" daemon [_thread_blocked, id=4687, stack(0x00007fb2b3c7e000,0x00007fb2b3d7f000)]
0x00007fbe84c83000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1571" daemon [_thread_blocked, id=4686, stack(0x00007fb2b3d7f000,0x00007fb2b3e80000)]
0x00007fbe84c81000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1570" daemon [_thread_blocked, id=4685, stack(0x00007fb2b3e80000,0x00007fb2b3f81000)]
0x00007fbe84c7f000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1569" daemon [_thread_blocked, id=4684, stack(0x00007fb2b3f81000,0x00007fb2b4082000)]
0x00007fbe84c7d000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1568" daemon [_thread_blocked, id=4683, stack(0x00007fb2b4082000,0x00007fb2b4183000)]
0x00007fbe84c7b000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1567" daemon [_thread_blocked, id=4682, stack(0x00007fb2b4183000,0x00007fb2b4284000)]
0x00007fbe84c79000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1566" daemon [_thread_blocked, id=4681, stack(0x00007fb2b4284000,0x00007fb2b4385000)]
0x00007fbe84c77000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1565" daemon [_thread_blocked, id=4680, stack(0x00007fb2b4385000,0x00007fb2b4486000)]
0x00007fbe84c75000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1564" daemon [_thread_blocked, id=4679, stack(0x00007fb2b4486000,0x00007fb2b4587000)]
0x00007fbe84c73000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1563" daemon [_thread_blocked, id=4678, stack(0x00007fb2b4587000,0x00007fb2b4688000)]
0x00007fbe84c71000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1562" daemon [_thread_blocked, id=4677, stack(0x00007fb2b4688000,0x00007fb2b4789000)]
0x00007fbe84c6e800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1561" daemon [_thread_blocked, id=4676, stack(0x00007fb2b4789000,0x00007fb2b488a000)]
0x00007fbe84c6c800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1560" daemon [_thread_blocked, id=4675, stack(0x00007fb2b488a000,0x00007fb2b498b000)]
0x00007fbe84c6a800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1559" daemon [_thread_blocked, id=4674, stack(0x00007fb2b498b000,0x00007fb2b4a8c000)]
0x00007fbe84c68800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1558" daemon [_thread_blocked, id=4673, stack(0x00007fb2b4a8c000,0x00007fb2b4b8d000)]
..........
As you can see, there are more than 1,000 threads for a single Dubbo provider instance (10.12.156.10), and most of them are in a blocked state.
But I have set actives = 20, so why are there still so many threads? Is this a bug? These are the consumer-side parameters:
side = consumer
methods = addUnicastMessage,computeSlot,generateInboxIndex,getMessageList,getShardingConfig,addMultiMessage,deleteMessage,addTaskMessage
dubbo = 2.5.3
loadbalance = roundrobin
pid = 5052
check = false
interface = com.oppo.push.inbox.api.service.InboxService
actives = 20
revision = 1.0.0-20180529.054252-15
application = push-open-ndispatcher
category = consumers
timestamp = 1528851951549
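For what it's worth, my understanding (an assumption, not verified against the Dubbo source) is that actives = 20 only limits the number of concurrent in-flight requests per method; it says nothing about the size of the consumer-side pool that handles responses. In particular, a timed-out request frees its "actives" slot even though its late response still arrives and still needs a handler thread. A minimal sketch of that distinction, with hypothetical names (ActivesVsThreads, grownPoolSize):

```java
import java.util.concurrent.*;

// Hypothetical sketch: treat "actives" as a 20-permit semaphore around request
// submission. Timed-out requests release their permit before the response
// arrives, but each late response is still dispatched to an unbounded cached
// pool, where it blocks (e.g. on a logging lock) and so pins its own thread.
public class ActivesVsThreads {
    static final CountDownLatch BLOCK = new CountDownLatch(1);

    static int grownPoolSize() throws InterruptedException {
        Semaphore actives = new Semaphore(20); // caps in-flight requests only
        ThreadPoolExecutor handlers =
                (ThreadPoolExecutor) Executors.newCachedThreadPool(); // no thread cap
        for (int i = 0; i < 200; i++) {
            actives.acquire();
            actives.release(); // permit freed on timeout, before the response arrives
            handlers.submit(() -> {            // late response handling blocks forever
                try { BLOCK.await(); } catch (InterruptedException e) { }
            });
        }
        int size = handlers.getPoolSize();     // one thread per blocked task
        BLOCK.countDown();
        handlers.shutdown();
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(grownPoolSize());   // prints 200, far beyond actives=20
    }
}
```

Even though only 20 requests are ever in flight, the handler pool grows to one thread per blocked task, which matches the thread counts in my dump.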
Update 2018/06/20
Finally, I got a jstack dump by running jstack against my process every minute and collecting the latest one before the process crashed.
Indeed, I found that tens of thousands of threads had been created, all in the WAITING state, waiting on a logback lock.
"DubboClientHandler-10.12.158.6:9006-thread-1202" #36020 daemon prio=5 os_prio=0 tid=0x00007f5fd0607800 nid=0x716b waiting on condition [0x00007f5a1a454000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000004402fdca8> (a java.util.concurrent.locks.ReentrantLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:210)
at ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:228)
at ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:100)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
at ch.qos.logback.classic.Logger.log(Logger.java:765)
at org.apache.log4j.Category.differentiatedLog(Category.java:193)
at org.apache.log4j.Category.log(Category.java:297)
at com.alibaba.dubbo.common.logger.log4j.Log4jLogger.warn(Log4jLogger.java:70)
at com.alibaba.dubbo.common.logger.support.FailsafeLogger.warn(FailsafeLogger.java:107)
at com.alibaba.dubbo.remoting.exchange.support.DefaultFuture.received(DefaultFuture.java:242)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.handleResponse(HeaderExchangeHandler.java:96)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:177)
at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52)
at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Looking into the code, it seems the client has a Netty handler that listens for responses from the server; when a timed-out response is received, it logs a warning message, and that logging runs in a thread pool.
But the client-side pool is a CACHED THREAD POOL with no upper limit! So when a lot of timed-out responses arrive at the same time, the client ends up with a huge number of threads, which can cause an OOM.
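If that diagnosis is right, one mitigation would be to bound the handler pool so that bursts of late responses queue up instead of each spawning a thread. Dubbo ships alternative pool implementations selectable via the threadpool URL parameter (e.g. threadpool=fixed with threads=...), though I am not sure how well that is honored on the consumer side in 2.5.3. A minimal sketch of the bounded-pool behavior (hypothetical class name BoundedHandlerPool):

```java
import java.util.concurrent.*;

// Sketch: with a fixed-size pool, 1,000 blocked tasks occupy only 20 threads;
// the remaining 980 wait in the queue instead of spawning new threads.
public class BoundedHandlerPool {
    static final CountDownLatch BLOCK = new CountDownLatch(1);

    static int[] poolAndQueueSize() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                20, 20, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());  // bounded threads, unbounded queue
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> {                // simulate a handler stuck on a lock
                try { BLOCK.await(); } catch (InterruptedException e) { }
            });
        }
        int[] sizes = { pool.getPoolSize(), pool.getQueue().size() };
        BLOCK.countDown();
        pool.shutdown();
        return sizes;
    }

    public static void main(String[] args) {
        int[] s = poolAndQueueSize();
        System.out.println(s[0] + " threads, " + s[1] + " queued"); // 20 threads, 980 queued
    }
}
```

The trade-off is that a bounded pool can delay response handling under load, but it cannot exhaust memory with threads the way the cached pool does.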
My logback configuration is:
<appender name="file_dubbo" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/data/logs/opush-ts-dispatcher/opush_dispatcher_task_dubbo.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
        <fileNamePattern>/data/logs/opush-ts-dispatcher/opush_dispatcher_task_dubbo.%i.log</fileNamePattern>
        <minIndex>1</minIndex>
        <maxIndex>10</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
        <maxFileSize>50MB</maxFileSize>
    </triggeringPolicy>
    <encoder>
        <pattern>%-20(%d{HH:mm:ss.SSS} [%thread]) %-5level %logger{80} - %msg%n</pattern>
    </encoder>
</appender>
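A related mitigation on the logging side (a sketch, assuming logback's standard AsyncAppender is available in the version I am running): wrapping the rolling file appender in an async appender moves the file I/O and its lock onto a single background thread, so DubboClientHandler threads only enqueue events instead of contending for OutputStreamAppender's ReentrantLock, which is exactly where my threads are parked in the stack trace above.

```xml
<appender name="async_file_dubbo" class="ch.qos.logback.classic.AsyncAppender">
    <!-- events queue here; caller threads no longer take the file lock -->
    <queueSize>8192</queueSize>
    <!-- 0 = do not discard events even when the queue is 80% full -->
    <discardingThreshold>0</discardingThreshold>
    <!-- drop events rather than block callers when the queue is full -->
    <neverBlock>true</neverBlock>
    <appender-ref ref="file_dubbo"/>
</appender>
```

Loggers would then reference async_file_dubbo instead of file_dubbo. This does not fix the unbounded thread pool itself, but it removes the lock contention that keeps the handler threads blocked long enough to pile up.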