
Too many client threads are created even though actives is set #1932

@Jaskey

Description


Dubbo version 2.5.3

I have encountered a problem where my client sometimes OOMs because of too many threads. From the dump logs, I can see over 10,000 DubboClientHandler threads in my consumer, and they are in the blocked state.

I copied one piece of the log, like:
--------------- P R O C E S S ---------------

Java Threads: ( => current thread )
  0x00007fbe84c99800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1582" daemon [_thread_new, id=4697, stack(0x00007fb2b3274000,0x00007fb2b3375000)]
  0x00007fbe84c97800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1581" daemon [_thread_new, id=4696, stack(0x00007fb2b3375000,0x00007fb2b3476000)]
  0x00007fbe84c95800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1580" daemon [_thread_blocked, id=4695, stack(0x00007fb2b3476000,0x00007fb2b3577000)]
  0x00007fbe84c93800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1579" daemon [_thread_blocked, id=4694, stack(0x00007fb2b3577000,0x00007fb2b3678000)]
=>0x00007fbe84c91800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1578" daemon [_thread_new, id=4693, stack(0x00007fb2b3678000,0x00007fb2b3779000)]
  0x00007fbe84c8f800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1577" daemon [_thread_blocked, id=4692, stack(0x00007fb2b3779000,0x00007fb2b387a000)]
  0x00007fbe84c8d800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1576" daemon [_thread_blocked, id=4691, stack(0x00007fb2b387a000,0x00007fb2b397b000)]
  0x00007fbe84c8b800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1575" daemon [_thread_blocked, id=4690, stack(0x00007fb2b397b000,0x00007fb2b3a7c000)]
  0x00007fbe84c89800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1574" daemon [_thread_blocked, id=4689, stack(0x00007fb2b3a7c000,0x00007fb2b3b7d000)]
  0x00007fbe84c87000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1573" daemon [_thread_blocked, id=4688, stack(0x00007fb2b3b7d000,0x00007fb2b3c7e000)]
  0x00007fbe84c85000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1572" daemon [_thread_blocked, id=4687, stack(0x00007fb2b3c7e000,0x00007fb2b3d7f000)]
  0x00007fbe84c83000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1571" daemon [_thread_blocked, id=4686, stack(0x00007fb2b3d7f000,0x00007fb2b3e80000)]
  0x00007fbe84c81000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1570" daemon [_thread_blocked, id=4685, stack(0x00007fb2b3e80000,0x00007fb2b3f81000)]
  0x00007fbe84c7f000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1569" daemon [_thread_blocked, id=4684, stack(0x00007fb2b3f81000,0x00007fb2b4082000)]
  0x00007fbe84c7d000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1568" daemon [_thread_blocked, id=4683, stack(0x00007fb2b4082000,0x00007fb2b4183000)]
  0x00007fbe84c7b000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1567" daemon [_thread_blocked, id=4682, stack(0x00007fb2b4183000,0x00007fb2b4284000)]
  0x00007fbe84c79000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1566" daemon [_thread_blocked, id=4681, stack(0x00007fb2b4284000,0x00007fb2b4385000)]
  0x00007fbe84c77000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1565" daemon [_thread_blocked, id=4680, stack(0x00007fb2b4385000,0x00007fb2b4486000)]
  0x00007fbe84c75000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1564" daemon [_thread_blocked, id=4679, stack(0x00007fb2b4486000,0x00007fb2b4587000)]
  0x00007fbe84c73000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1563" daemon [_thread_blocked, id=4678, stack(0x00007fb2b4587000,0x00007fb2b4688000)]
  0x00007fbe84c71000 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1562" daemon [_thread_blocked, id=4677, stack(0x00007fb2b4688000,0x00007fb2b4789000)]
  0x00007fbe84c6e800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1561" daemon [_thread_blocked, id=4676, stack(0x00007fb2b4789000,0x00007fb2b488a000)]
  0x00007fbe84c6c800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1560" daemon [_thread_blocked, id=4675, stack(0x00007fb2b488a000,0x00007fb2b498b000)]
  0x00007fbe84c6a800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1559" daemon [_thread_blocked, id=4674, stack(0x00007fb2b498b000,0x00007fb2b4a8c000)]
  0x00007fbe84c68800 JavaThread "DubboClientHandler-10.12.156.10:9006-thread-1558" daemon [_thread_blocked, id=4673, stack(0x00007fb2b4a8c000,0x00007fb2b4b8d000)]
..........

As you can see, there are more than 1,000 threads for a single Dubbo provider instance (10.12.156.10), and most of them are in the blocked state.

But I have set actives = 20, so why are there still so many threads? Is this a bug? My consumer URL parameters are:

side = consumer
methods = addUnicastMessage,computeSlot,generateInboxIndex,getMessageList,getShardingConfig,addMultiMessage,deleteMessage,addTaskMessage
dubbo = 2.5.3
loadbalance = roundrobin
pid = 5052
check = false
interface = com.oppo.push.inbox.api.service.InboxService
actives = 20
revision = 1.0.0-20180529.054252-15
application = push-open-ndispatcher
category = consumers
timestamp = 1528851951549


Update 2018/06/20

Finally, I got a jstack dump by running jstack against my process every minute and collecting the latest dump taken before the process crashed.

Indeed, I found that tens of thousands of threads had been created, and they were all in the WAITING state, waiting on a logback lock:

"DubboClientHandler-10.12.158.6:9006-thread-1202" #36020 daemon prio=5 os_prio=0 tid=0x00007f5fd0607800 nid=0x716b waiting on condition [0x00007f5a1a454000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000004402fdca8> (a java.util.concurrent.locks.ReentrantLock$FairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:210)
	at ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:228)
	at ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:100)
	at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
	at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
	at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
	at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
	at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
	at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
	at ch.qos.logback.classic.Logger.log(Logger.java:765)
	at org.apache.log4j.Category.differentiatedLog(Category.java:193)
	at org.apache.log4j.Category.log(Category.java:297)
	at com.alibaba.dubbo.common.logger.log4j.Log4jLogger.warn(Log4jLogger.java:70)
	at com.alibaba.dubbo.common.logger.support.FailsafeLogger.warn(FailsafeLogger.java:107)
	at com.alibaba.dubbo.remoting.exchange.support.DefaultFuture.received(DefaultFuture.java:242)
	at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.handleResponse(HeaderExchangeHandler.java:96)
	at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:177)
	at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52)
	at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:82)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
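The parked frames above match many pool threads contending on a single fair ReentrantLock (logback's OutputStreamAppender guards each append with one lock per appender). A minimal sketch of that contention pattern, not Dubbo or logback code, just an illustration of how workers pile up in WAITING (parking) while one thread holds the lock:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockPileUp {
    public static void main(String[] args) throws InterruptedException {
        // Fair lock, like the ReentrantLock$FairSync in the stack trace above.
        ReentrantLock lock = new ReentrantLock(true);
        lock.lock(); // main thread holds it, simulating a slow appender write

        int workers = 50;
        CountDownLatch started = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                started.countDown();
                lock.lock();   // each worker parks here: WAITING (parking)
                lock.unlock();
            }, "worker-" + i).start();
        }
        started.await();
        Thread.sleep(200); // give the workers time to reach the park

        // All the workers are now queued behind the single lock holder.
        System.out.println(lock.hasQueuedThreads());

        lock.unlock(); // release; the queue drains one thread at a time
    }
}
```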

Looking into the code, it seems the client has a Netty handler that listens for responses from the server, and when a timed-out response is received, it logs a warning; that logging runs on a thread pool.

But the client uses a CACHED THREAD POOL, which has no limit! So when many timed-out responses arrive at the same time, the client spawns a huge number of threads, which may cause an OOM.
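The "no limit" point can be seen directly from the JDK: a cached pool's maximum size is Integer.MAX_VALUE, so it adds a new thread whenever every existing thread is busy (here, blocked on the logback lock), while a fixed pool caps the count. A small sketch of that difference:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class PoolBounds {
    public static void main(String[] args) {
        // A cached pool is effectively unbounded: if all current threads are
        // blocked (e.g. waiting on a logging lock), every new task spawns
        // another thread.
        ThreadPoolExecutor cached =
                (ThreadPoolExecutor) Executors.newCachedThreadPool();
        System.out.println(cached.getMaximumPoolSize()); // Integer.MAX_VALUE

        // A fixed pool caps the thread count; excess tasks queue instead of
        // creating threads.
        ThreadPoolExecutor fixed =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(200);
        System.out.println(fixed.getMaximumPoolSize());

        cached.shutdown();
        fixed.shutdown();
    }
}
```

This is why a burst of timed-out responses, each trying to log under one contended lock, can translate one-for-one into new threads until the process OOMs.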

My logback configuration is:


    <appender name="file_dubbo" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>/data/logs/opush-ts-dispatcher/opush_dispatcher_task_dubbo.log</file>

        <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
            <fileNamePattern>/data/logs/opush-ts-dispatcher/opush_dispatcher_task_dubbo.%i.log</fileNamePattern>
            <minIndex>1</minIndex>
            <maxIndex>10</maxIndex>
        </rollingPolicy>

        <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
            <maxFileSize>50MB</maxFileSize>
        </triggeringPolicy>
        <encoder>
            <pattern>%-20(%d{HH:mm:ss.SSS} [%thread]) %-5level %logger{80} - %msg%n</pattern>
        </encoder>
    </appender>
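One mitigation on the logging side (orthogonal to the unbounded thread pool itself) is to hand events to logback's AsyncAppender, so worker threads enqueue log events instead of blocking on the file appender's lock. A sketch, reusing the file_dubbo appender name from the config above; the appender name and queue size here are illustrative, not from the original setup:

```xml
<appender name="async_dubbo" class="ch.qos.logback.classic.AsyncAppender">
    <!-- bounded event queue; callers block only when it is full -->
    <queueSize>512</queueSize>
    <!-- 0 = never drop events; by default TRACE/DEBUG/INFO are dropped
         once remaining queue capacity falls below queueSize/5 -->
    <discardingThreshold>0</discardingThreshold>
    <appender-ref ref="file_dubbo"/>
</appender>
```

With this, only the single async worker thread contends for the appender lock, rather than every DubboClientHandler thread.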
