Skip to content

DubboProtocol.getSharedClient 方法并发环境下 可能出现 NPE 问题 #6444

@lxk-kk

Description

@lxk-kk
  • I have searched the issues of this repository and believe that this is not a duplicate.
  • I have checked the FAQ of this repository and believe that this is not a duplicate.

Environment

  • Dubbo version: 2.7.7 及其之前的版本

Steps to reproduce this issue

  1. 贴出源码
private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private List<ReferenceCountExchangeClient> getSharedClient(URL url, int connectNum) {
        String key = url.getAddress();
        // 略去无关代码

        // 【1】
        locks.putIfAbsent(key, new Object());

       // 【2】
        synchronized (locks.get(key)) {
            clients = referenceClientMap.get(key);
            if (checkClientCanUse(clients)) {
                batchClientRefIncr(clients);
                return clients;
            }
            connectNum = Math.max(connectNum, 1);
            if (CollectionUtils.isEmpty(clients)) {
                clients = buildReferenceCountExchangeClientList(url, connectNum);
                referenceClientMap.put(key, clients);
            } else {
                for (int i = 0; i < clients.size(); i++) {
                    ReferenceCountExchangeClient referenceCountExchangeClient = clients.get(i);
                    if (referenceCountExchangeClient == null || referenceCountExchangeClient.isClosed()) {
                        clients.set(i, buildReferenceCountExchangeClient(url));
                        continue;
                    }
                    referenceCountExchangeClient.incrementAndGetCount();
                }
            }

            // 【3】
            locks.remove(key);
            return clients;
        }
    }

  1. 问题描述:
  • 注意上述 【1】、【2】、【3】处代码
  • 假设这种情况:并发环境下,线程 A、B 持有相同的 key,执行到该方法时。如果 A 执行到 同步块中,并在 locks.remove(key) 真正 remove 之前,线程 B 刚好执行到 locks.putIfAbsent(key, new Object())
  • 当 B 执行完该语句后,在 B 执行 locks.get(key) 之前,正好 A remove 完成。
  • 此时,B 执行到 locks.get(key) 将拿到 null,此时 抛出 NPE
  1. 问题提出:
  • 查看过历史 issue 后,使用 ConcurrentHashMap 解决了历史版本中 key.intern() 潜在性的死锁问题,但是随之也可能会有上述的 NPE 问题,所以这是被允许的吗?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions