Skip to content

Server-side CPU Affinity problem caused by driver #645

@greg-pendlebury

Description

@greg-pendlebury

Driver version or jar name

6.1.5 or greater. We have only tested up to 6.2.2.1 though

SQL Server version

SQL Server 2012

Client operating system

Windows Server 2012 R2

Java/JVM version

1.8.0_u152 Hotspot

Problem description

It looks like #160 (introduced in 6.1.5 onwards) introduces some highly undesirable behaviour regarding CPU affinity on servers with two NUMA nodes. That change introduces two connections being opened in sequence and one of them is then closed. The server side of the connection is pinned to a NUMA node in a round robin fashion, but because every second connection is closed it results in a horrible resource allocation imbalance on the remaining connections.

Our Java application (JBoss EAP 7) opens 300 connections at maximum and maintains them under heavy use in a connection pool. We have observed all connections allocated to a single node, or nearly all... sometimes a random connection when the pool is scaling up (like a DBA trying to work out what the hell is going on) will alter the pattern slightly and all new connections then flip to the other node.

The server is then essentially restrained to operate at half capacity because half of the CPUs are idle.

We have reproduced in JMeter as well, using different driver versions and 'multiSubnetFailover=true'. This problem does not occur on versions 6.1.4 and below. Our assumption is #160 is the cause, but perhaps that is incorrect.

Expected behavior and actual behavior

The connections should have been distributed evenly across NUMA nodes... they were instead mostly on a single node. Perfmon shows us that when people are not actively investigating by establishing their own connections it will perfectly pin all 300 connections to a single NUMA node.

Repro code

  1. Use 'multiSubnetFailover=true' in your JDBC connection string.
  2. Connect to a server with two NUMA nodes.
  3. Maintain all open connections for the duration of the test.
  4. Run some needlessly expensive queries to exercise the CPUs.
  5. Observe half of the CPUs on the machine saturate and half remain idle.
  6. DBAs monitor affinity on the server like so:
--Displays connections per NUMA node for 'Microsoft JDBC Driver 6.1' client interfaces
select c.node_affinity, s.client_interface_name,  count(*) as connections
from sys.dm_exec_connections c
join sys.dm_exec_sessions s on s.session_id = c.session_id
where s.client_interface_name = 'Microsoft JDBC Driver 6.1'
group by node_affinity,  s.client_interface_name 

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions