Do not mandate direct bytes in SslHandler queue#9656
Conversation
Motivation: Currently when the SslHandler coalesces outbound bytes it always allocates a direct byte buffer. This does not make sense if the JDK engine is being used as the bytes will have to be copied back to heap bytes for the engine to operate on them. Modifications: Inspect engine type when coalescing outbound bytes and allocate heap buffer if heap bytes are preferred by the engine. Result: Improved performance for JDK engine. Better performance in environments without direct buffer pooling.
|
Can one of the admins verify this patch? |
|
@netty-bot test this please |
|
@tbrooks8 out of interest, do you have any benchmark results you can share? |
|
@johnou unfortunately not. The reason this came to my attention was not because of an obvious performance regression, but because I was seeing direct buffers being allocated in an environment without direct buffer pooling enabled. Netty normal does its best to ensure pooled direct buffers are enabled before using them. As I investigated more I saw this was only happening in the SslHandler. The reasoning behind my decision to always use a heap buffer for the JDK engine, is that if you pass the JDK engine a direct buffer it will internally allocate a byte array and copy the data to that for |
|
@tbrooks8 great catch! |
Motivation: Currently when the SslHandler coalesces outbound bytes it always allocates a direct byte buffer. This does not make sense if the JDK engine is being used as the bytes will have to be copied back to heap bytes for the engine to operate on them. Modifications: Inspect engine type when coalescing outbound bytes and allocate heap buffer if heap bytes are preferred by the engine. Result: Improved performance for JDK engine. Better performance in environments without direct buffer pooling.
|
@tbrooks8 btw are you sure the copy to heap byte array is still a thing in java11 and later ? |
|
@normanmaurer Obviously it is pretty complicated to read through the JDK code with all the indirection. But here is what I currently see (JDK 12): The SSLEngine will eventually call to public final int doFinal(ByteBuffer input, ByteBuffer output)
throws ShortBufferException, IllegalBlockSizeException,
BadPaddingException {
checkCipherState();
if ((input == null) || (output == null)) {
throw new IllegalArgumentException("Buffers must not be null");
}
if (input == output) {
throw new IllegalArgumentException("Input and output buffers must "
+ "not be the same object, consider using buffer.duplicate()");
}
if (output.isReadOnly()) {
throw new ReadOnlyBufferException();
}
chooseFirstProvider();
return spi.engineDoFinal(input, output);
}This calls The protected int engineDoFinal(ByteBuffer input, ByteBuffer output)
throws ShortBufferException, IllegalBlockSizeException,
BadPaddingException {
return bufferCrypt(input, output, false);
}
/**
* Implementation for encryption using ByteBuffers. Used for both
* engineUpdate() and engineDoFinal().
*/
private int bufferCrypt(ByteBuffer input, ByteBuffer output,
boolean isUpdate) throws ShortBufferException,
IllegalBlockSizeException, BadPaddingException {
if ((input == null) || (output == null)) {
throw new NullPointerException
("Input and output buffers must not be null");
}
int inPos = input.position();
int inLimit = input.limit();
int inLen = inLimit - inPos;
if (isUpdate && (inLen == 0)) {
return 0;
}
int outLenNeeded = engineGetOutputSize(inLen);
if (output.remaining() < outLenNeeded) {
throw new ShortBufferException("Need at least " + outLenNeeded
+ " bytes of space in output buffer");
}
boolean a1 = input.hasArray();
boolean a2 = output.hasArray();
int total = 0;
byte[] inArray, outArray;
if (a2) { // output has an accessible byte[]
outArray = output.array();
int outPos = output.position();
int outOfs = output.arrayOffset() + outPos;
if (a1) { // input also has an accessible byte[]
inArray = input.array();
int inOfs = input.arrayOffset() + inPos;
if (isUpdate) {
total = engineUpdate(inArray, inOfs, inLen, outArray, outOfs);
} else {
total = engineDoFinal(inArray, inOfs, inLen, outArray, outOfs);
}
input.position(inLimit);
} else { // input does not have accessible byte[]
inArray = new byte[getTempArraySize(inLen)];
do {
int chunk = Math.min(inLen, inArray.length);
if (chunk > 0) {
input.get(inArray, 0, chunk);
}
int n;
if (isUpdate || (inLen > chunk)) {
n = engineUpdate(inArray, 0, chunk, outArray, outOfs);
} else {
n = engineDoFinal(inArray, 0, chunk, outArray, outOfs);
}
total += n;
outOfs += n;
inLen -= chunk;
} while (inLen > 0);
}
output.position(outPos + total);
} else { // output does not have an accessible byte[]
if (a1) { // but input has an accessible byte[]
inArray = input.array();
int inOfs = input.arrayOffset() + inPos;
if (isUpdate) {
outArray = engineUpdate(inArray, inOfs, inLen);
} else {
outArray = engineDoFinal(inArray, inOfs, inLen);
}
input.position(inLimit);
if (outArray != null && outArray.length != 0) {
output.put(outArray);
total = outArray.length;
}
} else { // input also does not have an accessible byte[]
inArray = new byte[getTempArraySize(inLen)];
do {
int chunk = Math.min(inLen, inArray.length);
if (chunk > 0) {
input.get(inArray, 0, chunk);
}
int n;
if (isUpdate || (inLen > chunk)) {
outArray = engineUpdate(inArray, 0, chunk);
} else {
outArray = engineDoFinal(inArray, 0, chunk);
}
if (outArray != null && outArray.length != 0) {
output.put(outArray);
total += outArray.length;
}
inLen -= chunk;
} while (inLen > 0);
}
}
return total;
}It looks to me that the So I think my understanding is correct. And I have seen in debugging allocations happening when working with AES. But I can investigate more this week if you think there is something I am missing. |
|
Slightly related, but Conscrypt had a very similar issue a few years back, which never ended up getting resolved: google/conscrypt#317 |
Motivation:
Currently when the SslHandler coalesces outbound bytes it always
allocates a direct byte buffer. This does not make sense if the JDK
engine is being used as the bytes will have to be copied back to heap
bytes for the engine to operate on them.
Modifications:
Inspect engine type when coalescing outbound bytes and allocate heap
buffer if heap bytes are preferred by the engine.
Result:
Improved performance for JDK engine. Better performance in environments
without direct buffer pooling.