MessageFramer uses
bufferAllocator.allocate(maxFrameSize)
to produce chunks that the transport can write. Currently these chunks are always 4096 bytes, regardless of transport or payload size.
Transports cannot coalesce these chunks and typically end up writing each one in a separate syscall. For large payloads, the resulting number of small writes hurts performance. Instead, MessageFramer should delegate the sizing decision to the buffer allocator and ask it for a chunk of up to the payload size, e.g.
bufferAllocator.allocate(len)
Experiments with HBase showed that, for a 51k payload, this reduces the number of write syscalls by 3x.
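
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It is not the actual MessageFramer code: the BufferAllocator interface, the capped() helper, and the frame() method are hypothetical stand-ins written for illustration. The framer passes the remaining payload length as a hint, and the allocator stays free to return something smaller, so each transport can pick the chunk size that suits it.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class FramingSketch {
    /** Hypothetical stand-in for the allocator; the argument is only a capacity hint. */
    interface BufferAllocator {
        ByteBuffer allocate(int capacityHint);
    }

    /** Hypothetical allocator that honors the hint up to a transport-specific maximum. */
    static BufferAllocator capped(int maxSize) {
        return hint -> ByteBuffer.allocate(Math.max(1, Math.min(hint, maxSize)));
    }

    /** Splits the payload into chunks, asking the allocator for the remaining length each time. */
    static List<ByteBuffer> frame(byte[] payload, BufferAllocator allocator) {
        List<ByteBuffer> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < payload.length) {
            int remaining = payload.length - offset;
            ByteBuffer chunk = allocator.allocate(remaining);
            // The allocator may return less than requested, so copy only what fits.
            int n = Math.min(remaining, chunk.remaining());
            chunk.put(payload, offset, n);
            chunk.flip(); // make the chunk readable for the transport
            chunks.add(chunk);
            offset += n;
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[51 * 1024];
        // An allocator capped at 4096 bytes behaves like the current fixed-size chunks.
        System.out.println(frame(payload, capped(4096)).size());      // prints 13
        // An allocator with a larger cap can hand back the whole payload in one chunk.
        System.out.println(frame(payload, capped(64 * 1024)).size()); // prints 1
    }
}
```

The chunk counts in this toy example only illustrate the mechanism. They are not the HBase measurement quoted above, since the number of syscalls also depends on how the transport writes each chunk.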