Performance of streaming API #746
Description
I have an application which receives msgpacked data in chunks of approx. 20 bytes. I cannot guarantee that message boundaries are preserved, meaning that, for example, a msgpack array 35 bytes long may be split across two messages. To handle this, I use code like this:
static size_t refill(msgpack_unpacker *upk) {
    msgpack_unpacker_reserve_buffer(upk, BUFFERSIZE);
    // receive data into received_data / received_length ...
    uint8_t *buf = (uint8_t *) msgpack_unpacker_buffer(upk);
    memcpy(buf, received_data, received_length);
    msgpack_unpacker_buffer_consumed(upk, received_length);
    return received_length;
}
do {
    ret = msgpack_unpacker_next(&upk, &result);
    if (MSGPACK_UNPACK_SUCCESS != ret && MSGPACK_UNPACK_EXTRA_BYTES != ret) {
        refill(&upk); // need more data before trying again
    }
} while (ret != MSGPACK_UNPACK_SUCCESS && ret != MSGPACK_UNPACK_EXTRA_BYTES);
I have attached a reproducible example, which simulates receiving the data by copying from a buffer.
msgpack-perf.tar.gz
Applying google-pprof to the attached example, I can see that 90% of the runtime is spent in msgpack_unpacker_next, with 42% in msgpack_unpacker_release_zone. Apparently, msgpack_unpacker_next frees and re-allocates the msgpack_zone on every invocation.
Is there any possibility to speed this up? It seems unnecessarily slow.
The attached tar archive contains my example program, a script to compile and run it, and the profiling data I collected, including a PDF rendering of the results.