Conversation
Thanks for this!
```rust
#[bench]
fn flatbuffers_populate_with_args(b: &mut Bencher) {
    let mut msg = flatbuffers::FlatBufferBuilder::new();
```
I think this needs to use black_box so that it is not optimized away by the compiler. Otherwise, the results might be unfair.
@pickfire I have tried adding black_box and it didn't change anything, and since the other benches don't use black_box, I figured I don't need it here. You have a better understanding of the topic, so it would be great if you could help and PR the suggested changes.
After looking into the API, I think the benchmarks are just not fair, because flatbuffers only returns a slice there. It should be something like:
```rust
#[bench]
fn flatbuffers_serialize(b: &mut Bencher) {
    let mut msg = flatbuffers::FlatBufferBuilder::new();
    let log = protocol_flatbuffers::populate_log_with_builder(&mut msg);
    msg.finish(log, None);
    let mut buf = Vec::new();
    buf.copy_from_slice(msg.finished_data());
    b.bytes = buf.len() as u64;
    b.iter(|| {
        buf.clear();
        msg.reset();
        msg.finish_minimal(log);
        buf.copy_from_slice(msg.finished_data());
    });
}
```

The b.iter() closure should contain the serialization work itself, not just fetch the already-serialized data; that is, it should write into a Vec rather than merely return the &[u8] that has already been produced. Any idea how the serialization part could be separated?
I tested it like above but got...

```
---- flatbuffers_serialize stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `472`: destination and source slices have different lengths', src/libcore/slice/mod.rs:1965:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
```
I feel like capnp_serialize is already incorrect, as it shouldn't be faster than clone if I understand correctly. I hope @dtolnay can help here.
As far as I understand, Cap'n Proto and FlatBuffers explicitly target zero-copy serialization and deserialization, and thus those steps should take almost no time at all, as is the case with FlatBuffers here. Basically, the data in memory is ready to be sent or read as-is at any given point in time, so it is "faster" than memcpy (clone), and it is unfair to include memcpy in the benchmark.
@frol But the thing is that currently the flatbuffers benchmark is not fair: it does not do anything that it should do (serialization, in this case).
```rust
b.iter(|| {
    msg.finished_data()
});
```

If that's the case, why not rename flatbuffers_serialize to flatbuffers_get_finished_data?
```rust
#[bench]
fn flatbuffers_populate_with_builder(b: &mut Bencher) {
    let mut msg = flatbuffers::FlatBufferBuilder::new();
```
I think this needs to use black_box so that it is not optimized away by the compiler. Otherwise, the results might be unfair.
```rust
#[bench]
fn flatbuffers_serialize(b: &mut Bencher) {
    let mut msg = flatbuffers::FlatBufferBuilder::new();
```
I think this needs to use black_box so that it is not optimized away by the compiler. Otherwise, the results might be unfair.
```rust
#[bench]
fn flatbuffers_deserialize(b: &mut Bencher) {
    let mut msg = flatbuffers::FlatBufferBuilder::new();
```
I think this needs to use black_box so that it is not optimized away by the compiler. Otherwise, the results might be unfair.
This PR is based on PR #6.