Micro-optimize vec.with_c_str for short vectors#9352
Conversation
src/libstd/c_str.rs
Outdated
There was a problem hiding this comment.
You write small but I think you mean large, but does the doc have to spill the guts here? Either the small or the large string case, the CString or buffer is just a temporary and the pointer will be invalid after the closure exits, that's all the user should need to know
|
This is ready for review again. It also adds a couple other things. First, it bumps the size of the buffer to 128 bytes. Next, it adds the micro-optimization to |
src/libstd/c_str.rs
Outdated
There was a problem hiding this comment.
This still mentions Unsafe. Also, could you expand the documentation to explain what happens when there is actually an interior null?
|
@cmr: I'm not yet ready to agree, but I'll factor out the |
src/libstd/c_str.rs
Outdated
There was a problem hiding this comment.
Can this avoid unsafe code and use:
let mut buf = ...;
vec::bytes::copy_memory(buf, self.as_bytes());
buf[self.len()] = 0;
do buf.as_imm_buf |buf, _| {
// etc
}?
This now makes it unsafe to save the pointer returned by .with_c_str as that pointer now may be pointing at a stack allocated array. I arbitrarily chose 32 bytes as the length of the stack vector, and so it might not be the most optimal size. before: test c_str::bench::bench_with_c_str_long ... bench: 539 ns/iter (+/- 91) test c_str::bench::bench_with_c_str_medium ... bench: 97 ns/iter (+/- 2) test c_str::bench::bench_with_c_str_short ... bench: 70 ns/iter (+/- 5) after: test c_str::bench::bench_with_c_str_long ... bench: 542 ns/iter (+/- 13) test c_str::bench::bench_with_c_str_medium ... bench: 53 ns/iter (+/- 6) test c_str::bench::bench_with_c_str_short ... bench: 19 ns/iter (+/- 0)
The documentation for `.with_c_str()` already says that the pointer will be deallocated before returning from the function.
This matches @graydon's recommendation.
before: test c_str::bench::bench_with_c_str_unchecked_long ... bench: 361 ns/iter (+/- 9) test c_str::bench::bench_with_c_str_unchecked_medium ... bench: 75 ns/iter (+/- 2) test c_str::bench::bench_with_c_str_unchecked_short ... bench: 60 ns/iter (+/- 9) after: test c_str::bench::bench_with_c_str_unchecked_long ... bench: 362 ns/iter (+/- test c_str::bench::bench_with_c_str_unchecked_medium ... bench: 30 ns/iter (+/- 7) test c_str::bench::bench_with_c_str_unchecked_short ... bench: 12 ns/iter (+/- 4)
This also fixes a bug in `vec.with_c_str_unchecked` where we were not calling `.to_c_str_unchecked()` for long strings.
One of the downsides with `c_str` is that it always forces an allocation, and so this could add unnecessary overhead to various calls. This PR implements one of the suggestions @graydon made in #8296 for `vec.with_c_str` in that for a short string can use a small stack array instead of a malloced array for our temporary c string. This ends up being twice as fast for small strings. There are two things to consider before landing this commit though. First off, I arbitrarily picked the stack array as 32 bytes, and I'm not sure if this a reasonable amount or not. Second, there is a risk that someone can keep a reference to the interior stack pointer, which could cause mayhem if someone were to dereference the pointer. Since we also can easily grab and return interior pointers to vecs though, I don't think this is that much of an issue.
One of the downsides with
c_stris that it always forces an allocation, and so this could add unnecessary overhead to various calls. This PR implements one of the suggestions @graydon made in #8296 forvec.with_c_strin that for a short string can use a small stack array instead of a malloced array for our temporary c string. This ends up being twice as fast for small strings.There are two things to consider before landing this commit though. First off, I arbitrarily picked the stack array as 32 bytes, and I'm not sure if this a reasonable amount or not. Second, there is a risk that someone can keep a reference to the interior stack pointer, which could cause mayhem if someone were to dereference the pointer. Since we also can easily grab and return interior pointers to vecs though, I don't think this is that much of an issue.