Optimize common path of Once::doit #14174
Merged
bors merged 1 commit into rust-lang:master on May 15, 2014
Contributor
|
The message on this PR doesn't describe what's actually being landed (both the PR description and the commit message). Can you please rewrite it to actually describe what's going on? You can put the extra information back in a comment for posterity's sake, but I'd rather not have the message suggest that it's being marked as |
Optimize `Once::doit`: perform an optimistic check that initialization is
already completed. `load` is much cheaper than `fetch_add`, at least
on x86_64.
Verified with this test:
```
static mut o: one::Once = one::ONCE_INIT;
unsafe {
    loop {
        let start = time::precise_time_ns();
        let iters = 50000000u64;
        for _ in range(0, iters) {
            o.doit(|| { println!("once!"); });
        }
        let end = time::precise_time_ns();
        let ps_per_iter = 1000 * (end - start) / iters;
        println!("{} ps per iter", ps_per_iter);
        // confuse the optimizer
        o.doit(|| { println!("once!"); });
    }
}
```
Test executed on Mac, Intel Core i7 2GHz. Result is:
* 20ns per iteration without patch
* 4ns per iteration with this patch applied
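For readers on a current toolchain, the same measurement can be approximated as below. This is only a sketch: it assumes today's `std::sync::Once` and `std::time::Instant` as stand-ins for the 2014 `sync::one::Once` and `time::precise_time_ns`, and the names `INIT` and `ps_per_call` are made up for illustration. The numbers it prints will differ from the ones reported above.

```rust
use std::hint::black_box;
use std::sync::Once;
use std::time::Instant;

// Stand-in for the `static mut o` in the PR's test (hypothetical name).
static INIT: Once = Once::new();

/// Average cost of one `call_once` in picoseconds over `iters` calls.
/// `iters` must be non-zero.
pub fn ps_per_call(iters: u64) -> u64 {
    let start = Instant::now();
    for _ in 0..iters {
        // Only the very first call runs the closure; every later call
        // exercises the already-initialized path being measured.
        INIT.call_once(|| println!("once!"));
        // Keep the optimizer from hoisting the check out of the loop.
        black_box(&INIT);
    }
    let elapsed_ns = start.elapsed().as_nanos();
    (1000 * elapsed_ns / u128::from(iters)) as u64
}
```

Calling `ps_per_call(50_000_000)` from a release build and printing the result mirrors the loop body of the original test.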
`Once::doit` could be even faster (800ps per iteration) if the `doit` function
were split into a `doit`/`doit_slow` pair, with `doit` marked as
`#[inline]`, like this:
```
#[inline(always)]
pub fn doit(&self, f: ||) {
    if self.cnt.load(atomics::SeqCst) < 0 {
        return
    }
    self.doit_slow(f);
}
fn doit_slow(&self, f: ||) { ... }
```
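The fast-path/slow-path split can be sketched in current Rust as follows. This is not the implementation from the patch: `SketchOnce` is a hypothetical type, and it swaps the PR's `cnt` counter for an `AtomicBool` plus a `Mutex` to keep the example short. What it illustrates is the shape of the idea: the inlined entry point does a single atomic load, and everything else lives in a separate, non-inlined function.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for `Once`, for illustration only.
pub struct SketchOnce {
    done: AtomicBool,
    lock: Mutex<()>,
}

impl SketchOnce {
    pub const fn new() -> Self {
        SketchOnce {
            done: AtomicBool::new(false),
            lock: Mutex::new(()),
        }
    }

    /// Fast path: one atomic load, no read-modify-write operation.
    #[inline]
    pub fn call_once<F: FnOnce()>(&self, f: F) {
        if self.done.load(Ordering::Acquire) {
            return;
        }
        self.call_once_slow(f);
    }

    /// Slow path: kept out of line so the fast path stays small.
    #[cold]
    #[inline(never)]
    fn call_once_slow<F: FnOnce()>(&self, f: F) {
        // If `f` panics, the mutex is poisoned and later slow-path calls
        // panic too; a real implementation would handle this more carefully.
        let _guard = self.lock.lock().unwrap();
        // Re-check under the lock: another thread may have finished first.
        if !self.done.load(Ordering::Relaxed) {
            f();
            // Publish the initializer's writes to later fast-path loads.
            self.done.store(true, Ordering::Release);
        }
    }
}
```

After the first successful call, every later `call_once` returns from the load-and-branch in the inlined function without touching the lock.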
Contributor
Author
|
Updated the patch and PR description to make it clear, that |
alexcrichton
added a commit
to alexcrichton/rust
that referenced
this pull request
May 15, 2014
Closes rust-lang#14210 (Make Vec.truncate() resilient against failure in Drop)
Closes rust-lang#14206 (Register new snapshots)
Closes rust-lang#14205 (use sched_yield on linux and freebsd)
Closes rust-lang#14204 (Add a crate for missing stubs from libcore)
Closes rust-lang#14201 (Render not_found with an absolute path to the rust stylesheet)
Closes rust-lang#14198 (update valgrind headers)
Closes rust-lang#14174 (Optimize common path of Once::doit)
Closes rust-lang#14162 (Print 'rustc' and 'rustdoc' as the command name for --version)
Closes rust-lang#14145 (Better strict version hash (SVH) computation)
bors
added a commit
that referenced
this pull request
May 15, 2014
Submitting PR again, because I cannot reopen #13349, and github does not attach new patch to that PR.
lilyball
added a commit
to lilyball/rust
that referenced
this pull request
May 16, 2014
Use sync::one::Once to fetch the mach_timebase_info only once when running precise_time_ns(). This helps because mach_timebase_info() is surprisingly inefficient. Also fix the order of operations when applying the timebase to the mach absolute time value.

This improves the time on my machine from

```
test tests::bench_precise_time_ns ... bench: 157 ns/iter (+/- 4)
```

to

```
test tests::bench_precise_time_ns ... bench: 38 ns/iter (+/- 3)
```

and it will get even faster once rust-lang#14174 lands.
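The caching pattern described in that commit message might look roughly like this in current Rust. Everything here is an assumption for illustration: `query_timebase` is a hypothetical stand-in for the real `mach_timebase_info()` call and returns made-up values, `std::sync::OnceLock` replaces the 2014 `sync::one::Once`, and the multiply-before-divide order is a guess at what the "order of operations" fix refers to.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Timebase {
    pub numer: u64,
    pub denom: u64,
}

/// Counts how often the (pretend) expensive query actually runs.
pub static QUERIES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for `mach_timebase_info()`; values are made up.
fn query_timebase() -> Timebase {
    QUERIES.fetch_add(1, Ordering::Relaxed);
    Timebase { numer: 125, denom: 3 }
}

/// Fetch the timebase once and serve the cached copy afterwards.
pub fn timebase() -> Timebase {
    static CACHE: OnceLock<Timebase> = OnceLock::new();
    *CACHE.get_or_init(query_timebase)
}

/// Convert raw ticks to nanoseconds using the cached timebase.
pub fn ticks_to_ns(ticks: u64) -> u64 {
    let tb = timebase();
    // Assumed order: multiply first, then divide, so the integer division
    // does not throw away precision early. u128 avoids overflow.
    (u128::from(ticks) * u128::from(tb.numer) / u128::from(tb.denom)) as u64
}
```

However many times `ticks_to_ns` is called, `query_timebase` runs only once, which is where the reported speedup would come from.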
bors
added a commit
that referenced
this pull request
May 16, 2014
flip1995
pushed a commit
to flip1995/rust
that referenced
this pull request
Feb 27, 2025
… `.clone()` (rust-lang#14174) fixes rust-lang#12357 changelog: [`useless_asref`]: don't suggest to use `.clone()` if the target type doesn't implement the `Clone` trait