Skip to content

Conversation

@jeffdonahue
Copy link
Contributor

This PR allows one to add "random_seed: 4" (or any other uint) to the bottom of a solver prototxt to ensure results are exactly consistent across runs. Enabling this involved removing every use of rand in the codebase* and replacing with a use of the internal Caffe RNG. For the prefetch threads in the data layers, I've given them their own private RNG engine which is seeded by the current main thread's RNG when the thread is created, which I think should ensure robustness to differences in the interleaving of the execution of the main thread and prefetch thread.

After this PR no future Caffe code should call rand (or any other function from the stdlib relying on it).

*might have missed something as some can't be found just by grepping for rand(, e.g. the use of std::shuffle which I had to replace.

@shelhamer
Copy link
Member

Nice!

Can you add a lint check that greps for calls to rand() and company to keep
new code in line?

Le mardi 22 avril 2014, Jeff Donahue notifications@github.com a écrit :

This PR allows one to add "random_seed: 4" (or any other uint) to the
bottom of a solver prototxt to ensure results are exactly consistent across
runs. Enabling this involved removing every use of rand in the codebase*
and replacing with a use of the internal Caffe RNG. For the prefetch
threads in the data layers, I've given them their own private RNG engine
which is seeded by the current main thread's RNG when the thread is
created, which I think should ensure robustness to differences in the
interleaving of the execution of the main thread and prefetch thread.

After this PR no future Caffe code should call rand (or any other
function from the stdlib relying on it).

*might have missed something as some can't be find just by grepping for

rand(, e.g. the use of std::shuffle which I had to replace.

You can merge this Pull Request by running

git pull https://github.com/jeffdonahue/caffe solver-seed

Or view, comment on, or merge it at:

#352
Commit Summary

  • add random_seed field to SolverParameter and have solver use it --
  • add data layer crop tests
  • add tests for random crop sequence -- currently both fail
  • cleanup data_layer, add prefetch_rng_ field to it and use instead of
  • make seed test pass by setting up new layer to generate the 2nd
    sequence
  • test scale param
  • prefetch_rng in ImageDataLayer
  • prefetch_rng in window_data_layer
  • replace remaining uses of rand() with caffe_rng_rand()
  • replace std::shuffle with version using prefetch rng; improve unit
    test

File Changes

Patch Links:


Reply to this email directly or view it on GitHubhttps://github.com//pull/352
.

Evan Shelhamer

@rbgirshick
Copy link
Contributor

There are also random(), *rand48() and the *_r() variants to grep for (not
sure if any of those have been used in the course of Caffe development).

On Tue, Apr 22, 2014 at 5:02 PM, Evan Shelhamer notifications@github.comwrote:

Nice!

Can you add a lint check that greps for calls to rand() and company to
keep
new code in line?

Le mardi 22 avril 2014, Jeff Donahue notifications@github.com a écrit :

This PR allows one to add "random_seed: 4" (or any other uint) to the
bottom of a solver prototxt to ensure results are exactly consistent
across
runs. Enabling this involved removing every use of rand in the codebase*
and replacing with a use of the internal Caffe RNG. For the prefetch
threads in the data layers, I've given them their own private RNG engine
which is seeded by the current main thread's RNG when the thread is
created, which I think should ensure robustness to differences in the
interleaving of the execution of the main thread and prefetch thread.

After this PR no future Caffe code should call rand (or any other
function from the stdlib relying on it).

*might have missed something as some can't be find just by grepping for

rand(, e.g. the use of std::shuffle which I had to replace.

You can merge this Pull Request by running

git pull https://github.com/jeffdonahue/caffe solver-seed

Or view, comment on, or merge it at:

#352
Commit Summary

  • add random_seed field to SolverParameter and have solver use it --
  • add data layer crop tests
  • add tests for random crop sequence -- currently both fail
  • cleanup data_layer, add prefetch_rng_ field to it and use instead of
  • make seed test pass by setting up new layer to generate the 2nd
    sequence
  • test scale param
  • prefetch_rng in ImageDataLayer
  • prefetch_rng in window_data_layer
  • replace remaining uses of rand() with caffe_rng_rand()
  • replace std::shuffle with version using prefetch rng; improve unit
    test

File Changes

Patch Links:


Reply to this email directly or view it on GitHub<
https://github.com/BVLC/caffe/pull/352>
.

Evan Shelhamer


Reply to this email directly or view it on GitHubhttps://github.com//pull/352#issuecomment-41110876
.

http://www.cs.berkeley.edu/~rbg/

@jeffdonahue
Copy link
Contributor Author

@shelhamer lint already checks for uses of rand (note all the removals of NOLINT(runtime/threadsafe_fn) in the diff)

@rbgirshick thanks, I just grepped for these -- the only one I found was a call to random() which creates the init_key in the matcaffe wrapper, but I think this is probably fine since it doesn't affect any other behavior as far as I can tell (famous last words).

@jeffdonahue
Copy link
Contributor Author

@shelhamer decided not to be lazy and added a special caffe/random_fn rule to check for rand, rand_r, and random.

@shelhamer
Copy link
Member

Nice side-issue progress on pulling prefetch out of the data layers and reduce code duplication. Thanks for adding the caffe/random_fn rule–I like having lint check our style. Nearly done reviewing, so I'll merge soon.

@shelhamer
Copy link
Member

This all looks right and orderly and it tests fine. Note that I did not look for further, less obvious examples of rand()-like calls. If some slipped through, they can be fixed as they're found.

shelhamer added a commit that referenced this pull request Apr 24, 2014
Eliminate use of "rand" in code; enable deterministic results via solver seed
@shelhamer shelhamer merged commit c55ebd0 into BVLC:dev Apr 24, 2014
@jeffdonahue jeffdonahue deleted the solver-seed branch April 24, 2014 20:00
@shelhamer shelhamer mentioned this pull request May 20, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
Eliminate use of "rand" in code; enable deterministic results via solver seed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants