Have RandomState instances unique to each batch index, can be used by e.g. random augmentation #4230
grafi-tt wants to merge 4 commits into chainer:master
|
Thank you for sending the PR! I think this PR consists of several parts.
Could you consider splitting the PR into these parts and sending each as a separate PR? |
|
@grafi-tt Is it possible to split this PR as mentioned by @delta2323? Thanks! |
|
@Crissman @delta2323 Sorry for the late reply! To be precise, this PR consists of 5 parts:
Although I can reorganize the changes into those 5 commits soon, I'm wondering how to send those PRs, because changes 2, 3, 4 and 5 depend on change 1. Should I send a PR for 1 first, and send the other PRs after it is merged? |
Looking at this sentence solely, we might be able to work |
|
One idea would be to make a branch that implements 1 and 5 and send a PR from that branch first. Then, create branches for 2, 3, and 4 respectively from that branch and send PRs (one for each branch). We'll review the PR for 1 and 5 first. It is also OK to separate 1 and 5. |
|
@delta2323 Thank you! I created PR #4448 which contains 1 and 5. |
|
Thank you! I'll take a look. |
I've made iterators store `numpy.random.RandomState` instances that are bound to batch indices. They can be retrieved with `chainer.iterators.get_random_state()`.

The implementation is trivial for `SerialIterator`. For `MultithreadIterator`, a set of backup states (implemented as tuples) is required to support `reset` and `serialize` correctly. For `MultiprocessIterator`, shared memory is necessary.

This feature is very useful when you use `MultiprocessIterator`. To perform random augmentation correctly with this iterator class, you need to set the random seed when the first batch is fetched, because the global random state is identical in every forked process.

Making the random seed deterministic is rather hard, because for each part of a batch to be fetched, the assignment of the process that fetches it is completely non-deterministic. You may mitigate the situation by reseeding the random state at every fetch, but that sacrifices the long period of the Mersenne Twister.
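To illustrate the idea of binding a random state to a position in the batch, here is a minimal NumPy-only sketch. It is not the implementation in this PR: the helper names (`make_slot_states`, `random_flip`, `fetch_batch`) and the `base_seed + i` seeding scheme are assumptions made for the example, and the "workers" are simulated by simply visiting the batch positions in different orders.

```python
import numpy as np

BATCH_SIZE = 4
BASE_SEED = 0  # hypothetical value; any fixed integer works


def make_slot_states(batch_size, base_seed):
    # One generator per position ("slot") in the batch, seeded once up front
    # instead of at every fetch. Seeding with base_seed + i is an assumption
    # of this sketch, not the scheme used by the PR.
    return [np.random.RandomState(base_seed + i) for i in range(batch_size)]


def random_flip(image, rng):
    # Flip horizontally with probability 0.5, drawing from the given generator
    # rather than from the global numpy.random state.
    return image[:, ::-1] if rng.rand() < 0.5 else image


def fetch_batch(images, slot_states, fetch_order):
    # fetch_order mimics the non-deterministic order in which workers
    # happen to pick up the parts of a batch.
    batch = [None] * len(images)
    for slot in fetch_order:
        batch[slot] = random_flip(images[slot], slot_states[slot])
    return batch


images = [np.arange(6).reshape(2, 3) + 10 * i for i in range(BATCH_SIZE)]


def run(orders):
    # Fresh generators per run, then one batch per entry in `orders`.
    states = make_slot_states(BATCH_SIZE, BASE_SEED)
    return [fetch_batch(images, states, order) for order in orders]


epochs_a = run([[0, 1, 2, 3], [0, 1, 2, 3]])
epochs_b = run([[3, 1, 0, 2], [2, 0, 3, 1]])

# The augmented batches match even though the fetch order differs.
same = all(
    np.array_equal(x, y)
    for batch_a, batch_b in zip(epochs_a, epochs_b)
    for x, y in zip(batch_a, batch_b)
)
print(same)
```

Because each slot draws only from its own generator, the result does not depend on which worker handles which part of the batch, and each generator keeps advancing along a single stream instead of being reseeded at every fetch.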
overhead
If `chainer.iterators.get_random_state()` is not called, there is almost no overhead. If it is called, the overhead of `RandomState.get_state()` and/or `RandomState.set_state()` is incurred, but a benchmark suggests it is still negligible.

I performed the benchmark below on my Linux desktop machine with an i5-3550 (4 cores) processor. It just iterates over the MNIST dataset with random flip augmentation. The batch size is very small compared to most real workloads, so the overhead of iteration is rate-limiting. Even in such an extreme setting, the measured performance impact is less than 10%.
The result is:
Since I think this issue is more important, I prioritized it over #3754.
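For readers who want a rough feel for the per-call cost of `RandomState.get_state()` and `RandomState.set_state()` mentioned in the overhead section, the following is a small stand-alone sketch. It is not the MNIST benchmark described above, it does not involve any Chainer iterator, and the numbers it prints will vary by machine.

```python
import timeit

import numpy as np

rng = np.random.RandomState(0)

# get_state() returns a tuple snapshot; set_state() restores it, so the
# same numbers are drawn again afterwards.
state = rng.get_state()
first = rng.rand(3)
rng.set_state(state)
replayed = rng.rand(3)
round_trip_ok = np.array_equal(first, replayed)

# Average cost of a single call, measured over many repetitions.
n = 10000
get_cost = timeit.timeit(rng.get_state, number=n) / n
set_cost = timeit.timeit(lambda: rng.set_state(state), number=n) / n

print("round trip reproduces the stream:", round_trip_ok)
print("get_state: %.2e s per call" % get_cost)
print("set_state: %.2e s per call" % set_cost)
```

Comparing these per-call figures with the time needed to load and augment one example gives an idea of how much saving and restoring state could add per fetch.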