Description
MD simulations are often not reproducible, for two reasons:
- Randomness in integrators: Langevin adds random contributions to the momenta.
- Non-deterministic numerical operations: operations on GPU are often non-deterministic, see https://docs.pytorch.org/docs/stable/notes/randomness.html
With finite-precision floating-point numbers, operations such as addition are not associative, so the order in which they are carried out matters. GPU kernels are optimized for speed and do not fix that order, so by default their results are non-deterministic.
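Both sources can be illustrated in plain Python, without a GPU. This is a minimal sketch: `langevin_momenta` is a hypothetical toy update (friction plus a Gaussian kick, with arbitrary constants), not the integrator of any particular MD package, and the floating-point part only shows that regrouping a sum changes its last bits.

```python
import random


def langevin_momenta(seed, n_steps=5, gamma=0.1, kick=0.5):
    """Toy 1D momentum update: damping plus a random Gaussian kick.

    Hypothetical illustration only; gamma and kick are arbitrary values.
    """
    rng = random.Random(seed)  # private generator, fully determined by the seed
    p = 1.0
    history = []
    for _ in range(n_steps):
        p = (1.0 - gamma) * p + kick * rng.gauss(0.0, 1.0)
        history.append(p)
    return history


# Source 1: randomness in the integrator.
# Same seed gives the same momenta, a different seed does not.
same_seed = langevin_momenta(42) == langevin_momenta(42)
different_seed = langevin_momenta(42) == langevin_momenta(43)

# Source 2: floating-point addition is not associative,
# so the grouping of a sum changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
order_matters = left != right

print(same_seed, different_seed, order_matters)
print(left, right)
```

In a chaotic system such as an MD simulation, a difference in the last bit of a force or energy sum grows over time, which is why two runs that start from identical inputs can end up on different trajectories.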
Is it a problem? Usually not: MD is mostly about ergodicity and sampling the correct distribution, so the exact trajectory is not important.
If you do need reproducibility, call the following at the beginning of the script:
torch.use_deterministic_algorithms(True)
On GPU, you will probably then get an error from cuBLAS:
RuntimeError: Deterministic behavior was enabled with either torch.use_deterministic_algorithms(True) or at::Context::setDeterministicAlgorithms(true), but this operation is not deterministic because it uses CuBLAS and you have CUDA >= 10.2. To enable deterministic behavior in this case, you must set an environment variable before running your PyTorch application: CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8. For more information, go to https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility
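With that, a deterministic script is obtained by setting the environment variable and enabling deterministic algorithms before any GPU work is done:
import os
import torch
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
torch.use_deterministic_algorithms(True)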
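A quick way to check that the settings take effect is to run the same computation twice and compare the results bit for bit. The sketch below is only an illustration: `run_once` is a hypothetical stand-in for a simulation (a few matrix products on random data), and setting the variable before importing `torch` is a conservative assumption that ensures it is in place before any cuBLAS handle is created.

```python
import os

# Set before importing torch so it is visible when cuBLAS is initialized.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

torch.use_deterministic_algorithms(True)


def run_once(seed: int = 0) -> torch.Tensor:
    """Hypothetical stand-in for a simulation: repeated matrix products."""
    torch.manual_seed(seed)  # fixes the random initial state (CPU and CUDA)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(256, 256, device=device)
    for _ in range(10):
        x = torch.tanh(x @ x / 256.0)
    return x.cpu()


first = run_once()
second = run_once()

# torch.equal is True only if shapes and all elements match exactly.
bitwise_equal = torch.equal(first, second)
print("bitwise identical:", bitwise_equal)
```

Note that this only checks repeatability on one machine with one software stack; results may still differ across GPU models, drivers, or PyTorch versions.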
An example would be great to show how runs differ when these settings are not used, with Nosé-Hoover (a deterministic integrator), so that any difference comes from the numerical operations alone.