"Randomly zeroes some of the elements of the input tensor. The elements to zero are randomized on every forward call." This is incorrect; the function also scales up by 1/(1-p), which the implementation correctly does.
"Randomly zeroes some of the elements of the input tensor. The elements to zero are randomized on every forward call."
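This is incorrect: besides zeroing elements, the function also scales the surviving elements up by 1/(1-p), where p is the probability of zeroing an element. The implementation does this correctly; only the description omits it.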
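
To make the scaling concrete, here is a minimal pure-Python sketch of the behavior described above. It is not the implementation under review: the name `dropout_sketch`, its signature, and the restriction to `0 <= p < 1` are assumptions made for illustration only.

```python
import random


def dropout_sketch(values, p=0.5, rng=None):
    """Hypothetical illustration: zero each element with probability p
    and scale the survivors by 1 / (1 - p)."""
    # Assumption for this sketch: p = 1 is rejected to avoid dividing by zero.
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1 in this sketch")
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - p)
    # A fresh random mask is drawn on every call.
    return [0.0 if rng.random() < p else v * scale for v in values]


# Apply the sketch to a vector of ones with a fixed seed for repeatability.
p = 0.25
out = dropout_sketch([1.0] * 10_000, p=p, rng=random.Random(0))

survivors = [v for v in out if v != 0.0]
zero_fraction = 1.0 - len(survivors) / len(out)
mean_out = sum(out) / len(out)

print("zeroed fraction:", zero_fraction)
print("value of each survivor:", survivors[0])
print("mean of output:", mean_out)
```

Every surviving element equals 1/(1-p) rather than 1, and the mean of the output stays close to the mean of the input. That second property is the reason for the scaling, and it is what the quoted description leaves out.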