Many probability distributions require sampling from the Gamma distribution, including Gamma, Beta, and Dirichlet.
Since Gamma samplers have complex control flow (for rejection sampling) and are seldom a bottleneck in probabilistic algorithms, a CPU-only implementation should suffice at first. More important than a CUDA implementation is a reparameterized sampler, so that stochastic gradients can be propagated through the sampler (see paper and reference implementation by @naesseth).
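To make the "complex control flow" concrete, here is a minimal pure-Python sketch of one common rejection sampler for Gamma(alpha, 1), the Marsaglia-Tsang method. This issue does not say which algorithm should be used, so the choice is an assumption, and the names `sample_gamma`, `sample_beta`, and `sample_dirichlet` are hypothetical helpers rather than proposed API. The last two show the standard constructions of Beta and Dirichlet draws from Gamma draws.

```python
import math
import random


def sample_gamma(alpha, rng=random):
    """Draw one Gamma(alpha, 1) sample (Marsaglia-Tsang rejection sampling)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha < 1:
        # Boosting step: Gamma(alpha) = Gamma(alpha + 1) * U ** (1 / alpha).
        u = 1.0 - rng.random()  # uniform on (0, 1]
        return sample_gamma(alpha + 1.0, rng) * u ** (1.0 / alpha)
    d = alpha - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:  # data-dependent loop: the reason this is awkward on a GPU
        x = rng.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue  # proposal outside the support, draw again
        u = 1.0 - rng.random()  # uniform on (0, 1], safe for log
        if math.log(u) < 0.5 * x * x + d - d * v + d * math.log(v):
            return d * v  # accepted


def sample_beta(a, b, rng=random):
    """Beta(a, b) as X / (X + Y) with X ~ Gamma(a, 1), Y ~ Gamma(b, 1)."""
    x = sample_gamma(a, rng)
    y = sample_gamma(b, rng)
    return x / (x + y)


def sample_dirichlet(alphas, rng=random):
    """Dirichlet(alphas) as independent Gamma draws normalized to sum to 1."""
    draws = [sample_gamma(a, rng) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible. A C version for `THRandom.c` would presumably follow the same loop structure, but that is not specified here.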
Tasks
- random_gamma(requires_grad=False)
- random_gamma(requires_grad=True)

Map of modifications
- aten/src/TH/THRandom.c/h - random single numbers
- aten/src/TH/generic/THTensorRandom.c/h - random tensors
- aten/src/ATen/Declarations.cwrap - bindings for ATen
- torch/csrc/generic/methods/TensorRandom.cwrap - bindings for torch.Tensor
- torch/autograd/variable.py - Variable
- torch/distributions.py - Distributions
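For the `requires_grad=True` task, a rough illustration of what "propagating a gradient through the sampler" could involve: in a Marsaglia-Tsang style sampler (an assumption, not something this issue specifies), an accepted sample is a smooth function of the shape `alpha` and a normal noise draw `eps`, so it can be differentiated with respect to `alpha` for fixed `eps`. The helper names below are hypothetical. This is only the derivative of the transformation; it is not by itself the estimator from the referenced paper, which also has to account for the accept/reject step.

```python
import math


def gamma_transform(eps, alpha):
    """h(eps, alpha) = d * (1 + c * eps) ** 3, defined for alpha >= 1."""
    d = alpha - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    return d * (1.0 + c * eps) ** 3


def gamma_transform_grad(eps, alpha):
    """Derivative of h with respect to alpha, holding eps fixed."""
    d = alpha - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    w = 1.0 + c * eps
    # dd/dalpha = 1 and dc/dalpha = -c / (2 * d), so by the product rule:
    return w ** 3 - 1.5 * c * eps * w ** 2


# Sanity check against a central finite difference.
eps, alpha, step = 0.3, 2.5, 1e-6
numeric = (gamma_transform(eps, alpha + step)
           - gamma_transform(eps, alpha - step)) / (2.0 * step)
assert abs(gamma_transform_grad(eps, alpha) - numeric) < 1e-6
```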