The distributions should be tested more extensively and generically. At a minimum, there should be tests for sampling, scoring, CUDA compatibility, and support where appropriate. Sampling and scoring tests should work by comparing our implementations against their counterparts from numpy/scipy.
Most of our existing distributions have most of these, but writing new tests for each new distribution by hand can be time-consuming and error-prone. We should follow webPPL's example closely (see test-samplers, test-scorers, and test-statistics here) and generate our tests automatically given a distribution class, a "type" (e.g. discrete or continuous), a ground-truth sample function, a ground-truth score function, and any other necessary information (e.g. ground-truth summary statistics).
For each distribution, we should have the following tests generated automatically from a configuration file:
Sampler
- Draw samples and ground-truth samples, compare with 2-sample tests
- Use specialized tests (e.g. KS, chi^2, permutation) for distributions that support them (e.g. 1-d distributions or discrete distributions)
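A minimal sketch of the sampler test, assuming a hypothetical `sample_fn`/`ground_truth_fn` pair passed in from the configuration file; here a numpy normal sampler is checked against scipy's, using the two-sample KS test for the 1-d continuous case:

```python
import numpy as np
from scipy import stats

def two_sample_test(sample_fn, ground_truth_fn, n=5000, alpha=1e-3, seed=0):
    """Draw samples from our implementation and from the ground truth,
    then compare them with a two-sample Kolmogorov-Smirnov test."""
    rng = np.random.default_rng(seed)
    ours = sample_fn(rng, n)
    truth = ground_truth_fn(rng, n)
    statistic, p_value = stats.ks_2samp(ours, truth)
    # Fail only on very strong evidence of mismatch.
    return p_value > alpha

# Stand-in "implementation" checked against scipy's normal sampler.
ok = two_sample_test(
    lambda rng, n: rng.normal(0.0, 1.0, size=n),
    lambda rng, n: stats.norm(0.0, 1.0).rvs(size=n, random_state=rng),
)
```

For discrete distributions the same harness would swap in a chi^2 test over observed counts instead of KS.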
Scorer
- Draw ground-truth samples and compare scorer with ground-truth scorer
- Draw samples, compare scorer with ground-truth scorer
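The scorer test can be sketched the same way; `our_normal_log_prob` below is a hypothetical stand-in for a distribution's scorer, compared pointwise against scipy's `logpdf` at ground-truth sample locations:

```python
import numpy as np
from scipy import stats

def our_normal_log_prob(x, loc=0.0, scale=1.0):
    # Hypothetical stand-in for a distribution class's scorer (log density).
    return -0.5 * np.log(2 * np.pi) - np.log(scale) - 0.5 * ((x - loc) / scale) ** 2

rng = np.random.default_rng(0)
xs = stats.norm(0.0, 1.0).rvs(size=100, random_state=rng)  # ground-truth samples
expected = stats.norm(0.0, 1.0).logpdf(xs)                 # ground-truth scorer
np.testing.assert_allclose(our_normal_log_prob(xs), expected, rtol=1e-6)
```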
Moments
- For distributions with analytically available moments, compute ground truth with a 3rd-party library; otherwise, estimate moments from 3rd-party ground-truth samples
- Compare empirical moments to ground-truth analytical or empirical moments
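A sketch of the moments check, using scipy both for the analytical ground truth and as the sampler (the gamma parameters and tolerances are arbitrary choices for illustration; real tolerances should be derived from sample size):

```python
import numpy as np
from scipy import stats

# Analytical ground-truth moments from a 3rd-party library.
dist = stats.gamma(a=3.0, scale=2.0)
mean_true, var_true = dist.stats(moments="mv")  # mean 6.0, variance 12.0

# Empirical moments estimated from samples.
rng = np.random.default_rng(0)
samples = dist.rvs(size=200_000, random_state=rng)
assert abs(samples.mean() - mean_true) < 0.05
assert abs(samples.var() - var_true) < 0.5
```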
Batching/vectorization
- Sampling, scoring, moments: compare batched results against a list of non-batched calls
- Check broadcasting and resizing semantics
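A sketch of both batching checks, using scipy's normal scorer as a stand-in for ours: the batched call must agree elementwise with a list of non-batched calls, and broadcasting must follow numpy semantics:

```python
import numpy as np
from scipy import stats

locs = np.array([0.0, 1.0, -2.0])

# Batched scoring should match a list of non-batched calls.
batched = stats.norm(loc=locs, scale=1.0).logpdf(0.5)
unbatched = np.array([stats.norm(loc=l, scale=1.0).logpdf(0.5) for l in locs])
np.testing.assert_allclose(batched, unbatched)

# Broadcasting: (3,) parameters against (2, 1) values should yield shape (2, 3).
grid = stats.norm(loc=locs, scale=1.0).logpdf(np.array([[0.5], [1.5]]))
assert grid.shape == (2, 3)
```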
Correctness across types
- Some distributions can accept/return values of different types (e.g. Categorical)
- Compare the scorer across types and check that equivalent values receive the same scores
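A sketch of the cross-type check for a Categorical-like distribution, assuming (hypothetically) that it can score both integer codes and one-hot vectors; the two scorers must assign the same log probability to equivalent values:

```python
import numpy as np

probs = np.array([0.2, 0.5, 0.3])
log_probs = np.log(probs)

def score_index(k):
    # Scorer for integer-coded values.
    return log_probs[k]

def score_one_hot(v):
    # Scorer for one-hot-coded values; must agree with the integer scorer.
    return float(v @ log_probs)

for k in range(len(probs)):
    assert np.isclose(score_index(k), score_one_hot(np.eye(len(probs))[k]))
```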