Right now, statistics are computed only on ingest. Consider the use case of building a new statistic and deploying it against data that has already been ingested. We should be able to run a MapReduce or other distributed job that scans the entire data set and computes selected statistics. The method should accept multiple statistics instances and compute them all in a single pass.
Existing statistics should also support re-compute and remove operations.
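
As a rough illustration of the single-pass idea, here is a minimal sketch in Python. It assumes each statistic can be updated one entry at a time and merged with a partial result from another partition. The names `Statistic`, `CountStatistic`, `RangeStatistic`, `scan_partition`, and `compute_statistics` are hypothetical and are not part of any existing API; the in-memory lists stand in for the splits a real distributed job would scan.

```python
from abc import ABC, abstractmethod
from copy import deepcopy
from functools import reduce


class Statistic(ABC):
    """Hypothetical interface: updatable per entry, mergeable across partitions."""

    @abstractmethod
    def add(self, entry):
        """Fold a single entry into this statistic."""

    @abstractmethod
    def merge(self, other):
        """Fold another partial result of the same type into this one."""


class CountStatistic(Statistic):
    def __init__(self):
        self.count = 0

    def add(self, entry):
        self.count += 1

    def merge(self, other):
        self.count += other.count


class RangeStatistic(Statistic):
    def __init__(self):
        self.min = None
        self.max = None

    def add(self, entry):
        self.min = entry if self.min is None else min(self.min, entry)
        self.max = entry if self.max is None else max(self.max, entry)

    def merge(self, other):
        # An empty partial (no entries seen) contributes nothing.
        if other.min is not None:
            self.add(other.min)
            self.add(other.max)


def scan_partition(entries, prototypes):
    """Map step: one pass over a partition, feeding each entry to every statistic."""
    stats = [deepcopy(p) for p in prototypes]
    for entry in entries:
        for stat in stats:
            stat.add(entry)
    return stats


def merge_partials(left, right):
    """Reduce step: merge two lists of partial statistics position by position."""
    for a, b in zip(left, right):
        a.merge(b)
    return left


def compute_statistics(partitions, prototypes):
    """Scan every partition once and return one merged statistic per prototype."""
    partials = [scan_partition(p, prototypes) for p in partitions]
    return reduce(merge_partials, partials, [deepcopy(p) for p in prototypes])


# Three partitions stand in for the splits of a distributed scan.
partitions = [[3, 1, 4], [1, 5], []]
count, value_range = compute_statistics(
    partitions, [CountStatistic(), RangeStatistic()]
)
```

Under this sketch, a re-compute could amount to running the same job for an existing statistic and overwriting its stored value, and a remove could amount to deleting the stored value; both depend on how statistics are persisted, which is not specified here.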