Conversation
|
Can you add a comment to the top of the class that explains what it does? What you have in the description of the pull request would probably suffice :) |
|
No problem On Tue, Jun 30, 2015 at 9:09 PM, Chris Bennight notifications@github.com
|
There was a problem hiding this comment.
-1 is a special value that is used (anything less than 0) to mean "no max" ...this is something we should have documented in our javadoc of the interface but I see it being used here:https://github.com/ngageoint/geowave/blob/master/core/index/src/main/java/mil/nga/giat/geowave/core/index/sfc/tiered/TieredSFCIndexStrategy.java#L43
There was a problem hiding this comment.
also, I'd argue if the argument is 'maxEstimatedRangeDecomposition' that you'd take the floor instead of the ceiling to try to keep the result less than max rather than more than max, but I could see arguments either way I suppose
There was a problem hiding this comment.
Regarding floor vs ceiling: I thought about that also. It's a tradeoff. If you have a compound index strategy with 3 dimensions, the default maxEstimatedRangeDecomposition is 8 (2^d). The way this is currently done, getQueryRanges(...) for each sub-index gets called with the square root of the maxEstimatedRangeDecomposition. Using the floor would yield ~4 ranges (depending on how closely each sub-index respects the parameter) instead of ~9 ranges. It's a tradeoff between giving too coarse coverage of the indexedRange or returning too many ranges. Since the javadoc said that the parameter is more of a suggestion than a strict limit, I chose ceiling. I'm happy to switch it if you feel it makes more sense.
|
I made a few comments but really I think the only significant one is watching out for negatives on the max query range. We didn't really well document our use of -1 as a pass-through for no max query ranges, but its used elsewhere particularly to re-use code between the 2 getQueryRange methods to not use a capped maximum query range. So although its not directly in the call chain I think that would affect this CompoundIndexStrategy (its definitely in the tierednumericindexstrategy at least), I'd try to be consistent with its handling of negative values (and we really should have had it in the javadoc of the interface). |
Implemented CompoundIndexStrategy. For now it allows the composition of two NumericIndexStrategy objects into a single NumericIndexStrategy. This can easily be modified to support N sub-indeces. The dimensions of the CompoundIndexStrategy are the sum of the dimensions of the sub-indeces. The compound index prefixes the bytes from the first sub-index before any of the bytes from the second. Externally, the compound index can be treated like any other multi-dimensional index.
getQueryRanges(final MultiDimensionalNumericData indexedRange ,final int maxEstimatedRangeDecomposition) now handles negative values for maxEstimatedRangeDecomposition by performing a full decomposition. Both getMaxQueryRanges(...) and getInsertionIds(...) take the actual number of results from the call to the first sub-strategy into account when calling the second strategy. For example, if the first sub-strategy returned a single range, the second sub-strategy can use the full maxRangeDecomposition (since the product of the two will be maxRangeDecomposition). getQueryRanges(...) uses the ceiling for computing the maximum estimated range decomposition per strategy since the parameter is not strictly enforced. getInsertionIds(...) uses the floor for each strategy to reduce replication.
Implemented a Compound Index Strategy
Implemented CompoundIndexStrategy. For now it allows the
composition of two NumericIndexStrategy objects into a
single NumericIndexStrategy. This can easily be modified
to support N sub-indeces. The dimensions of the
CompoundIndexStrategy are the sum of the dimensions of
the sub-indeces. The compound index prefixes the bytes
from the first sub-index before any of the bytes from the
second. Externally, the compound index can be treated like
any other multi-dimensional index.