Implemented a Compound Index Strategy by meislerj · Pull Request #445 · locationtech/geowave

meislerj · 2015-06-30T23:29:04Z

Implemented CompoundIndexStrategy. For now it allows the
composition of two NumericIndexStrategy objects into a
single NumericIndexStrategy. This can easily be modified
to support N sub-indices. The number of dimensions of the
CompoundIndexStrategy is the sum of the dimensions of
the sub-indices. The compound index places the bytes
from the first sub-index before any of the bytes from the
second. Externally, the compound index can be treated like
any other multi-dimensional index.
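The byte layout described above can be sketched as follows. This is a minimal illustration, not the code from the pull request: `CompoundIdSketch` and `composeId` are hypothetical names, and the sketch assumes each sub-index has already produced its id as a plain byte array.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of GeoWave: it only illustrates the byte layout.
final class CompoundIdSketch {

    /**
     * Builds a compound id by writing all bytes of the first sub-index id,
     * followed by all bytes of the second sub-index id.
     */
    static byte[] composeId(byte[] firstId, byte[] secondId) {
        return ByteBuffer.allocate(firstId.length + secondId.length)
                .put(firstId)   // bytes from the first sub-index come first
                .put(secondId)  // bytes from the second sub-index follow
                .array();
    }
}
```

Because the first sub-index occupies the leading bytes, compound ids that share a first sub-index id also share a common byte prefix, so they sort next to each other in a byte-ordered key space.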

chrisbennight · 2015-07-01T01:09:43Z

Can you add a comment to the top of the class that explains what it does? What you have in the description of the pull request would probably suffice :)

meislerj · 2015-07-01T01:19:43Z

No problem

rfecher · 2015-07-02T21:09:57Z

-1 (or anything less than 0) is a special value used to mean "no max". This is something we should have documented in the javadoc of the interface, but I see it being used here: https://github.com/ngageoint/geowave/blob/master/core/index/src/main/java/mil/nga/giat/geowave/core/index/sfc/tiered/TieredSFCIndexStrategy.java#L43

Also, I'd argue that if the argument is 'maxEstimatedRangeDecomposition', you'd take the floor instead of the ceiling, to try to keep the result less than the max rather than more than the max, but I could see arguments either way I suppose.

meislerj · 2015-07-06

Regarding floor vs ceiling: I thought about that also. If you have a compound index strategy with 3 dimensions, the default maxEstimatedRangeDecomposition is 8 (2^d). The way this is currently done, getQueryRanges(...) for each sub-index gets called with the square root of the maxEstimatedRangeDecomposition, which is about 2.83 here. Using the floor would yield ~4 ranges (2 x 2, depending on how closely each sub-index respects the parameter) instead of ~9 ranges (3 x 3) with the ceiling. It's a tradeoff between giving too coarse a coverage of the indexedRange and returning too many ranges. Since the javadoc said that the parameter is more of a suggestion than a strict limit, I chose ceiling. I'm happy to switch it if you feel it makes more sense.
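The arithmetic in this reply can be checked with a small sketch. The class and method names are hypothetical and only reproduce the calculation discussed here (square root of the maximum, rounded down or up, for each of two sub-indexes); they are not taken from the pull request.

```java
// Hypothetical helper, not part of GeoWave: reproduces the floor vs ceiling arithmetic.
final class RangeBudgetSketch {

    /** Per-sub-index budget: the square root of the overall maximum, rounded down or up. */
    static int perStrategyBudget(int maxEstimatedRangeDecomposition, boolean useCeiling) {
        double root = Math.sqrt(maxEstimatedRangeDecomposition);
        return (int) (useCeiling ? Math.ceil(root) : Math.floor(root));
    }

    /** Approximate total for two sub-indexes if each one returns exactly its budget. */
    static int estimatedTotal(int maxEstimatedRangeDecomposition, boolean useCeiling) {
        int budget = perStrategyBudget(maxEstimatedRangeDecomposition, useCeiling);
        return budget * budget;
    }
}
```

For a maximum of 8, `estimatedTotal(8, false)` gives 4 and `estimatedTotal(8, true)` gives 9, matching the ~4 versus ~9 ranges mentioned in the comment.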

rfecher · 2015-07-02T21:25:10Z

I made a few comments, but really I think the only significant one is watching out for negatives on the max query range. We didn't document our use of -1 as a pass-through for "no max query ranges" very well, but it's used elsewhere, particularly to re-use code between the 2 getQueryRange methods when there is no capped maximum query range. So although it's not directly in the call chain that I think would affect this CompoundIndexStrategy (it's definitely in the tierednumericindexstrategy at least), I'd try to be consistent with its handling of negative values (and we really should have had it in the javadoc of the interface).
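A minimal sketch of why the negative case needs an explicit guard when the per-sub-index budget is derived from a square root. The names are hypothetical, and clamping the budget to at least 1 is an assumption made for illustration rather than something stated in the review.

```java
// Hypothetical helper, not part of GeoWave: shows a guard for the "no max" convention.
final class MaxRangeSketch {

    /** By the convention described above, any negative value means "no max". */
    static boolean isUnbounded(int maxEstimatedRangeDecomposition) {
        return maxEstimatedRangeDecomposition < 0;
    }

    /**
     * Per-sub-index budget. A negative maximum is passed through unchanged so that
     * each sub-index also sees "no max". Without the guard, Math.sqrt of a negative
     * number is NaN, and casting NaN to int yields 0.
     */
    static int perStrategyBudget(int maxEstimatedRangeDecomposition) {
        if (isUnbounded(maxEstimatedRangeDecomposition)) {
            return -1;
        }
        int budget = (int) Math.ceil(Math.sqrt(maxEstimatedRangeDecomposition));
        return Math.max(1, budget); // assumption: never ask a sub-index for fewer than 1 range
    }
}
```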

getQueryRanges(final MultiDimensionalNumericData indexedRange, final int maxEstimatedRangeDecomposition) now handles negative values for maxEstimatedRangeDecomposition by performing a full decomposition. Both getMaxQueryRanges(...) and getInsertionIds(...) take the actual number of results from the call to the first sub-strategy into account when calling the second strategy. For example, if the first sub-strategy returned a single range, the second sub-strategy can use the full maxRangeDecomposition (since the product of the two will be maxRangeDecomposition). getQueryRanges(...) uses the ceiling for computing the maximum estimated range decomposition per strategy since the parameter is not strictly enforced. getInsertionIds(...) uses the floor for each strategy to reduce replication.
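One way to read the budgeting described in this commit message is sketched below. It is an interpretation under stated assumptions, not the merged code: `AdaptiveBudgetSketch` and `secondBudget` are hypothetical names, the second sub-strategy's budget is assumed to be the overall maximum divided by the number of results the first sub-strategy actually returned, and the handling of an empty first result and the clamp to 1 are illustrative choices.

```java
// Hypothetical helper, not part of GeoWave: budget for the second sub-strategy.
final class AdaptiveBudgetSketch {

    /**
     * @param max              overall maximum; negative means "no max"
     * @param firstResultCount number of ranges or ids the first sub-strategy returned
     * @param useCeiling       true for the query-range case, false for the insertion-id case
     */
    static int secondBudget(int max, int firstResultCount, boolean useCeiling) {
        if (max < 0) {
            return -1; // pass the "no max" convention through unchanged
        }
        if (firstResultCount <= 0) {
            return max; // assumption: nothing to divide by, so hand over the full budget
        }
        double share = (double) max / firstResultCount;
        int budget = (int) (useCeiling ? Math.ceil(share) : Math.floor(share));
        return Math.max(1, budget); // assumption: always allow at least one result
    }
}
```

With a maximum of 8, a first sub-strategy that returned a single range leaves the second with the full 8, while a first result of 3 leaves it with 3 when rounding up and 2 when rounding down.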

rfecher reviewed Jul 2, 2015
meislerj added 3 commits July 7, 2015 19:10

added javadocs for class and all public methods

c2a29eb

meislerj force-pushed the compound-idx branch from e5ca5ab to d653e40 Compare July 7, 2015 23:47

rfecher added a commit that referenced this pull request Jul 8, 2015

Merge pull request #445 from ngageoint/compound-idx

5b106fd

rfecher merged commit 5b106fd into master Jul 8, 2015

rfecher deleted the compound-idx branch July 8, 2015 14:15
