Implement KDE and Raster Resize in Spark #1566
Conversation
```
@@ -0,0 +1,49 @@
package org.locationtech.geowave.analytic.kryo;
    storeOptions.createInternalAdapterStore(),
    storeOptions.createIndexStore());
}
if (((minSplits != null) && (minSplits > -1)) || ((maxSplits != null) && (maxSplits > -1))) {
```
Should this be && instead of ||, or is it fine to specify only minSplits or maxSplits?
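For illustration, here is a hypothetical standalone sketch of the guard from the diff above (the class and method names are made up for this example). With `||`, configuring either `minSplits` or `maxSplits` alone is enough to take that branch; `&&` would require both to be set:

```java
public class SplitGuard {
  // Mirrors the condition in the diff: true when at least one of the
  // two split settings is present and non-negative.
  static boolean useCustomSplits(final Integer minSplits, final Integer maxSplits) {
    return ((minSplits != null) && (minSplits > -1))
        || ((maxSplits != null) && (maxSplits > -1));
  }

  public static void main(final String[] args) {
    System.out.println(useCustomSplits(4, null));    // only minSplits set -> true
    System.out.println(useCustomSplits(null, 16));   // only maxSplits set -> true
    System.out.println(useCustomSplits(null, null)); // neither set -> false
  }
}
```

So the `||` reading is consistent with allowing a user to specify only one of the two options.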
```
@@ -0,0 +1,33 @@
package org.locationtech.geowave.analytic.spark;
    final RasterDataAdapter adapter,
    final JavaRDD<GridCoverage> inputRDD) throws IOException {
writeRasterToGeoWave(sc, index, outputStoreOptions, adapter, inputRDD);
```
Is there a need for the private method to exist? It has the same signature as the public one.
Good catch. I think this went through several revisions, and the extraneous method that was left over makes no sense.
```
@@ -0,0 +1,734 @@
package org.locationtech.geowave.analytic.spark.kde;
@@ -0,0 +1,152 @@
package org.locationtech.geowave.analytic.spark.kde.operations;
@@ -0,0 +1,248 @@
package org.locationtech.geowave.analytic.spark.resize;
@@ -0,0 +1,67 @@
package org.locationtech.geowave.core.store.cli.remote;
@@ -0,0 +1,37 @@
package org.locationtech.geowave.core.store.entities;
buffer.put(vis2);
buffer.put(END_AND_BYTE);
buffer.put(CLOSE_PAREN_BYTE);
return buffer.array();
```
Are we worried about something like vis1 being (a)&(b) as a result of a previous merge and vis2 being a? After a few calls passing a as vis2 with the result of the previous call as vis1, you'd end up with something like ((((a)&(b))&(a))&(a))&(a). Perhaps it would be best to implement the merging logic in the VisibilityExpression class and only merge when vis2 is not already satisfied by vis1.
That's a fair point. I actually just fixed the serialization, which was blatantly a buffer overflow in the previous implementation from under-allocating the ByteBuffer, but there's definitely room for improvement here. I think your suggestion is good; I'll fix it.
After thinking a little more, there is a complication: your suggestion works if one of the visibilities is a single token, because you can basically treat it as an authorization to evaluate, but it doesn't extend to the case where both visibilities are boolean logic expressions.
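To make the single-token case concrete, here is a hypothetical sketch (the class and method names are invented; this is not GeoWave's actual VisibilityExpression API). It skips the merge when vis2 is one bare token that already appears as a top-level AND conjunct of vis1, and, as noted above, deliberately does not attempt the general case of two arbitrary boolean expressions:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class VisibilityMergeGuard {
  // Merge two visibility expressions, avoiding redundant growth when vis2
  // is a single token already satisfied by vis1's AND conjuncts.
  static String merge(final String vis1, final String vis2) {
    if ((vis1 == null) || vis1.isEmpty()) {
      return vis2;
    }
    if ((vis2 == null) || vis2.isEmpty()) {
      return vis1;
    }
    // Collect vis1's top-level AND conjuncts, stripping wrapping parens.
    // This is only valid for purely AND-joined expressions.
    final Set<String> conjuncts = new LinkedHashSet<>();
    for (final String part : vis1.split("&")) {
      conjuncts.add(part.replaceAll("[()]", ""));
    }
    // Single-token vis2 already satisfied by vis1: keep vis1 unchanged
    if (vis2.matches("\\w+") && conjuncts.contains(vis2)) {
      return vis1;
    }
    // Otherwise fall back to the existing AND-wrapping behavior
    return "(" + vis1 + ")&(" + vis2 + ")";
  }

  public static void main(final String[] args) {
    System.out.println(merge("(a)&(b)", "a")); // stays (a)&(b), no growth
    System.out.println(merge("a", "b"));       // (a)&(b)
  }
}
```

Repeatedly merging a into (a)&(b) now leaves the expression unchanged instead of producing ((((a)&(b))&(a))&(a))&(a).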
I think the correct way of handling this is to always normalize all visibility expressions, following some consistent and repeatable criteria. Accumulo has a pretty good way of normalizing expressions: https://github.com/apache/accumulo/blob/2b1c4d008a7e868f0cb54f79c7b1613685b98921/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java#L217-L247
I'd say we normalize expressions as they come in, before they are stored, and again after they are merged. That would seem to solve the problem both in the easy case where one visibility is a single token and in the more complex generic case.
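As a rough illustration of the normalization idea, here is a deliberately simplified sketch that only handles purely AND-joined token expressions: flatten to tokens, deduplicate, sort, and re-join. Accumulo's ColumnVisibility (linked above) does the equivalent over a full AND/OR parse tree, which is what a real implementation would need:

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class VisibilityNormalizer {
  // Normalize an AND-only visibility expression into a canonical form:
  // sorted, deduplicated tokens joined by '&'. Hypothetical helper, not
  // part of GeoWave; nested OR expressions are out of scope here.
  static String normalizeAnd(final String expression) {
    final SortedSet<String> tokens = new TreeSet<>();
    for (final String part : expression.split("&")) {
      tokens.add(part.replaceAll("[()]", ""));
    }
    return String.join("&", tokens);
  }

  public static void main(final String[] args) {
    // The redundant result of the repeated merges above collapses back down
    System.out.println(normalizeAnd("((((a)&(b))&(a))&(a))&(a)")); // a&b
  }
}
```

Running this after every merge would keep stored expressions canonical, so equality checks and "already satisfied" tests become simple string comparisons.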
However, I'll happily say that I think this one is outside the scope of this particular PR and is better tracked as a separate issue. I created #1574.