Skip to content

Implement KDE and Raster Resize in Spark#1566

Merged
rfecher merged 22 commits intolocationtech:masterfrom
rfecher:sparkkde
Jun 7, 2019
Merged

Implement KDE and Raster Resize in Spark#1566
rfecher merged 22 commits intolocationtech:masterfrom
rfecher:sparkkde

Conversation

@rfecher
Copy link
Copy Markdown
Contributor

@rfecher rfecher commented May 24, 2019

No description provided.

@coveralls
Copy link
Copy Markdown

coveralls commented May 24, 2019

Coverage Status

Coverage increased (+2.003%) to 48.089% when pulling 80abdd6 on rfecher:sparkkde into 175b5e1 on locationtech:master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-6.6%) to 39.533% when pulling dff67c8 on rfecher:sparkkde into 175b5e1 on locationtech:master.

@@ -0,0 +1,49 @@
package org.locationtech.geowave.analytic.kryo;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

storeOptions.createInternalAdapterStore(),
storeOptions.createIndexStore());
}
if (((minSplits != null) && (minSplits > -1)) || ((maxSplits != null) && (maxSplits > -1))) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be && instead of ||, or is it fine to specify only minSplits or maxSplits?

@@ -0,0 +1,33 @@
package org.locationtech.geowave.analytic.spark;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

final RasterDataAdapter adapter,
final JavaRDD<GridCoverage> inputRDD) throws IOException {

writeRasterToGeoWave(sc, index, outputStoreOptions, adapter, inputRDD);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need for the private method to exist? It has the same signature as the public one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I think this went through several revisions and with what was left the extraneous method makes no sense

@@ -0,0 +1,734 @@
package org.locationtech.geowave.analytic.spark.kde;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

@@ -0,0 +1,152 @@
package org.locationtech.geowave.analytic.spark.kde.operations;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

@@ -0,0 +1,248 @@
package org.locationtech.geowave.analytic.spark.resize;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

@@ -0,0 +1,67 @@
package org.locationtech.geowave.core.store.cli.remote;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

@@ -0,0 +1,37 @@
package org.locationtech.geowave.core.store.entities;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing header.

buffer.put(vis2);
buffer.put(END_AND_BYTE);
buffer.put(CLOSE_PAREN_BYTE);
return buffer.array();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we worried about something like vis1 being (a)&(b) as a result of a previous merge and vis2 being a? After a few calls of passing a as vis2 with the result of the previous call as vis1, you'd end up with something like ((((a)&(b))&(a))&(a))&(a). Perhaps it would be best to implement the merging logic into the VisibilityExpression class and only merge when vis2 is not already satisfied by vis1.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats a fair point - I actually just fixed the serialization from what was blatantly a buffer overflow in the previous implementation under-allocating the bytebuffer, but there's definitely room for improvement here. I think your suggestion is good, I'll fix it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after thinking a little more, there is a complication ... your suggestion would work if you can find one of the visibilities to be a single token so you can basically treat it as an authorization to evaluate, but doesn't extend to the case where both visibilities are boolean logic expressions

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the correct way of handling this is to always normalize all visibility expressions all the time following some consistent and repeatable criteria. Accumulo I think has a pretty good way of normalizing expressions: https://github.com/apache/accumulo/blob/2b1c4d008a7e868f0cb54f79c7b1613685b98921/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java#L217-L247

I'd say we normalize expressions as they come in and before they are stored, and normalize expressions after they are merged. That would seem to solve the problem here in both the easy case where you can find one visibility thats a single token as well as the more complex generic case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however, I'll happily say that I think this one is outside of the scope of this particular PR and suggest it as a separate issue. I created #1574

@rfecher rfecher merged commit a9830bb into locationtech:master Jun 7, 2019
@rfecher rfecher deleted the sparkkde branch June 7, 2019 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants