Add Kudu datastore implementation #1545

Merged
rfecher merged 29 commits into locationtech:master from ChesleyTan:kudu-master
May 6, 2019
Contributor

@dannyqiu dannyqiu commented Apr 26, 2019

This PR adds an implementation of the Kudu datastore extension.

Contributor

@rfecher rfecher left a comment

just a couple minor comments

Comment thread .travis.yml Outdated
# - NAME='Accumulo Server IT on Latest CDH Versions' MAVEN_PROFILES='accumulo-it-server,cloudera' BUILD_DOCS=false IT_ONLY=true
- NAME='Accumulo Server IT on Latest HDP Versions' MAVEN_PROFILES='accumulo-it-server,hortonworks' BUILD_DOCS=false IT_ONLY=true
- NAME='HBase Server IT on Latest CDH Versions' MAVEN_PROFILES='hbase-it-server,cloudera' BUILD_DOCS=false IT_ONLY=true
# - NAME='HBase Server IT on Latest CDH Versions' MAVEN_PROFILES='hbase-it-server,cloudera' BUILD_DOCS=false IT_ONLY=true
Contributor

is there a new issue with CDH?

Contributor Author

In the new xenial build environment, CDH hangs indefinitely.

Contributor Author

We added these back to the build matrix and made the CDH tests explicitly depend on the precise environment with oraclejdk8.

Comment thread extensions/datastores/kudu/pom.xml Outdated
ChesleyTan and others added 28 commits May 3, 2019 13:06
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
Signed-off-by: carolyntang <carolyntang1129@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
Signed-off-by: carolyntang <carolyntang1129@gmail.com>
Signed-off-by: cyzhan1118 <cz336@cornell.edu>
Signed-off-by: foolhb <hanbo1018@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
Signed-off-by: Danny Qiu <dqiu55@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
If the watchdog is initialized too early, it will hang if we attempt to
check the status of it, since it will call wait() to allow a process to
start. In the case we check for a process running, the watchdog will
hang because it did not start in the first place.
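
One way to avoid the hang described in this commit message is to create the watchdog only when a process is actually launched, and to answer status checks without touching it before then. The sketch below is illustrative only: `Watchdog` and `LocalServerRunner` are hypothetical names, not classes from this PR, and the real process launch is left out.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a watchdog whose status check may block until a process starts. */
interface Watchdog {
  boolean isWatching();
}

/** Hypothetical runner that defers watchdog creation until start() is called. */
final class LocalServerRunner {
  private final Supplier<Watchdog> watchdogFactory;
  private Watchdog watchdog; // stays null until start() is called

  LocalServerRunner(Supplier<Watchdog> watchdogFactory) {
    this.watchdogFactory = watchdogFactory;
  }

  synchronized void start() {
    // Create the watchdog only when a process is about to be launched.
    watchdog = watchdogFactory.get();
    // Launching the process and attaching the watchdog would happen here.
  }

  synchronized boolean isRunning() {
    // If start() was never called there is no watchdog to query,
    // so the check returns immediately instead of waiting on it.
    return watchdog != null && watchdog.isWatching();
  }
}
```

With this shape, calling `isRunning()` before `start()` returns `false` right away rather than waiting for a process that was never launched.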

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
* Implement MetadataDeleter

* format

* Modify on primary id condition

Signed-off-by: carolyntang <carolyntang1129@gmail.com>
- Delete from data index using data ID as partition key
- Use scanner to perform deletions to support deleting using only a
subset of the primary key columns. This is inefficient and will need to
be optimized later.

Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
Add kudu metadata writer and reader;
Set default number of kudu replicas and partition buckets;
Fix the bug in kudu writer, use result of the row transformer

Signed-off-by: foolhb <hanbo1018@gmail.com>
Signed-off-by: cyzhan1118 <cz336@cornell.edu>
* Enable rest of GeoWave basic IT

* Destroy kudu db files on tear down, create separate watchdogs

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
* Adjust tile size in basic ITs to accommodate kudu cell size limitations

* Remove value column from row deletion predicates

* Remove outdated kudu configuration options

* Modify KuduOptions and KuduRequiredOptions, enable visibility

Signed-off-by: foolhb <hanbo1018@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
* Use openjdk8

* Attempt to speed up tests by caching downloads

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
* Fix bug by writing only to data index columns

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
Signed-off-by: cyzhan1118 <cz336@cornell.edu>
* Add back default max range decomposition

* Extract executeQuery logic and check for null nextRows

* Close Kudu session after deletion

* Use client pool to avoid creating multiple clients

* Use namespaced name for index and metadata tables
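
The client pool mentioned in this commit can be pictured as a cache keyed by connection string, so that repeated requests for the same cluster share one client. This is a minimal sketch under that assumption, not the PR's implementation: `ClientPool`, `getClient`, and the type parameter `C` (standing in for the real client type) are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical pool that caches one client per connection string. */
final class ClientPool<C> {
  private final ConcurrentMap<String, C> clients = new ConcurrentHashMap<>();
  private final Function<String, C> factory;

  ClientPool(Function<String, C> factory) {
    this.factory = factory;
  }

  C getClient(String masterAddresses) {
    // computeIfAbsent invokes the factory at most once per key,
    // even when several threads ask for the same address concurrently.
    return clients.computeIfAbsent(masterAddresses, factory);
  }
}
```

Two calls to `getClient` with the same address return the same instance, while a different address gets its own client.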

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
* Add config service command for Kudu

* Update GeoServerIngestIT reference image for OpenJDK 8

* Decrease tile size for CustomCRSLandsatIT
Kudu's maximum cell size limits the tile size we can use for this test
to 64

Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
…ry ID (#25)

* Add support for data index queries in Kudu

* Add support for prefix matching on stats primary ID
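
Prefix matching on a key is commonly expressed as a range scan: every key that starts with the prefix lies between the prefix itself (inclusive) and the smallest byte string greater than all such keys (exclusive). The sketch below shows that general technique, assuming byte-array keys compared as unsigned bytes; `PrefixRange` and its methods are hypothetical and not taken from this PR.

```java
import java.util.Arrays;

/** Hypothetical helper that turns a byte prefix into a [lower, upper) key range. */
final class PrefixRange {
  /** Returns the exclusive upper bound for the prefix, or null if the range is unbounded. */
  static byte[] exclusiveUpperBound(byte[] prefix) {
    int end = prefix.length;
    // Trailing 0xFF bytes cannot be incremented, so drop them first.
    while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
      end--;
    }
    if (end == 0) {
      return null; // empty or all-0xFF prefix: no finite upper bound
    }
    byte[] upper = Arrays.copyOf(prefix, end);
    upper[end - 1]++; // increment the last remaining byte
    return upper;
  }

  /** Lexicographic comparison treating each byte as unsigned. */
  static int compareUnsigned(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.length, b.length);
  }

  /** True when key falls inside the range implied by the prefix. */
  static boolean matches(byte[] prefix, byte[] key) {
    byte[] upper = exclusiveUpperBound(prefix);
    return compareUnsigned(key, prefix) >= 0
        && (upper == null || compareUnsigned(key, upper) < 0);
  }
}
```

In a scan, the prefix and the computed upper bound would be supplied as the lower and upper limits on the key column; `matches` only demonstrates that the two bounds select exactly the keys sharing the prefix.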

Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
* add batch handler class and batch write

* fix error in kudu operations for batch write

* remove BatchWrite, add setAutoFlush() to KuduWriter

* check pending error for batch write

* call change mode only once

* fix error print

Signed-off-by: carolyntang <carolyntang1129@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
Signed-off-by: Chesley Tan <chesleytan97@gmail.com>
* Specify datastore client versions in top level pom.xml

* Update .travis.yml to run cdh tests in precise environment

* Add reference images for Oracle JDK 8

* Use oraclejdk8 in precise environment, condition IngestIT on JDK implementation

Signed-off-by: Danny Qiu <dqiu55@gmail.com>
@rfecher rfecher merged commit f13cb91 into locationtech:master May 6, 2019
6 participants