Commit 27f9a9a

GEOWAVE-218
Merge with master documentation
1 parent b830f3f commit 27f9a9a

87 files changed

Lines changed: 5274 additions & 1411 deletions

docs/content/035-adapters.adoc

Lines changed: 16 additions & 0 deletions
@@ -65,3 +65,19 @@ chunk of data. No distributed filtering can be performed on this data except for
 client side filtering extensibility point can still be used if necessary. The Data Adapter has to provide methods to
 serialize and deserialize these items in the form of Field Readers and Writers, but it is not necessary to have these
 methods on the classpath of any Accumulo nodes.
+
+==== Statistics
+
+Adapters provide a set of statistics stored within a statistics store. The set of available statistics is specific to each adapter and
+the set of attributes for those data items managed by the adapter. Statistics include:
+
+* Ranges over an attribute, including time.
+* Enveloping bounding box over all geometries.
+* Cardinality of the number of stored items.
+
+Optional statistics include:
+
+* Histograms over the range of values for an attribute.
+* Cardinality of discrete values of an attribute.
+
+
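
As a rough illustration of the statistic kinds listed in this hunk, the following sketch keeps a range statistic and a count statistic for one attribute. It follows the rule stated in 041-statistics.adoc in this same commit: range statistics are not updated on deletion, while cardinality statistics are. The class and method names (`RangeStatisticSketch`, `CountStatisticSketch`, `onIngest`, `onDelete`) are hypothetical and are not taken from GeoWave's API.

```java
// Hypothetical sketch: none of these classes are GeoWave's actual API.

/** Tracks the widest [min, max] range ever observed for one numeric attribute. */
final class RangeStatisticSketch {
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    /** Widens the range to include a newly ingested value. */
    void onIngest(double value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    /** Deliberately a no-op: the range keeps the largest extent seen over time. */
    void onDelete(double value) {
    }

    /** True until at least one value has been ingested. */
    boolean isEmpty() {
        return min > max;
    }

    double getMin() {
        return min;
    }

    double getMax() {
        return max;
    }
}

/** Tracks how many items are currently stored. */
final class CountStatisticSketch {
    private long count;

    void onIngest() {
        count++;
    }

    /** Unlike the range, the count is adjusted on deletion. */
    void onDelete() {
        if (count > 0) {
            count--;
        }
    }

    long getCount() {
        return count;
    }
}

final class AdapterStatisticsDemo {
    public static void main(String[] args) {
        RangeStatisticSketch elevation = new RangeStatisticSketch();
        CountStatisticSketch items = new CountStatisticSketch();

        for (double value : new double[] {120.0, 35.5, 980.25}) {
            elevation.onIngest(value);
            items.onIngest();
        }

        // Remove the item holding the maximum: the count drops, the range does not.
        elevation.onDelete(980.25);
        items.onDelete();

        System.out.println("count=" + items.getCount());
        System.out.println("range=[" + elevation.getMin() + ", " + elevation.getMax() + "]");
    }
}
```

Because the range can only widen, it may overstate the true extent after deletions, which is one reason a re-computation step is useful.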

docs/content/041-statistics.adoc

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+[[statistics]]
+=== Statistics
+
+Adapters provide a set of statistics stored within a statistics store. The set of available statistics is specific to each adapter and
+the set of attributes for those data items managed by the adapter. Statistics include:
+
+* Ranges over an attribute, including time.
+* Enveloping bounding box over all geometries.
+* Cardinality of the number of stored items.
+* Histograms over the range of values for an attribute.
+* Cardinality of discrete values of an attribute.
+
+Statistics are updated during data ingest and deletion. Range and bounding box statistics reflect the largest range over time.
+Those statistics are not updated during deletion. Cardinality-based statistics are updated upon deletion.
+
+Statistics retain the same visibility constraints as the associated attributes. Thus, there is a set of statistics for each unique constraint.
+The statistics store answers each statistics inquiry for a given adapter with only those statistics matching the authorizations of the requester.
+The statistics store merges authorized statistics covering the same attribute.
+
+image::stats_merge.png[scaledwidth="100%",alt="Statistics Merge"]
+
+==== Statistics Table Structure in Accumulo
+
+image::stats.png[scaledwidth="100%",alt="Statistics Structure"]
+
+===== Re-Computation
+
+Re-computation of statistics is required in three circumstances:
+
+["arabic"]
+. As indexed items are removed from the adapter store, the range and envelope statistics may lose their accuracy if the removed item
+contains an attribute that represents the minimum or maximum value for the population.
+. New statistics are added to the statistics store after data items are ingested. These new statistics do not reflect the entire population.
+. Software changes invalidate prior stored images of statistics.
+
+A simple command-line statistics tool recomputes all statistics for a given adapter. The tool is soon to be replaced by a more comprehensive and efficient tool.
+The tool removes all statistics for the adapter, scans the entire data set, and reconstructs the statistics. The tool is executed within a JVM using any of the assembled JAR files.
+The arguments to the tool are as follows, presented in the exact order required.
+
+* Zookeepers - Formatted as a comma-separated string: zookeeper1:port,zookeeper2:port
+* Accumulo Instance ID - The "instance" that the Accumulo cluster was initialized with.
+* Accumulo Username - The name of the connecting user. This is a user account managed by Accumulo, not a system account.
+* Accumulo Password - This is an Accumulo controlled secret.
+* GeoWave Namespace - This is _not_ an Accumulo namespace; rather, think of it as a prefix GeoWave uses for index table creation.
+* GeoWave Adapter ID - The name of the adapter. This is the local name for the feature name managed by the Feature Data Adapter.
+This name matches the layer name in GeoServer.
+* Authorizations - Ideally, the requesting authorizations should encompass ALL authorizations of the system. The authorizations may be provided in a comma-separated list.
+
+Make sure JAVA_HOME is set prior to invoking the following command.
+
+java -cp /usr/local/geowave/ingest/geowave-ingest-tool.jar mil.nga.giat.geowave.accumulo.util.StatsTool "localhost:12342" "GeoWave" "root" "pAssWord" "test" "GpxTrack" "A,B&C"
+
+
+
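
The visibility rule in this file (one set of statistics per unique constraint, filtered by the requester's authorizations and then merged) might be sketched as follows. This is an illustration under stated assumptions, not GeoWave's implementation: `VisibleRangeSketch`, `StatisticsMergeSketch`, `mergeAuthorized`, and `auths` are hypothetical names, and visibility is simplified to a plain conjunction of authorization tokens, whereas Accumulo visibility expressions also allow `|` and parentheses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: these types are illustrative and are not GeoWave's API.

/** One stored range statistic, tagged with the authorizations needed to read it. */
final class VisibleRangeSketch {
    // Simplification: every token is required (a conjunction).
    // An empty set means the statistic is visible to everyone.
    final Set<String> requiredAuths;
    final double min;
    final double max;

    VisibleRangeSketch(Set<String> requiredAuths, double min, double max) {
        this.requiredAuths = requiredAuths;
        this.min = min;
        this.max = max;
    }
}

final class StatisticsMergeSketch {

    /** Convenience helper for building a set of authorization tokens. */
    static Set<String> auths(String... tokens) {
        return new HashSet<>(Arrays.asList(tokens));
    }

    /**
     * Keeps only the statistics the requester may see, then merges them into
     * one range. Returns an empty Optional when nothing is visible.
     */
    static Optional<VisibleRangeSketch> mergeAuthorized(
            List<VisibleRangeSketch> stored, Set<String> requesterAuths) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        boolean anyVisible = false;
        for (VisibleRangeSketch s : stored) {
            if (requesterAuths.containsAll(s.requiredAuths)) {
                anyVisible = true;
                min = Math.min(min, s.min);
                max = Math.max(max, s.max);
            }
        }
        if (!anyVisible) {
            return Optional.empty();
        }
        return Optional.of(new VisibleRangeSketch(requesterAuths, min, max));
    }

    public static void main(String[] args) {
        List<VisibleRangeSketch> stored = Arrays.asList(
                new VisibleRangeSketch(auths(), 0.0, 10.0),
                new VisibleRangeSketch(auths("A"), -5.0, 3.0),
                new VisibleRangeSketch(auths("B", "C"), 8.0, 20.0));

        // A requester holding only "A" sees the first two entries merged.
        mergeAuthorized(stored, auths("A")).ifPresent(
                r -> System.out.println("[" + r.min + ", " + r.max + "]"));
    }
}
```

Under this model, a requester with fewer authorizations receives a narrower merged range, which is why the tool's Authorizations argument should ideally cover all authorizations in the system.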

docs/content/070-geoserver.adoc

Lines changed: 12 additions & 0 deletions
@@ -40,6 +40,13 @@ As expected with Accumulo, operations on a single feature instances are atomic.
 Lock management supports life-limited locks on feature instances. There are only two supported lock managers: in memory
 and Zookeeper. Memory is suitable for single Geoserver instance installations.

+===== Index Selection
+
+Data written through WFS-T is indexed within a single index. The adapter inspects existing indices, looking for one that matches
+the data requirements. An existing geo-temporal index is chosen for features with temporal attributes. If no suitable index is found,
+the adapter creates a geo-spatial index. A geo-temporal index is never created automatically, regardless of the existence of temporal attributes. Currently,
+geo-temporal indices lead to poor performance for queries requesting vectors over large spans of time.
+
 ==== Authorization Management

 Authorization Management provides the set of credentials compared against the security labels attached to each cell.
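
The Index Selection rule added in this hunk can be read as a small decision procedure. The sketch below is one possible reading, not the adapter's code: `IndexKindSketch`, `IndexSelectionSketch`, and `selectIndex` are hypothetical names, and it assumes that a geo-spatial index is suitable for any feature while a geo-temporal index is suitable only for features with temporal attributes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative and are not GeoWave's API.
enum IndexKindSketch { SPATIAL, SPATIAL_TEMPORAL }

final class IndexSelectionSketch {

    /**
     * Picks the index for a feature type written through WFS-T.
     * Assumption: SPATIAL suits any feature; SPATIAL_TEMPORAL suits only
     * features with temporal attributes.
     */
    static IndexKindSketch selectIndex(
            List<IndexKindSketch> existing, boolean hasTemporalAttributes) {
        // Prefer an existing geo-temporal index for temporal features.
        if (hasTemporalAttributes && existing.contains(IndexKindSketch.SPATIAL_TEMPORAL)) {
            return IndexKindSketch.SPATIAL_TEMPORAL;
        }
        // Otherwise fall back to a geo-spatial index, creating it if missing.
        // A geo-temporal index is never created here.
        if (!existing.contains(IndexKindSketch.SPATIAL)) {
            existing.add(IndexKindSketch.SPATIAL);
        }
        return IndexKindSketch.SPATIAL;
    }

    public static void main(String[] args) {
        List<IndexKindSketch> existing = new ArrayList<>();
        // No indices yet: a geo-spatial index is created even for temporal data.
        System.out.println(selectIndex(existing, true));
        System.out.println(existing);
    }
}
```

Under this reading, a geo-temporal index is used only if someone created it beforehand, which matches the stated caution about its query performance over large time spans.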
@@ -100,3 +107,8 @@ The rule `.*` matches all properties. The more specific rule `geo.*` must be ord

 The system extracts the JSON visibility string from a feature instance property named `GEOWAVE_VISIBILITY`. Selection
 of an alternate property is achieved by setting the associated attribute descriptor 'visibility' to the boolean value TRUE.
+
+==== Statistics
+
+The adapter captures statistics for each numeric, temporal and geo-spatial attribute. Statistics are used to constrain queries and
+answer inquiries by GeoServer for data ranges, as required for map requests and calibration of zoom levels in OpenLayers.

docs/content/images/stats.png

28.9 KB
37 KB

geowave-accumulo/pom.xml

Lines changed: 0 additions & 1 deletion
@@ -50,7 +50,6 @@
         <dependency>
             <groupId>com.google.code.findbugs</groupId>
             <artifactId>annotations</artifactId>
-            <version>${findbugs.version}</version>
         </dependency>
     </dependencies>
     <build>
