gdal vector clean-coverage

Added in version 3.12.

Adjust the boundaries of a polygonal dataset, removing gaps and overlaps.

Synopsis

Usage: gdal vector clean-coverage [OPTIONS] <INPUT> <OUTPUT>

Alter polygon boundaries to make shared edges identical, removing gaps and overlaps

Positional arguments:
  -i, --input <INPUT>                                  Input vector datasets [required] [not available in pipelines]
  -o, --output <OUTPUT>                                Output vector dataset [required] [not available in pipelines]

Common Options:
  -h, --help                                           Display help message and exit
  --json-usage                                         Display usage as JSON document and exit
  --config <KEY>=<VALUE>                               Configuration option [may be repeated]
  -q, --quiet                                          Quiet mode (no progress bar or warning message) [not available in pipelines]

Options:
  -l, --layer, --input-layer <INPUT-LAYER>             Input layer name(s) [may be repeated] [not available in pipelines]
  -f, --of, --format, --output-format <OUTPUT-FORMAT>  Output format ("GDALG" allowed) [not available in pipelines]
  --co, --creation-option <KEY>=<VALUE>                Creation option [may be repeated] [not available in pipelines]
  --lco, --layer-creation-option <KEY>=<VALUE>         Layer creation option [may be repeated] [not available in pipelines]
  --overwrite                                          Whether overwriting existing output dataset is allowed [not available in pipelines]
  --update                                             Whether to open existing dataset in update mode [not available in pipelines]
  --overwrite-layer                                    Whether overwriting existing output layer is allowed [not available in pipelines]
  --append                                             Whether appending to existing layer is allowed [not available in pipelines]
                                                       Mutually exclusive with --upsert
  --output-layer <OUTPUT-LAYER>                        Output layer name [not available in pipelines]
  --skip-errors                                        Skip errors when writing features [not available in pipelines]
  --active-layer <ACTIVE-LAYER>                        Set active layer (if not specified, all)
  --snapping-distance <SNAPPING-DISTANCE>              Distance tolerance for snapping nodes
  --merge-strategy <MERGE-STRATEGY>                    Algorithm to assign overlaps to neighboring polygons. MERGE-STRATEGY=longest-border|max-area|min-area|min-index
  --maximum-gap-width <MAXIMUM-GAP-WIDTH>              Maximum width of a gap to be closed

Advanced Options:
  --if, --input-format <INPUT-FORMAT>                  Input formats [may be repeated] [not available in pipelines]
  --oo, --open-option <KEY>=<VALUE>                    Open options [may be repeated] [not available in pipelines]
  --output-oo, --output-open-option <KEY>=<VALUE>      Output open options [may be repeated] [not available in pipelines]
  --upsert                                             Upsert features (implies 'append') [not available in pipelines]
                                                       Mutually exclusive with --append

Description

gdal vector clean-coverage modifies boundaries of a polygonal dataset, such that gaps and overlaps between features are removed and shared edges are defined using the same vertices. The resulting dataset will form a polygonal coverage that can be used with gdal vector simplify-coverage.

This command can also be used as a step of gdal vector pipeline, although it requires loading the entire dataset into memory at once.

Note

This command requires GDAL to be built against the GEOS library (version 3.14 or greater).

GDALG output (on-the-fly / streamed dataset)

This program supports serializing the command line as a JSON file using the GDALG output format. The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly / streamed way.

Note

However, this algorithm is not natively streaming compatible. Consequently, an in-memory temporary dataset will be generated, which may cause significant processing time when the file is opened.
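
As a rough sketch of this workflow, the commands below first serialize the cleaning step to a GDALG file and then open that file with gdal vector info. The names parcels.gpkg and parcels_clean.gdalg.json are hypothetical, and the .gdalg.json extension is assumed here to be the one recognized by the GDALG driver.

```shell
# Hypothetical input dataset and GDALG file name.
input=parcels.gpkg
algorithm=parcels_clean.gdalg.json

# Run only when the (hypothetical) input dataset is present.
if [ -f "$input" ]; then
    # Serialize the command line; no cleaning is performed at this point.
    gdal vector clean-coverage --output-format GDALG "$input" "$algorithm"

    # Opening the GDALG file applies the algorithm. Because clean-coverage
    # is not natively streaming compatible, this step builds the in-memory
    # temporary dataset and may take a while.
    gdal vector info "$algorithm"
fi
```

The cost of the cleaning step is paid each time the GDALG file is opened, so writing to a regular output format may be preferable when the result is read repeatedly.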

Program-Specific Options

--input-layer

Specifies the name of the layer to process. By default, all layers will be processed.

--maximum-gap-width <MAXIMUM-GAP-WIDTH>

Defines the largest opening that should be considered a "gap" and merged into an adjacent polygon. Gaps will be merged unless a circle with a radius larger than the specified value can be inscribed within the gap. The default maximum gap width is zero, meaning that gaps are not closed.

../_images/gdal_vector_clean_coverage_close_gaps.svg

Polygon dataset before cleaning (left), after cleaning with default parameters (center), and after cleaning with --maximum-gap-width 1 (right).
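
A minimal sketch of the gap-closing case described above. The file names are hypothetical, and the width is assumed to be expressed in the units of the layer's coordinate reference system.

```shell
# Hypothetical input and output datasets.
input=parcels.gpkg
output=parcels_clean.gpkg

# Run only when the (hypothetical) input dataset is present.
if [ -f "$input" ]; then
    # Close gaps in which no circle of radius greater than 1 can be inscribed.
    # --overwrite allows an existing output dataset to be replaced.
    gdal vector clean-coverage --maximum-gap-width 1 --overwrite "$input" "$output"
fi
```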

--merge-strategy <MERGE-STRATEGY>

Method by which overlaps or gaps should be added to adjacent polygons. Options include:

- longest-border (default): add areas to the polygon with which the longest border is shared
- max-area: add areas to the largest adjacent polygon
- min-area: add areas to the smallest adjacent polygon
- min-index: add areas to the adjacent polygon that was read first

../_images/gdal_vector_clean_coverage_merge_max_area.svg

Polygon dataset before cleaning (left), after cleaning with the default longest-border merge strategy (center), and after cleaning with --merge-strategy max-area (right).
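
One way to compare the strategies is to run the command once per value and inspect the results side by side. The sketch below assumes a hypothetical input named parcels.gpkg and writes one output per strategy; the output naming scheme is purely illustrative.

```shell
# Hypothetical input dataset; one output dataset is written per strategy.
input=parcels.gpkg

for strategy in longest-border max-area min-area min-index; do
    output="parcels_${strategy}.gpkg"

    # Skip the run when the (hypothetical) input dataset is not present.
    [ -f "$input" ] || continue

    gdal vector clean-coverage --merge-strategy "$strategy" --overwrite \
        "$input" "$output"
done
```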

--output-layer

Specifies the name of the layer to which features will be written. By default, the names of the output layers will be the same as the names of the input layers.

--snapping-distance <SNAPPING-DISTANCE>

Controls the node snapping step, when nearby vertices are snapped together. By default, an automatic snapping distance is determined based on an analysis of the input. Set to zero to turn off all snapping.

../_images/gdal_vector_clean_coverage_snap_distance.svg

Polygon dataset before cleaning (left), after cleaning with the default snapping distance (center), and after cleaning with a more aggressive --snapping-distance 0.2 (right). Note the movement in the upper-left corner of the polygon on the right.
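
To illustrate the two non-default cases, the following sketch runs the command once with snapping turned off and once with an explicit distance of 0.2. The input name and the output naming scheme are hypothetical, and the distance is assumed to be in the units of the layer's coordinate reference system.

```shell
# Hypothetical input dataset.
input=parcels.gpkg

# 0 turns off all snapping; 0.2 is the more aggressive value from the figure.
for distance in 0 0.2; do
    output="parcels_snap_${distance}.gpkg"

    # Skip the run when the (hypothetical) input dataset is not present.
    [ -f "$input" ] || continue

    gdal vector clean-coverage --snapping-distance "$distance" --overwrite \
        "$input" "$output"
done
```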

Standard Options

--active-layer <ACTIVE-LAYER>

Set the active layer. When this option is specified, only the layer with the given name is processed; other layers are not modified. If this option is not specified, all layers are processed.

--append

Whether appending features to existing layer(s) is allowed. This also creates the output dataset if it does not exist yet.

--co, --creation-option <NAME>=<VALUE>

Many formats have one or more optional dataset creation options that can be used to control particulars about the file created. For instance, the GeoPackage driver supports creation options to control the version.

May be repeated.

The dataset creation options available vary by format driver, and some simple formats have no creation options at all. The options supported for a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See Vector drivers format specific documentation for legal creation options for each format.

Note that dataset creation options are different from layer creation options.

--if, --input-format <format>

Name of the format/driver to try when opening the input file(s). It is generally not necessary to specify it, but it can be used to skip automatic driver detection when that detection fails to select the appropriate driver. This option can be repeated to specify several candidate drivers. Note that it does not force those drivers to open the dataset. In particular, some drivers have requirements on file extensions.

May be repeated.

--lco, --layer-creation-option <NAME>=<VALUE>

Many formats have one or more optional layer creation options that can be used to control particulars about the layer created. For instance, the GeoPackage driver supports layer creation options to control the feature identifier or geometry column name, setting the identifier or description, etc.

May be repeated.

The layer creation options available vary by format driver, and some simple formats have no layer creation options at all. The options supported for a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on layer creation options. See Vector drivers format specific documentation for legal creation options for each format.

Note that layer creation options are different from dataset creation options.

--oo, --open-option <NAME>=<VALUE>

Dataset open option (format specific).

May be repeated.

-f, --of, --format, --output-format <OUTPUT-FORMAT>

Which output vector format to use. The allowed values can be listed with gdal --formats | grep vector | grep rw | sort

--output-open-option, --output-oo <NAME>=<VALUE>

Added in version 3.12.

Dataset open option for output dataset (format specific).

May be repeated.

--overwrite

Allow program to overwrite existing target file or dataset. Otherwise, by default, gdal errors out if the target file or dataset already exists.

--overwrite-layer

Whether overwriting the existing output vector layer is allowed.

--skip-errors

Added in version 3.12.

Whether failures to write feature(s) should be ignored. Note that this option sets the size of the transaction unit to one feature at a time, which may cause severe slowdown when inserting into databases.

--update

Whether to open an existing output dataset in update mode.

--upsert

Added in version 3.12.

Variant of --append where the OGRLayer::UpsertFeature() operation is used to insert or update features instead of appending with OGRLayer::CreateFeature().

This is currently implemented only in a few drivers: GPKG -- GeoPackage vector, Elasticsearch: Geographically Encoded Objects for Elasticsearch and MongoDBv3 (drivers that implement upsert expose the GDAL_DCAP_UPSERT capability).

The upsert operation uses the FID of the input feature, when it is set (and the FID column name is not the empty string), as the key to update existing features. It is crucial to make sure that the FIDs in the source and target layers are consistent.

For the GPKG driver, it is also possible to upsert features whose FID is unset or non-significant (the --unset-fid option of gdal vector edit can be used to ignore the FID from the source feature), when there is a UNIQUE column that is not the integer primary key.

Return status code

The program returns status code 0 in case of success, and non-zero in case of error (non-blocking errors emitted as warnings are considered as a successful execution).
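
In a script, the status code can be tested directly. The sketch below is one possible pattern; the file names are hypothetical, and the messages are only illustrative.

```shell
# Hypothetical input and output datasets.
input=parcels.gpkg
output=parcels_clean.gpkg

# --quiet suppresses the progress bar and warnings; the exit status still
# distinguishes success (0) from failure (non-zero).
if gdal vector clean-coverage --quiet --overwrite "$input" "$output"; then
    status=0
    echo "coverage written to $output"
else
    # In the else branch, $? holds the status of the failed command.
    status=$?
    echo "clean-coverage failed with status $status" >&2
fi
```

Because warnings do not change the status code, a status of 0 alone does not indicate that no warnings were emitted.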

Examples

Example 1: Create and then simplify a polygonal coverage

$ gdal vector pipeline read ne_10m_admin_0_countries.shp ! \
                       make-valid ! \
                       clean-coverage ! \
                       simplify-coverage --tolerance 1 ! \
                       write countries.shp