Skip to content

Conversation

@sgillies
Copy link
Member

@sgillies sgillies commented Jan 13, 2019

This PR changes the _CRS (and CRS) class to make WKT the canonical representation of spatial reference.

Previously, all spatial reference systems were converted to a PROJ4 representation with some degree of precision loss. Unfortunate users noticed that Rasterio could create datasets with unexpected and incorrect reference systems.

Now, no such conversion is done except when a user demands it by calling the new to_proj4() or to_dict() methods or when conversion is implicitly required when a user accesses the CRS instance using a PROJ4 key, for example: CRS.from_wkt(wkt)['proj'].

The output of rio-info is now consistent with gdalinfo, which is a good thing.

$ gdalinfo tests/data/test_esri_wkt.tif 
Driver: GTiff/GeoTIFF
Files: tests/data/test_esri_wkt.tif
Size is 2, 2
Coordinate System is:
PROJCS["USA_Contiguous_Albers_Equal_Area_Conic_USGS_version",
    GEOGCS["GCS_North_American_1983",
        DATUM["D_North_American_1983",
            SPHEROID["GRS_1980",6378137.0,298.257222101]],
        PRIMEM["Greenwich",0.0],
        UNIT["Degree",0.017453292519943295]],
    PROJECTION["Albers"],
    PARAMETER["false_easting",0.0],
    PARAMETER["false_northing",0.0],
    PARAMETER["central_meridian",-96.0],
    PARAMETER["standard_parallel_1",29.5],
    PARAMETER["standard_parallel_2",45.5],
    PARAMETER["latitude_of_origin",23.0],
    UNIT["Meter",1.0],
    VERTCS["NAVD_1988",
        VDATUM["North_American_Vertical_Datum_1988"],
        PARAMETER["Vertical_Shift",0.0],
        PARAMETER["Direction",1.0],
        UNIT["Centimeter",0.01]]]
...
$ rio info tests/data/test_esri_wkt.tif --crs
PROJCS["USA_Contiguous_Albers_Equal_Area_Conic_USGS_version",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.017453292519943295]],PROJECTION["Albers"],PARAMETER["false_easting",0.0],PARAMETER["false_northing",0.0],PARAMETER["central_meridian",-96.0],PARAMETER["standard_parallel_1",29.5],PARAMETER["standard_parallel_2",45.5],PARAMETER["latitude_of_origin",23.0],UNIT["Meter",1.0],VERTCS["NAVD_1988",VDATUM["North_American_Vertical_Datum_1988"],PARAMETER["Vertical_Shift",0.0],PARAMETER["Direction",1.0],UNIT["Centimeter",0.01]]]

I was worried that some users may depend on getting EPSG:xxxx strings from rio-info, so I kept that behavior when possible. See for example:

$ rio info tests/data/RGB.byte.tif --crs
EPSG:32618

But in the interpreter, no such conversion is made. The following matches what we get from gdalinfo.

$ rio insp tests/data/RGB.byte.tif
Rasterio 1.0.13 Interactive Inspector (Python 3.6.6)
Type "src.meta", "src.read(1)", or "help(src)" for more information.
>>> src.crs
CRS.from_wkt('PROJCS["UTM Zone 18, Northern Hemisphere",GEOGCS["Unknown datum based upon the WGS 84 ellipsoid",DATUM["Not_specified_based_on_WGS_84_spheroid",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]]]')

The side effects of this should be pretty small for anyone who can tolerate changes in the return values of CRS.__str__ and CRS.__repr__. These methods still return strings, so the API hasn't strictly changed, but they are different strings.

Perhaps more important is that in changing the base class from UserDict to Mapping (the abstract base class), CRS objects become immutable. Mutability was an accident of the original implementation and not something that we intended to be a practice. Will this be a problem for anyone?

I'd love to hear any comments you may have.

def test_read_esri_wkt():
with rasterio.open('tests/data/test_esri_wkt.tif') as src:
assert 'PROJCS["USA_Contiguous_Albers_Equal_Area_Conic_USGS_version",' in src.crs.wkt
assert 'GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",' in src.crs.wkt
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Esri dialect of WKT is preserved when reading from a file.


def test_is_projected():
assert CRS({'init': 'EPSG:3857'}).is_projected is True
assert CRS({'init': 'epsg:3857'}).is_projected is True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalized EPSG isn't strictly correct for case-sensitive filesystems.

_safe_osr_release(osr)
# No dialect morphing, if the dataset was created using software
# "speaking" the Esri dialect, we will read Esri WKT.
return CRS.from_wkt(wkt)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bye bye crufty old code 👋

GDALSetProjection(self._hds, wkt)

CPLFree(wkt)
_safe_osr_release(osr)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adieu 👋

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@sgillies sgillies changed the title Refactor of _CRS class to make WKT canonical Change _CRS class to make WKT canonical Jan 15, 2019
if epsg:
info['crs'] = 'EPSG:{}'.format(epsg)
else:
info['crs'] = src.crs.to_string()
Copy link
Member

@snowman2 snowman2 Jan 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the use of the to_epsg functionality here!



class _CRS(UserDict):
class _CRS(collections.Mapping):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can keep it as a UserDict and just set self.data in the __init__ similar to what you have here and it should behave the same as before without the additional methods you added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but we want the WKT to be the single source of truth. And I don't want a CRS object to have all the features of a UserDict.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👍


try:
exc_wrap_int(OSRImportFromProj4(osr, <const char *>b_proj))
except CPLE_BaseError as exc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that limiting the use cases to only those that can be exported to PROJ.4 could potentially limit what users can do (for example - the engineering WKT in an issue I have seen in this repo). Thoughts about just setting data = {} in this case and not raising an exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but this constructor has to be backwards compatible, and never supported WKT. The from_wkt class method is the thing to use when you have WKT, and from_user_input class method is the most general one that takes anything. I'm going to work on eliminating use of the old constructor in the rasterio code base and examples.

_safe_osr_release(osr1)
_safe_osr_release(osr2)

def to_wkt(self, morph_to_esri_dialect=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this new method! Will be useful in the case where there may be different dialects for the WKT (WKT1, WKT2, WKT_ESRI, etc..)

if int(code) <= 0:
raise CRSError("EPSG codes are positive integers")
return cls(init="epsg:%s" % code, no_defs=True)
return cls.from_dict(init="epsg:{}".format(code))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based off of changes in the PROJ.4 library, I would actually recommend not using the PROJ.4 syntax and instead use the WKT syntax here and use the from_wkt function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snowman2 but what is the WKT syntax for "pull record xxxx from the epsg table?"? I think from_dict will be fine for a few versions of GDAL, there's no plan to remove OSRImportFromPROJ4 from the API. That said, I could simplify things by using from_proj4 here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the format that is not necessarily WKT, but works with the FromUserInput (i.e. epsg:4326). The reasoning behind my suggestion is that they behave slightly different in PROJ.4. And if PROJ.4 will be the engine behind GDAL, it may be slightly different there as well.

The difference is in the axis information:

>>> from pyproj import CRS
>>> CRS(init='epsg:4326')
<CRS: +init=epsg:4326>
Name: WGS 84
Ellipsoid:
- semi_major_metre: 6378137.00
- semi_minor_metre: 6356752.31
- inverse_flattening: 298.26
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Prime Meridian:
- longitude: 0.0000
- unit_name: degree
- unit_conversion_factor: 0.01745329
Axis Info:
- Longitude[lon] (east) EPSG:9101 (radian)
- Latitude[lat] (north) EPSG:9101 (radian)

>>> CRS('epsg:4326')
<CRS: epsg:4326>
Name: WGS 84
Ellipsoid:
- semi_major_metre: 6378137.00
- semi_minor_metre: 6356752.31
- inverse_flattening: 298.26
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Prime Meridian:
- longitude: 0.0000
- unit_name: degree
- unit_conversion_factor: 0.01745329
Axis Info:
- Geodetic latitude[Lat] (north) EPSG:9122 (degree)
- Geodetic longitude[Lon] (east) EPSG:9122 (degree)

The one without init in it behaves how the current version of GDAL behaves. However, I am not sure if that will continue to be the case in future versions.

@snowman2
Copy link
Member

Fantastic changes! I am excited about this. 👏

One thought I had as I was looking through the code is that there are a lot of conversion from string to OSRSpatialReference objects in the class and back. I was wondering if it would make sense to add a property on the _CRS class in _crs,pyx of the type OSRSpatialReference that would be created on __init__ that could be used export to the format the user wanted instead of re-creating it to export it out in the other methods?

@cratcliff
Copy link

Wonderful changes! I'm looking forward to using the new version. It will solve my problem having to continually keep track of the ESRI wkt or lookup the associated EPSG to ensure files have the correct coordinate system.

@sgillies
Copy link
Member Author

@snowman2 I'll look into that. Early creation of an OGRSpatialReferenceH would have all kinds of benefits, including early validation.

@sgillies
Copy link
Member Author

sgillies commented Jan 18, 2019

@snowman2 in #1602 I've got an implementation that uses OGRSpatialReferenceH instead of WKT. There were some architectural consequences, but I like those, too :) In a nutshell: _CRS has more limited scope and CRS does more of the lifting. Would be grateful for your comments.

* Rewrite of _CRS to keep an OGRSpatialReferenceH

Rewrite of CRS to use _CRS by composition, not inheritance

New OGRErr handling function exc_wrap_ogrerr

* Remove commented code

* Add back in the error stack check for ImportFromProj

Also fix up docstrings
@sgillies
Copy link
Member Author

This is done. Am scheduling a 1.0.14 release next.

@sgillies sgillies deleted the crs-canonical-wkt branch February 10, 2021 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants