Skip to content

Spatial join returns incorrect results with Shapely 1.8, due to shapely_compatible optimistically set to True #424

@nicolaslegrand91

Description

@nicolaslegrand91

Code Sample, a copy-pastable example

import geopandas as gpd
from shapely import wkt

# Define 2 geometries
point = wkt.loads('POINT (-3.61104 54.12069)')
polygon = wkt.loads('''
    Polygon ((-3.63156666666667 54.1267166666667,-3.53435 54.0763833333333,-3.5634 54.0383333333333,
    -3.58396666666667 54.0467,-3.60175 54.0521333333333,-3.62465 54.05735,-3.64296666666667
    54.0595333333333,-3.65568333333333 54.0740166666667,-3.6695 54.0932833333333,-3.65568333333333
    54.1054666666667,-3.63156666666667 54.1267166666667))
''')
print('\npoint within polygon?')
print(point.within(polygon))

# Geodataframe with 1 polygon
df = gpd.GeoDataFrame.from_dict({
    'id': [1],
    'geometry': [polygon],
})
# Geodataframe with 1 Point
poi = gpd.GeoDataFrame.from_dict({
    'id': [2],
    'geometry': [point],
})

print('\nSpatial join:')
print(gpd.sjoin(poi, df, predicate="within"))

Problem description

The spatial join behaviour (with GeoPandas PyGeos) has changed between Shapely 1.7.1 (correct behaviour) and Shapely 1.8 (incorrect behaviour).
Note that the point does not fall into the polygon. These are WGS84 coordinates, and the distance from the point to the polygon is ~400 meters.
image

With Shapely 1.7.1 (correct behaviour)
image

With Shapely 1.8 (incorrect behaviour)
Note that the point is still considered out of the polygon, but the spatial joins incorrectly considers the point to be in the polygon.
image

Expected Output

The expected output is the one with Shapely 1.7.1.
I think this comes from the fact that PyGeos & Shapely 1.8 are built on the same GEOS version, hence no conversions apply (the warnings does not appear). Maybe the formats are not exactly compatible.

Indeed, the problem disappears if I force Pygeos to perform the WKB conversion between Shapely & Pygeos geometries.
More precisely, if I set the boolean shapely_compatible to False directly in Pygeos source code, I get the expected behaviour.

shapely_compatible = True

More details on the relative issue on GeoPandas: geopandas/geopandas#2198

Note: the problem does not happen with pygeos 0.11.1, that is built on newer version of GEOS, but may happen again in the future if Shapely GEOS version is upgraded.

Output of geopandas.show_versions()

This is the output with shapely 1.8.0. Simply run pip install shapely==1.7.1 to test the working configuration.

Details

SYSTEM INFO

python : 3.9.5 (default, May 4 2021, 03:33:11) [Clang 12.0.0 (clang-1200.0.32.29)]
executable : /Users/nicolaslegrand/.virtualenvs/py39/bin/python
machine : macOS-10.16-x86_64-i386-64bit

GEOS, GDAL, PROJ INFO

GEOS : 3.9.1
GEOS lib : /usr/local/Cellar/geos/3.9.1/lib/libgeos_c.dylib
GDAL : 3.2.2
GDAL data dir: None
PROJ : 7.2.1
PROJ data dir: /Users/nicolaslegrand/.virtualenvs/py39/lib/python3.9/site-packages/pyproj/proj_dir/share/proj

PYTHON DEPENDENCIES

geopandas : 0.10.2
pandas : 1.2.5
fiona : 1.8.19
numpy : 1.21.3
shapely : 1.8.0
rtree : None
pyproj : 3.0.1
matplotlib : 3.4.3
mapclassify: None
geopy : 2.2.0
psycopg2 : 2.9.1 (dt dec pq3 ext lo64)
geoalchemy2: None
pyarrow : None
pygeos : 0.10.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentation

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions