feature request: --as_geodataframe magic argument to support geopandas GeoDataFrame results #90
Description
Is your feature request related to a problem? Please describe.
When using the %%bigquery cell magic, there is no way to return the result as a GeoPandas GeoDataFrame when the expected results include GEOGRAPHY columns. This leads to repeated post-processing when performing geospatial analysis.
Describe the solution you'd like
An ideal solution would be to add an argument (e.g. --as_geodataframe) so that the magic itself could implicitly perform the conversion.
# this example would use the default behavior of GeoDataFrame and convert the column named 'geometry'
%%bigquery df --as_geodataframe
SELECT
col1,
col2,
geometry,
...
...
FROM ...
This could be done by leveraging the existing QueryJob.to_geodataframe method.
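As a rough sketch of that delegation: the helper below is hypothetical (the name `query_to_geodataframe` is not part of the library) and assumes `client` behaves like `google.cloud.bigquery.Client`. It only shows how the magic could hand the result off to `QueryJob.to_geodataframe` and its `geography_column` parameter.

```python
def query_to_geodataframe(client, sql, geography_column=None):
    """Run `sql` and return the result as a geopandas GeoDataFrame.

    Hypothetical helper for illustration only. `client` is assumed to
    behave like google.cloud.bigquery.Client.
    """
    # client.query() starts the query and returns a QueryJob.
    job = client.query(sql)
    # geography_column selects which GEOGRAPHY column becomes the geometry.
    # It can stay None when the result has only one GEOGRAPHY column.
    return job.to_geodataframe(geography_column=geography_column)


# Usage (needs google-cloud-bigquery with geopandas support and credentials):
#   from google.cloud import bigquery
#   gdf = query_to_geodataframe(bigquery.Client(), "SELECT ...", "geography_col")
```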
The same argument could be used to select a specific geometry column:
%%bigquery df --as_geodataframe geography_col
SELECT
col1,
col2,
geography_col,
...
...
FROM ...
Optional: The argument could also be a dictionary if more options need to be passed into QueryJob.to_geodataframe
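To make the two invocation forms above concrete, here is a minimal sketch of how such an argument could be parsed. It uses plain `argparse` rather than the magic's real argument definitions, and `build_parser` and `geodataframe_kwargs` are hypothetical names. The dictionary form is left out.

```python
import argparse


def build_parser():
    """Stand-in parser for the magic's argument line (illustrative only)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("destination_var", nargs="?")
    # nargs="?" lets the flag appear bare or with a column name:
    #   flag absent        -> False
    #   --as_geodataframe  -> True (const)
    #   --as_geodataframe geography_col -> "geography_col"
    parser.add_argument(
        "--as_geodataframe",
        nargs="?",
        const=True,
        default=False,
        metavar="GEOGRAPHY_COLUMN",
    )
    return parser


def geodataframe_kwargs(args):
    """Return keyword arguments for to_geodataframe, or None if not requested."""
    if args.as_geodataframe is False:
        return None  # keep the regular pandas DataFrame path
    if args.as_geodataframe is True:
        return {}  # let to_geodataframe choose the geography column
    return {"geography_column": args.as_geodataframe}


args = build_parser().parse_args(["df", "--as_geodataframe", "geography_col"])
print(geodataframe_kwargs(args))
```

With `nargs="?"`, the destination variable has to come before the flag, as in the examples above. Otherwise the variable name would be consumed as the column name.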
Describe alternatives you've considered
The current workaround is a function that post-processes the resulting pandas DataFrame using both shapely and geopandas, as follows:
import shapely
import geopandas as gpd

def pandas_to_geopandas(df, col='geometry'):
    # col is the name of the geometry column, e.g. 'geometry_column'
    # use the shapely library to parse WKT strings into shapely geometry objects
    df[col] = df[col].dropna().apply(shapely.from_wkt)
    # convert to a geopandas GeoDataFrame
    return gpd.GeoDataFrame(df, geometry=col, crs='EPSG:4326')
However, a magic argument as proposed above would remove the need to call this function. This could reduce the number of cells in notebooks that focus on geospatial analysis.
Additional context
References: