Add QgsVectorLayerUtils.fieldToDataArray(), QgsVectorLayer.field_to_numpy() #63532
nyalldawson merged 4 commits into qgis:master
QgsVectorLayerUtils.fieldToDataArray() converts field values from an iterator to a binary array of data. The conversion is heavily optimised to be as fast as possible. Only numeric data types are supported; other types raise a Python TypeError exception.
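To make the idea concrete, here is a minimal pure-Python sketch of packing numeric values from an iterator into a contiguous binary buffer. It is not the implementation from this PR and does not call any QGIS API: the function name `values_to_data_array`, its parameters, and the handling of nulls through a separate mask are all assumptions made for illustration.

```python
from array import array
from numbers import Real


def values_to_data_array(values, typecode="d", null_value=0.0):
    """Pack numeric values into raw bytes (hypothetical helper, not QGIS API).

    Returns (data, mask). `data` holds the packed values; `mask` holds one
    byte per value, 1 where the input was None and 0 otherwise. Tracking
    nulls in a mask is an assumption of this sketch.
    """
    data = array(typecode)
    mask = bytearray()
    for value in values:
        if value is None:
            # Nulls get a placeholder in the data and are flagged in the mask.
            data.append(null_value)
            mask.append(1)
        elif isinstance(value, Real) and not isinstance(value, bool):
            data.append(value)
            mask.append(0)
        else:
            # Mirrors the documented behaviour: non-numeric input is rejected.
            raise TypeError(f"unsupported value type: {type(value).__name__}")
    return data.tobytes(), bytes(mask)


raw, mask = values_to_data_array(iter([1.5, None, 3]))
print(len(raw), mask)
```

Every element occupies the same number of bytes, which is what allows the buffer to be handed to an array library without converting values one by one.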
@merydian what do you think?
QgsVectorLayer.field_to_numpy() returns the values from a field as a numpy masked array. It is heavily optimised for performance. Only numeric fields are supported; other types raise a TypeError.
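For readers unfamiliar with masked arrays, the sketch below shows what such a result looks like, using plain numpy and a Python list as a stand-in for a layer's field values. The helper `values_to_masked_array` is hypothetical and is not part of QGIS; how the real method represents nulls internally is not stated in this PR, so treating `None` as a masked entry is an assumption.

```python
import numpy as np


def values_to_masked_array(values, dtype=np.float64):
    """Build a numpy masked array from numeric values (hypothetical helper).

    None entries are masked out; any non-numeric entry raises TypeError.
    """
    values = list(values)
    mask = np.array([v is None for v in values], dtype=bool)
    data = np.empty(len(values), dtype=dtype)
    for i, v in enumerate(values):
        if v is None:
            data[i] = 0  # placeholder, hidden by the mask
        elif isinstance(v, (int, float)) and not isinstance(v, bool):
            data[i] = v
        else:
            raise TypeError(f"unsupported value type: {type(v).__name__}")
    return np.ma.masked_array(data, mask=mask)


arr = values_to_masked_array([10.0, None, 30.0])
print(arr.count())  # number of non-masked entries
print(arr.mean())   # statistics skip the masked entry
```

The benefit of a masked array over a sentinel value is that aggregate functions such as `mean()` ignore null entries without any extra filtering.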
Great addition, everything looks good to me. I suppose this should be used in QgsVectorLayer.as_geopandas for better performance? Do you think implementing this for strings etc. would be worth considering?
Eventually, yes -- but there's a few more steps first (see below)
It's not possible to use (variable length) strings in numpy arrays, as all the objects in the array must have equal size. (For this reason I think it'd be possible to support boolean/date/time fields in addition to numeric, but I haven't looked into that). What I'm thinking for next steps are:
Sounds good! Do you have any thoughts on how to split up this work, so that we don't do the same things simultaneously? Also just noticed that the methods in this PR are called to_numpy, whereas the other implementations are called as_numpy...
How about I add the raw C++ API for fieldsToDataArray, and then you try building the Python stuff on top of that?
Thanks, fixed
On second thought, I'd like to try my hand at the whole implementation if that's fine with you? This PR would serve as a nice template I suppose.
Go for it! Just reach out if you get stuck!