Numba grid interpolator #885
|
On large-scale apply_oe tests, for the presolve this seems to be about 2x faster and give solutions in the right range. For the main solve, things become incredibly slow. Could be a dimension thing, could be an implementation bug, or could be that I'm not holding settings correctly in all situations. Promising though! |
|
With this updated version, the serial test (shown above) gives approximately the same result. A batched Ray-based implementation of the test shows a 2x speedup (the delta is indicative of Ray overhead), but this version no longer massively fails when coupled with Ray. Testing on full scenes to get a handle on impact. |
|
Confirmed results in AV5 test case: Presolve: Main Solution: Analytical Line: Total Time: |
|
Confirmation was a bit strange. Using the same AV5 run call, I didn't see an appreciable difference for the presolve, but did see a consistent speedup with the main solve. With - Presolve - Main solve - |
|
@evan-greenbrg, which version of Python are you on? I'm using 3.12. I ran a repeat test on the above and reproduced the timings. I also saw that when running 5 different instances, the timings were repeatable to within 0.02 spectra/s/core between runs in the main solve and 0.05 spectra/s/core between runs in the presolve. So version differences (including package pinning, Python, etc.) are the best explanation I can think of. For what it's worth, I'm also on numba 0.64.0. |
|
@pgbrodrick I tested on: Python: 3.10.16 I'll try with a Python 3.11 or 3.12 install and re-run. Given that you are successfully reproducing and I'm not having any problems or slowdown on this version, I'd be in favor of moving forward with it. Keeping the baseline |
|
I could definitely see this being a Python 3.11+ difference. I do think I'd be keen to merge this in if others agree. |
|
I just finished running again with Python 3.12. I can reproduce your speedup: Presolve: 4.4151 spectra/s/core |
The speedtest defined below gives:
Orig: 0.277
New: 0.0456
A speedup of roughly 6x. TBD what that means for full isofit runs. Results will vary depending on grid size. Also TBD whether this should be the default.
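As a rough illustration of how a comparison like the Orig/New numbers can be timed and turned into a ratio, here is a minimal sketch. It is not the speedtest from this PR: `bilinear`, `best_time`, and `speedup` are hypothetical names, the pure-Python bilinear lookup is only a stand-in workload for a grid interpolator, and the two reported timings are assumed to be in the same units.

```python
import time
from bisect import bisect_right


def bilinear(grid, xs, ys, x, y):
    """Bilinear lookup on a regular 2-D grid (hypothetical stand-in workload)."""
    # Index of the cell containing (x, y); clamped so that out-of-range
    # points fall back to the nearest edge cell (and are extrapolated).
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    j = min(max(bisect_right(ys, y) - 1, 0), len(ys) - 2)
    # Fractional position inside the cell along each axis.
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    return (grid[i][j] * (1 - tx) * (1 - ty)
            + grid[i + 1][j] * tx * (1 - ty)
            + grid[i][j + 1] * (1 - tx) * ty
            + grid[i + 1][j + 1] * tx * ty)


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time over several calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(orig_time, new_time):
    """Ratio of two timings in the same units; > 1 means the new code is faster."""
    return orig_time / new_time


xs = ys = [0.0, 1.0, 2.0]
grid = [[2 * x + y for y in ys] for x in xs]  # samples of f(x, y) = 2x + y
elapsed = best_time(lambda: [bilinear(grid, xs, ys, 0.5, 1.5) for _ in range(10_000)])
print(f"best of 5: {elapsed:.4f} s; ratio of reported timings: {speedup(0.277, 0.0456):.2f}")
```

When timing a Numba-compiled function this way, it is usually worth calling it once before the timed loop so that JIT compilation on the first call is not counted in the measurement.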
Numba is not version-bound in pyproject.toml; we may need to set a minimum version.
Speedtest (mostly borrowed from test):