Convert ParticleSet to structure of arrays #678
Conversation
This means initialisation of particles is very cheap from Python, and shouldn't be any worse from the C kernel. This gives a massive speedup, and now the majority of time is spent in the kernel, where you'd expect.
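For intuition, here is a minimal sketch of why a structure-of-arrays layout makes initialisation from Python cheap; the variable names are illustrative, not the actual Parcels internals.

```python
import numpy as np

# Hypothetical sketch, not the actual Parcels implementation: with a
# structure-of-arrays layout, creating N particles is a handful of
# vectorised array allocations rather than N Python object constructions.
n = 1_000_000
lon = np.random.uniform(-180.0, 180.0, n)   # all longitudes in one array
lat = np.random.uniform(-80.0, 80.0, n)     # all latitudes in one array
time = np.zeros(n)                           # common release time

# A C kernel can then loop over these contiguous arrays directly, which is
# why most of the runtime ends up inside the kernel itself.
```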
# Conflicts: # parcels/particlefile.py
Note, not all unit tests pass yet. Am still working on that
# Conflicts: # parcels/examples/example_globcurrent.py # tests/test_kernel_language.py
including flake8 fixes, and making ParticleSet object indexable and iterable again
…with small p for particle iterator
Thanks so much for all this development, @angus-g! I'd really like to try to give it a spin in a real-life setup to investigate how much faster Parcels becomes! As you've seen, I did quite some development over the last few weeks. Today I finally fixed the ParticleFile, so that works too. As I can see, there are two open issues:
The second open issue is obviously much more problematic than the first one. But we're almost there!
For backward compatibility checks
# Conflicts: # parcels/particlefile.py
To try to fix breaking CI on Windows GitHub Actions
# Conflicts: # parcels/particleset.py
Updating to the 2019 version of Windows Server for GitHub Actions
Thanks for rolling with this! In response to your second issue, you're correct that individual particles aren't initialised any more: the class is used as a structure to hold information about the required variables for which to create arrays. A lot of the work has been shifted to the particle set instead. I suppose this introduces a level of indirection that may make it more difficult to reason about what's going on. This seems like the sort of thing that needs to be solved with more metaprogramming, so that we can treat particles as individuals (for initialisation and printing) whilst having them stored in the more efficient format.
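To make that indirection concrete, here is a minimal sketch of the idea under discussion; the class and attribute names below are hypothetical, not the actual Parcels code.

```python
import numpy as np

class ParticleSchema:
    # The "particle" class only declares which variables exist and their
    # dtypes; it is never instantiated once per particle.
    variables = {"lon": np.float64, "lat": np.float64, "age": np.float32}

class SoAParticleSet:
    def __init__(self, schema, n):
        # One contiguous array per declared variable, owned by the set.
        self.data = {name: np.zeros(n, dtype=dtype)
                     for name, dtype in schema.variables.items()}

    def __getattr__(self, name):
        # Attribute access is redirected to the underlying arrays; this is
        # the extra level of indirection mentioned above.
        try:
            return self.data[name]
        except KeyError:
            raise AttributeError(name)
```

With something along these lines, `SoAParticleSet(ParticleSchema, 1000).lon` is a plain NumPy array, while a per-particle view (for initialisation and printing) would be the part left to metaprogramming or an accessor object.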
# Conflicts: # tests/test_fieldset.py
CKehl left a comment:
I have commented on some points in the files that I found non-intuitive to understand and where an explanation would be welcome. Outside of the explicit comments, the changes made look good and appropriate.
Reverting to windows-2016
Good call on the revert; it looks like anything written to stderr is considered an error in 2019? Weird!
Undoing logger.info comment now that Windows CI seems fixed (related to e.g. microsoft/azure-pipelines-tasks#12173)
Just pushed through a little change in the recovery kernel logic: we can handle all the particles with the … I'm able to initialise and sample about 7 million particles on 0.1 deg global surface velocities, though the actual advection is letting me down for speed now.
@angus-g: I like the work you have been doing recently on optimizing Parcels, and I would appreciate a talk via Slack, mail or Skype to coordinate our optimization efforts a bit. I am preparing some branches at the moment for clean benchmarking of the source code and, afterwards, for iterative hypothesis testing on some (code) optimization procedures and algorithmic changes. That benchmarking is also meant as an optimization guideline. Please contact me if you're up for a conversation, also about your goals and development or research intents. Cheers, @CKehl
@CKehl no problem, I'm happy to discuss, preferably via Slack.
How can I find you on Slack? Alternatively, you can find me by searching for my full name.
CKehl left a comment:
Apart from the state-variable access comments, the PR looks fine and ready to merge to me now.
Thanks @erikvansebille and @CKehl!
This fixes a bug introduced in #678
After #678, `print(ParticleSet)` did not include the custom Variables anymore. This PR fixes that
This means initialisation of particles is very cheap from Python, and shouldn't be any worse from the C kernel. This gives a massive speedup. See #665 for other discussion.
As part of this change, the semantics of accessing properties on a `ParticleSet` have changed: `pset[idx].property` becomes `pset.property[idx]`. It would be possible to rearrange `__getitem__` to return some kind of accessor, like the `ParticleAccessor`, to make this backward-compatible, but I'm not sure of the efficiency implications of such a change. Speaking of which, I created `ParticleAccessor` before implementing `__getattr__` on `ParticleSet`. It might be worth ripping this back out since it's somewhat unneeded.

A few of the tests are (randomly) failing in JIT mode with segfaults. It looks like maybe something isn't being initialised correctly, but it's being a little tricky to debug. I thought I'd just put the bulk of the work up here first.
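For illustration, an accessor along these lines could keep `pset[idx].property` working while the data stays in per-variable arrays. This is only a sketch of the idea, not the `ParticleAccessor` implemented in this PR, and the names are assumptions.

```python
class ParticleAccessor:
    """Lightweight view of one particle's slot in the arrays (sketch only)."""

    def __init__(self, pset, index):
        # Store the set and the row index rather than copying any data.
        object.__setattr__(self, "_pset", pset)
        object.__setattr__(self, "_index", index)

    def __getattr__(self, name):
        # pset[idx].lon reads pset.lon[idx] from the underlying array.
        return getattr(self._pset, name)[self._index]

    def __setattr__(self, name, value):
        # pset[idx].lon = x writes back into the array in place.
        getattr(self._pset, name)[self._index] = value
```

`ParticleSet.__getitem__` could then return `ParticleAccessor(self, idx)`; whether the extra object created per access is cheap enough is exactly the efficiency question raised above.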
There are a couple of other remaining tasks:
- Getting `ParticleFile` working again: I hacked this in another branch at a6684db, but that was before the writing to temporary dictionaries, so it'd probably be worth just responding to the few errors that pop up, rather than trying to merge this other change across. I'll do this after dealing with the JIT segfaults.
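As a rough illustration of the "temporary dictionaries" approach mentioned above (hypothetical names, not the actual `ParticleFile` code), output could be staged per variable and handed to the writer as a snapshot:

```python
import numpy as np

def stage_output(pset_data, to_write=("lon", "lat", "time")):
    # Copy the requested variables out of the particle-set arrays into a
    # temporary dictionary, so the file writer sees a stable snapshot even
    # if the kernel keeps updating the arrays afterwards.
    return {name: np.copy(pset_data[name]) for name in to_write}

# buffer = stage_output(pset.data)
# ... later, pass `buffer` to whatever writes the NetCDF output.
```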