Skip to content

Conversation

@angus-g
Copy link
Contributor

@angus-g angus-g commented Oct 20, 2019

This means initialisation of particles is very cheap from Python, and shouldn't be any worse from the C kernel. This gives a massive speedup. See #665 for other discussion.

As part of this change, the semantics of accessing properties on a ParticleSet has changed: pset[idx].property becomes pset.property[idx]. It would be possible to rearrange __getitem__ to return some kind of accessor, like the ParticleAccessor to make this backward-compatible, but I'm not sure of the efficiency implications of such a change. Speaking of which, I created ParticleAccessor before implementing __getattr__ on ParticleSet. It might be worth ripping this back out since it's somewhat unneeded.

A few of the tests are (randomly) failing in JIT mode with segfaults. It looks like maybe something isn't being initialised correctly but it's being a little tricky to debug. I thought I'd just put the bulk of the work up here first.

There are a couple of other remaining tasks:

  • get ParticleFile working again. I hacked this in another branch at a6684db, but that was before the writing to temporary dictionaries, so it'd probably be worth just responding to the few errors that pop up, rather than trying to merge this other change across. I'll do this after dealing with the JIT segfaults.
  • test all tutorials/examples (I imagine things like particle plotting probably won't work)

angus-g and others added 11 commits October 21, 2019 09:19
This means initialisation of particles is very cheap from Python, and
shouldn't be any worse from the C kernel. This gives a massive
speedup, and now the majority of time is spent in the kernel, where
you'd expect.
Note, not all unit tests pass yet. Am still working on that
# Conflicts:
#	parcels/examples/example_globcurrent.py
#	tests/test_kernel_language.py
including flake8 fixes, and making ParticleSet object indexable and
iterable again
@erikvansebille
Copy link
Member

Thanks so much for all this development, @angus-g! I'd really like to try give it a spin in a real-life setup to investigate how much faster Parcels becomes!

As you've seen, I did quite some development over the last few weeks. I've today finally fixed the ParticleFile, so that works too

As I can see, there are two open issues:

The second open issue obviously is much more problematic than the first one. But we're almost there!

@angus-g
Copy link
Contributor Author

angus-g commented Jan 29, 2020

Thanks for rolling with this! In response to your second issue, you're correct that individual particles aren't initialised any more: the class is used as a structure to hold information about the required variables for which to create arrays. A lot of the work has been shifted to the particle set instead. I suppose this introduces a level of indirection that may make it more difficult to reason about what's going on.

This seems like the sort of thing that needs to be solved with more metaprogramming, so that we can treat particles as individual (for initialisation and printing), whilst having them stored in the more efficient format.

Copy link
Contributor

@CKehl CKehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed some points in the files that I found non-intuitive to understand, and where an explanation would be welcome. Outside of the explicit comments, the changes made look good and appropriate.

@angus-g
Copy link
Contributor Author

angus-g commented Feb 12, 2020

Good call on the revert; it looks like anything written to stderr is considered an error in 2019? Weird!

erikvansebille and others added 2 commits February 12, 2020 11:48
Undoing logger.info comment now that Windows CI seems fixed (related to e.g. microsoft/azure-pipelines-tasks#12173)
@angus-g
Copy link
Contributor Author

angus-g commented Feb 20, 2020

Just pushed through a little change in the recovery kernel logic: we can handle all the particles with the Repeat state at once, which again removes a Python loop over all particles (e.g. in the case of the first execution step with deferred load, where all particles will be Repeat). The recovery kernels are still executed in a Python loop, which could be slow if there are many particles with errors.

I'm able to initialise and sample about 7 million particles on 0.1 deg global surface velocities, though the actual advection is letting me down for speed now.

@CKehl
Copy link
Contributor

CKehl commented Feb 21, 2020

@angus-g : I like the work you are recently doing on optimizing Parcels, and I would appreciate some talk via Slack, Mail or Skype to coordinate our optimization efforts a bit. I am preparing some branches at the moment for clean benchmarking of the source code and, afterwards, iterative hypothesis testing on some (code) optimization procedures and algorithmic changes. That benchmarking is also meant as an optimization guideline. Please contact me if you're in for a conversation, also on your goals and development- or research intents. Cheers, @CKehl

@angus-g
Copy link
Contributor Author

angus-g commented Feb 23, 2020

@CKehl no problem, I’m happy to discuss via Slack, preferably

@CKehl
Copy link
Contributor

CKehl commented Feb 24, 2020

How can I find you on Slack ? Alternatively, you can find me by searching for my full name.

Copy link
Contributor

@CKehl CKehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the state variable access comments, the PR looks fine and ready-to-merge to me now.

@erikvansebille erikvansebille merged commit be5dde9 into Parcels-code:master Mar 24, 2020
@angus-g
Copy link
Contributor Author

angus-g commented Mar 24, 2020

Thanks @erikvansebille and @CKehl!

@angus-g angus-g deleted the soa-storage branch March 24, 2020 21:57
erikvansebille added a commit that referenced this pull request Mar 25, 2020
erikvansebille added a commit that referenced this pull request Mar 25, 2020
@angus-g angus-g linked an issue Apr 8, 2020 that may be closed by this pull request
erikvansebille added a commit that referenced this pull request Apr 23, 2020
After #678, `print(ParticleSet)` did not include the custom Variables anymore. This PR fixes that
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Particle creation efficiency

3 participants