Skip to content

Speed up parsing simple unit strings with Unit#17004

Merged
mhvk merged 1 commit intoastropy:mainfrom
eerovaher:faster-simple-unit-parsing
Sep 30, 2024
Merged

Speed up parsing simple unit strings with Unit#17004
mhvk merged 1 commit intoastropy:mainfrom
eerovaher:faster-simple-unit-parsing

Conversation

@eerovaher
Copy link
Copy Markdown
Member

@eerovaher eerovaher commented Sep 12, 2024

Description

The updates do not slow down parsing strings representing composite units. The tradeoff is that they do slow down parsing simple strings with the unit parser parse() methods, but those are not meant to be called by users. They are public mostly to define our API for downstream developers.

Setup: $ ipython -i -c 'from astropy import units as u; from astropy.units.format import Generic, CDS, FITS, OGIP, VOUnit'

On main:

Python 3.12.3 (main, Jul 31 2024, 17:43:48) [GCC 13.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: for formatter in (None, Generic, CDS, FITS, OGIP, VOUnit):
   ...:     print(formatter)
   ...:     %timeit u.Unit("m", format=formatter)
   ...:     print()
   ...: 
None
1.24 μs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.generic.Generic'>
1.22 μs ± 12.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.cds.CDS'>
1.26 μs ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.fits.FITS'>
1.6 μs ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.ogip.OGIP'>
1.34 μs ± 6.97 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.vounit.VOUnit'>
1.6 μs ± 6.92 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In this pull request:

...
None
959 ns ± 8.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.generic.Generic'>
976 ns ± 1.83 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.cds.CDS'>
1.12 μs ± 4.52 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.fits.FITS'>
1.21 μs ± 1.78 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.ogip.OGIP'>
1.21 μs ± 0.884 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

<class 'astropy.units.format.vounit.VOUnit'>
1.33 μs ± 5.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

@eerovaher eerovaher added units Performance benchmark Run benchmarks for a PR labels Sep 12, 2024
@eerovaher eerovaher added this to the v7.0.0 milestone Sep 12, 2024
@eerovaher eerovaher requested review from mhvk and nstarman September 12, 2024 17:50
@github-actions
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

Copy link
Copy Markdown
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an excellent idea! But also a bit complex. How about extracting what is done to the top and doing it only for generic? I.e., something like,

            # seems more logical to switch order
            if isinstance(s, bytes):
                s = s.decode("ascii")

            # might as well speed this up too.
            f = unit_format.Generic if format is None else unit_format.get_format(format)

            # For Generic, try to short-circuit simple units.
            if (f is unit_format.Generic
                    and (unit := get_current_unit_registry().registry.get(s))):
                return unit

Of course, this could still keep the simplification you had in _do_parse, since really the short-circuit there is no longer needed.

@pllim
Copy link
Copy Markdown
Member

pllim commented Sep 12, 2024

ENCHMARKS NOT SIGNIFICANTLY CHANGED. 🤷‍♀️

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 12, 2024

BENCHMARKS NOT SIGNIFICANTLY CHANGED. 🤷‍♀️

The benefit is not as obvious here - it is a slight speed-up, but mostly is there to ensure that #16892 will not slow things down.

@eerovaher
Copy link
Copy Markdown
Member Author

I don't know what the definition of "significant" is in our benchmarks, but if you dig through the logs you will see that the difference between the relevant timings is much larger than their uncertainty, i.e. the change is significant:

https://github.com/astropy/astropy/actions/runs/10836066679/job/30068962593?pr=17004#step:7:636

https://github.com/astropy/astropy/actions/runs/10836066679/job/30068962593?pr=17004#step:7:1143

@eerovaher eerovaher force-pushed the faster-simple-unit-parsing branch from 92da66a to 6ca0c90 Compare September 13, 2024 13:35
@eerovaher
Copy link
Copy Markdown
Member Author

eerovaher commented Sep 13, 2024

Replying to the suggestions by @mhvk

            # seems more logical to switch order
            if isinstance(s, bytes):
                s = s.decode("ascii")

There was no significant speedup from placing this check after the shortcut.

            # might as well speed this up too.
            f = unit_format.Generic if format is None else unit_format.get_format(format)

Changing this part had no noticeable effect on performance, but getting rid of the redundant if format is None check does simplify the code.

            # For Generic, try to short-circuit simple units.
            if (f is unit_format.Generic
                    and (unit := get_current_unit_registry().registry.get(s))):
                return unit

My updates speed up parsing with all the astropy unit formatters. Trying to limit this shortcut to Generic will slow down the others because they would no longer benefit from the shortcut that they currently inherit from Generic._do_parse(). We could avoid that slowdown by keeping the shortcut also in _do_parse(), but that makes the code more complex and slows down parsing composite units, because they would try and fail to take the shortcut twice.

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 13, 2024

The main benefit I saw of my suggestion is that it does not rely on private methods at all. But it should also be (marginally) faster for the by-far most common case of format=None, both for simple units and for complex ones (latter because of not having to deal with a failed try/except). I don't think there is much need to worry about parsing in other formats - that is almost exclusively done during I/O where the reading of associated data sets dominates.

@pllim
Copy link
Copy Markdown
Member

pllim commented Sep 13, 2024

Re: #17004 (comment) -- Maybe @MridulS knows since he implemented the infrastructure.

@eerovaher
Copy link
Copy Markdown
Member Author

The following patch causes multiple test failures:

--- a/astropy/units/core.py
+++ b/astropy/units/core.py
@@ -2091,12 +2091,10 @@ class _UnitMetaClass(type):
             if isinstance(s, bytes):
                 s = s.decode("ascii")
 
-            try:
-                return f._parse_unit(s, detailed_exception=False)  # Try a shortcut
-            except (AttributeError, ValueError):
-                # No `f._parse_unit()` (AttributeError)
-                # or `s` was a composite unit or had weird Unicode in it (ValueError).
-                pass
+            if f is unit_format.Generic and (
+                unit := get_current_unit_registry().registry.get(s)
+            ):
+                return unit
 
             try:
                 return f.parse(s)

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 13, 2024

That's weird, I tested it yesterday. Will check.

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 13, 2024

The following patch causes multiple test failures:

The reason is that your other change is interfering; if starting from your PR, I remove the try/except you added in core.py, one also gets failures.

@eerovaher
Copy link
Copy Markdown
Member Author

My original suggestion was to move a shortcut from Generic._do_parse() to _UnitMetaClass. Are you suggesting we should keep the current shortcut in Generic._do_parse() and implement an additional new shortcut in _UnitMetaClass?

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 14, 2024

I'm actually not sure what is the best way right now. Certainly, the failure I mentioned above shows that current implementation is a bit broken - see #17010. I'll see if I can find some time over the weekend to solve that - that may also make clear what the best path forward is.

@mhvk
Copy link
Copy Markdown
Contributor

mhvk commented Sep 15, 2024

I found that removing the short-circuit showed quite a few bugs - those also show up if one tries to create composite units, and are fixed in #17011. I think this in itself makes the case stronger not to try too much short-circuiting in the parsers - they should just parse! So, I had a trial for moving it up in #17012.

A shortcut for non-composite units from the `Generic._do_parse()` method
(inherited by all `astropy` unit formats that can parse) is made more
effective by moving it to `Unit` because the shortcut is now attempted
earlier. This does slow down parsing simple strings with the unit parser
`parse()` methods, but those are not meant to be called by users
anyways. They are public mostly to define our API for downstream
developers.
@eerovaher eerovaher force-pushed the faster-simple-unit-parsing branch from 6ca0c90 to ecdd92b Compare September 30, 2024 15:23
Copy link
Copy Markdown
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again, this really is a fine & elegant solution. Let's just get it in!

@mhvk mhvk merged commit 04dfd24 into astropy:main Sep 30, 2024
@eerovaher eerovaher deleted the faster-simple-unit-parsing branch September 30, 2024 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

benchmark Run benchmarks for a PR Performance units

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants