Validate inputs and outputs from custom time formats by aarchiba · Pull Request #9375 · astropy/astropy

aarchiba · 2019-10-16T00:50:55Z

Currently implementing custom time formats is difficult because the jd1 and jd2 values they produce are not validated; and the values they return when asked to export jd1 and jd2 are constrained by peculiar restrictions. This code enforces that jd1 and jd2 are floats or arrays of floats with matching shapes. It also permits, and tests, a variety of peculiar return types.

The main headache on the return types is the issue of "boxing" scalars in zero-dimensional arrays and unboxing them.

mhvk · 2019-10-16T15:20:51Z

I'm not sure I like this - it means slicing, converting, and everything else will slow down because of the extra validation - much of the code counts on from_jd=True to be fast.

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

aarchiba · 2019-10-16T15:48:54Z

I'm not sure I like this - it means slicing, converting, and everything else will slow down because of the extra validation - much of the code counts on from_jd=True to be fast.

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

If speed is a concern, do you have benchmarks I can test to see if this makes a significant difference?

aarchiba · 2019-10-16T15:54:54Z

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

The problem is that both set_jds and to_value are designed to be overridden by subclasses, so I can't use anything I put in there to validate values produced by custom formats. And, further unfortunately, the way set_jds is specified to work is by setting the jd1 and jd2 attributes on its object, so there's no way short of intercepting those to ensure that bogus values do not get set.

I will also point out that Time.jd1 and Time.jd2 are already properties that retrieve TimeFormat.jd1 and TimeFormat.jd2. As for the validation, np.broadcast is designed to be fast; it doesn't even copy data to create an apparently larger object, and I'm pretty sure it's implemented in C. The data type validation is all just inspecting metadata, and for whole arrays.

The messiest of the lot, the .value, doesn't occur at all when shoveling jd1 and jd2 around, it's only run when you extract some other format. And again, its a series of metadata checks followed by at most one unboxing and at most one type conversion - which are all there because we were already requiring those formats in our tests.

mhvk · 2019-10-16T16:13:31Z

No, speed is not my only concern; it is also that this becomes too complicated for my taste. E.g., I'm not sure there is a big benefit to allowing python float for jd1 and jd2.

Note that I don't quite see the problem with subclasses overriding set_jds - this code is mostly for new subclasses, so if we change how we use it ourselves and document that one should call super().set_jds(...), I think we're OK (though it will be annoying that the super call fails on older astropy... an argument to make sure anything we do is in place for 4.0, which is LTS).

p.s. I may have to take a bit of a break of reviewing Time issues - I really need to focus on Quantity issues...

aarchiba · 2019-10-16T16:26:23Z

No, speed is not my only concern; it is also that this becomes too complicated for my taste. E.g., I'm not sure there is a big benefit to allowing python float for jd1 and jd2.

I don't like it and didn't want to allow it but a number of tests started to fail when I tried making everything a float64, let alone enforcing array-ness. So our existing test suite (who knows about users?) relies on the existing ability for jd1 and jd2 to be python floats. That's not something that I introduced. Indeed I had planned not to.

Note that I don't quite see the problem with subclasses overriding set_jds - this code is mostly for new subclasses, so if we change how we use it ourselves and document that one should call super().set_jds(...), I think we're OK (though it will be annoying that the super call fails on older astropy... an argument to make sure anything we do is in place for 4.0, which is LTS).

There's quite a lot of difference between asking users to make a non-enforced code change (calling super, and presumably at the end?) and adding checks that the data is already in the right format that are transparent to users. In particular, asking users to call super() will not necessarily make writing custom time formats easier; it will provide another way for them to get it silently wrong. If you're asking them to change all their set_jd methods, why not ask them to call a new method that does the validation I currently have in properties? That's an easy change from my end.

Similarly, if you're going to change how they produce values, why not require them to use to_value instead of value, and have value always have the same implementation (call to_value with no arguments)? Well, okay, this will cause existing unchanged code to fail in mysterious ways, but so does adding a new requirement that they go through super().

p.s. I may have to take a bit of a break of reviewing Time issues - I really need to focus on Quantity issues...

No worries. I'll keep filing bugs and submitting PRs as long as I feel like I'm accomplishing something. I should probably get back to PINT and try to do some actual science with it anyway.

mhvk · 2019-10-16T18:21:34Z

t._time.jd1 ever being a python float now is (yet another?) bug...

aarchiba · 2019-10-16T18:58:08Z

t._time.jd1 ever being a python float now is (yet another?) bug...

Ah! In that case I can go squash it everywhere, including the tests that require it. That will simplify this code on both ends.

aarchiba · 2019-10-18T16:42:48Z

t._time.jd1 ever being a python float now is (yet another?) bug...

There is one test that checks that (without setting the writeable flag to false) Time objects made from arrays are writeable but those made from scalars are not. If I always box scalars into arrays, this property changes. Should I change the test, set the writeable flag automatically for zero-dimensional arrays, or, uh, maybe just for scalars that aren't arrays?

For now I think I'll just change the test. "All Time objects are writeable or not as reflected by their flag" is easy to explain.

aarchiba · 2019-10-18T18:28:01Z

Oh dear. Writability and broadcasting is a challenge. This may affect code without this PR too. I'd better write a test or two.

taldcroft · 2019-10-22T11:41:29Z

I just looked through this. In general the design philosophy has been that we test our own format classes to ensure compliance with internal requirements on jd1 and jd2, and expect developers of custom formats to do the same.

Having said that I can see an argument for being a little more helpful, but I do share the concerns of @mhvk about performance and complexity. The new _shaped_like_input is noticeably more difficult to parse (for me at least) in terms of functional work because of being intertwined with validation code.

Maybe my main question is whether it is possible to leave _shaped_like_input alone (at least for now) and only include the new validation in the jd1/jd2 setters. It seems that once the TimeFormat jd1, jd2 are validated and created, then most of the potential issues are caught. A custom format class has to really mess up to make an output value that would get caught by the new tests in _shape_like_input. In particular passing a function by forgetting to make value a property seems too far into the weeds of obscure problems, and is immediately obvious in testing.

The boxing seems reasonable to me, and I like that the scalar case behaves consistent with arrays for being writable. The new input validation in the jd1/2 setters looks good as well.

taldcroft · 2019-10-22T11:42:30Z

astropy/time/tests/test_custom_formats.py


    t = Time.now()
-    with pytest.raises(AttributeError):
+    with pytest.raises(TypeError):


Why these changes?

In this case the raising of AttributeError was a result of the existing implementation and came with an impenetrable message and backtrace. The new code is able to detect the problem and raise whichever exception seems suitable; in this case TypeError seemed appropriate. Only mis-implemented non-functional custom formats - like those I worked through trying to get one working - trigger this exception, so I do not expect to break any code by changing the exception; the previous exception was never chosen, just an artifact of the previous implementation.

taldcroft · 2019-10-22T11:42:44Z

astropy/utils/data.py

-    else:
-        stat = os.statvfs(path)
-        return stat.f_bavail * stat.f_frsize
+    return shutil.disk_usage(path).free


I don't know how that got in there - it's from PR #9182 . I'll clear it out.

aarchiba · 2019-10-22T13:53:38Z

Having said that I can see an argument for being a little more helpful, but I do share the concerns of @mhvk about performance and complexity. The new _shaped_like_input is noticeably more difficult to parse (for me at least) in terms of functional work because of being intertwined with validation code.

I don't really see how there is interleaving? The first part confirms that it's an array, and unboxes it if it's safe to do so. (Only safe to unbox if it's zero-dimensional and not masked.)

The second part de-numpifies various types, so that users get string types and floats instead of the numpy equivalents.

All of its functionality is driven by the needs of existing test cases, with the exception of the explicit checks that jd1/jd2 are arrays.

Maybe my main question is whether it is possible to leave _shaped_like_input alone (at least for now) and only include the new validation in the jd1/jd2 setters. It seems that once the TimeFormat jd1, jd2 are validated and created, then most of the potential issues are caught. A custom format class has to really mess up to make an output value that would get caught by the new tests in _shape_like_input. In particular passing a function by forgetting to make value a property seems too far into the weeds of obscure problems, and is immediately obvious in testing.

That test is in there because I made precisely that mistake and debugging it was fairly difficult, particularly in light of the paucity of documentation. The other validation is because I made several other errors while trying to implement a custom format that were also very hard to debug, not least because it was hard to tell whether the problem was upon construction and creation of jd1/jd2 or construction of the return values. I was tempted to put in another stage of validation that confirmed that the return value matched the shape of the jd1/jd2 arrays but considering possible return arrays (for example ymdhms) it was pretty awkward.

Of course most of the things I pass as return values from custom formats are preposterous; the point is that we can handle whatever type the custom format wants to use.

The old version is much less consistent about unboxing - it fails with a mysterious error unless the returned value is an array type, it unboxes datetimes whether they are masked or not, and it returns float64s and strings rather than numpy versions - unless the custom format provides the non-numpy versions. If you tried to document its behaviour in terms of what values it deals with and what types come out, it would be a complicated business. The new version is actually much simpler in the sense that it has many fewer special cases and it is much less dependent on exactly what the custom format returns as a value.

taldcroft

@aarchiba - I took another pass through this, mostly looking at the changes in _shaped_like_input. Overall I agree the new version is more straightforward and looks generally good. Since this is a very core function in Time, I'm quite conservative with changes since we cannot put 100% faith in our test coverage (despite best efforts). This is especially the case since it is designed to be user-extendable. So please have patience with me. 😄

taldcroft · 2019-10-23T11:26:15Z

astropy/time/core.py

-        if not self._time.jd1.shape and not np.ma.is_masked(value):
-            out = value[()] if value.dtype.fields else value.item()
-        return out
+        if not isinstance(self._time.jd1, np.ndarray):


Can you help me understand how can this test ever succeed given the new jd1 setter which enforces that it must be ndarray? I'm assuming code is not messing with self._time._jd1 (where basically all bets are off).

I think you're right. I have resisted the temptation to leave an assertion.

taldcroft · 2019-10-23T13:26:07Z

astropy/time/core.py

+            else:
+                out = value
+        # De-numpify types for the convenience of callers
+        if isinstance(out, np.float64):


There are generally no restrictions on the type of value produced by a time format. The existing Time API is that scalar values are always converted to a Python-native type where possible via item() and we should maintain that. For test cases consider:

@property def value(self): return self.jd1.astype(int) @property def value(self): out = self.jd1 + 1j * self.jd2 return out @property def value(self): out = np.empty(shape=self.jd1.shape, dtype='S10') out[()] = b'junk' return out @property def value(self): out = np.empty(shape=self.jd1.shape, dtype='O') out[()] = {'a': 1} return out

This is covered by test_custom_format_can_return_any_scalar.

Should this include datetime64? The pre-existing test test_datetime64_no_format demands that datetime64 be able emerge from a Time object, but .item() turns that into a python datetime. There's also record dtypes, which lost their field names and accessors when .item()ized. (Both of these lead to test failures, so I have special-cased them.)

taldcroft · 2019-10-23T13:36:59Z

astropy/time/core.py

+            return float(out)
+        elif isinstance(out, np.str_):
+            return str(out)
+        elif callable(out) and not hasattr(self._time.__class__.value, "fget"):


The documentation is quite clear that value needs to be a property, so I think this test is not appropriate here and ask that you remove it. A better place to do this validation is when a new format is added to the registry. Maybe you can use your experience to add other tests at that point?

It's a cleaner check there, thank you.

taldcroft · 2019-10-23T13:48:24Z

CHANGES.rst

 - Time formats that do not use ``val2`` now raise ValueError instead of
  silently ignoring a provided value. [#9373]

+- Custom time formats (implemented in subclasses of ``TimeFormat``) now have


This applies to all Time formats, not only "custom" ones. I usually think of "custom" being user-defined. The PR title should be updated as well.

bsipocz · 2019-10-23T20:35:19Z

@aarchiba - this needs now a rebase.

jd1 and jd2 must be arrays of floats, and if one is an array they will be broadcast together so that they are the same shape. If any of this is not true an exception will be raised. value can be more or less anything and if jd1/jd2 are zero-dimensional then value will be unboxed and converted to a conventional python type with a few exceptions. Masking prevents unboxing, and neither datetime64 nor record dtypes are converted to python equivalents. Finally the otherwise hard-to-debug error of failing to mark value as a property is caught at the time of construction of a new custom format class.

aarchiba · 2019-10-23T20:43:37Z

gerebaseerd

taldcroft

@aarchiba - Just teeny comments now. Overall this looks wonderful now, thanks for the work!!

I see that it seems there are real test fails on Windows. I didn't look into that in detail.

astropy/time/core.py

taldcroft · 2019-10-24T13:05:38Z

astropy/time/formats.py

                                  "to specified scale '{}', got error:\n{}"
                                  .format(self.name, self.epoch_scale,
-                                          self.scale, err))
+                                          self.scale, err)) from err


I didn't know about this syntax, but I like it!

It's new. The tracebacks can be long but they can be lifesavers especially when the problem shows up in CI. I haven't systematically gone through and applied it everywhere, but I couldn't resist adding it when I saw an opportunity.

Not relevant to astropy but six has a function raise_from to let you do this even on python2.

astropy/time/formats.py

aarchiba · 2019-10-24T14:32:47Z

I'm really not sure what's going on with the Windows failures. The reason Windows fails in ways Linux doesn't is because it doesn't have a long double type, so np.longdouble==np.float64. I don't know how that translates into the error we're seeing though. I'll keep looking. It may take several cycles through CI, though, as I don't really have a Windows machine I can try this on. I'll post something when it's sorted.

mhvk · 2019-10-24T17:01:08Z

Since longdouble will be allowed in #9361, do you want to just remove those tests? Alternatively, I'm playing with

    @pytest.mark.skipif(np.finfo(np.longdouble).eps >= np.finfo(float).eps,
                        reason="long double is the same as float")

p.s. The matplotlib errors seem unrelated. A rebase might get rid of those...

pllim · 2019-10-24T17:06:49Z

MPL tests are broken right now, just ignore them. A rebase won't fix it (yet).

aarchiba · 2019-10-24T20:56:36Z

Victory! Happy to squash and rebase if desired (sorry about all the microcommits).

bsipocz · 2019-10-24T21:02:25Z

@aarchiba - 6 commits for 200+ lines is great, so no need to squash.

taldcroft

Thanks @aarchiba - this is great improvement and I'm happy for your new perspective and inputs to the Time class!

aarchiba · 2019-10-26T01:14:26Z

No problem. Thanks for putting up with my stubbornness!

astropy-bot bot added the time label Oct 16, 2019

aarchiba mentioned this pull request Oct 16, 2019

Allow longdouble values through input validation #9368

Merged

aarchiba force-pushed the validate-jd1-jd2 branch from 242e11b to 9a41fbc Compare October 18, 2019 17:24

bsipocz added this to the v4.0 milestone Oct 21, 2019

bsipocz requested a review from taldcroft October 21, 2019 21:19

taldcroft reviewed Oct 22, 2019

View reviewed changes

taldcroft reviewed Oct 23, 2019

View reviewed changes

aarchiba force-pushed the validate-jd1-jd2 branch from f85962f to c9133f5 Compare October 23, 2019 20:43

aarchiba added 2 commits October 23, 2019 22:05

flake8 fix, rebase doesn't go through pre-commit

f2b3636

Fix and test for subtle broadcasting bug

f7a994a

taldcroft reviewed Oct 24, 2019

View reviewed changes

Cleanup and hopefully fix Windows

b3c40a1

Skip long double test if no long doubles

e2d9f22

Flip direction of skip test

051a2ba

taldcroft approved these changes Oct 25, 2019

View reviewed changes

taldcroft merged commit 9d4af76 into astropy:master Oct 25, 2019

Uh oh!

Conversation

aarchiba commented Oct 16, 2019

Uh oh!

mhvk commented Oct 16, 2019

Uh oh!

aarchiba commented Oct 16, 2019

Uh oh!

aarchiba commented Oct 16, 2019

Uh oh!

mhvk commented Oct 16, 2019

Uh oh!

aarchiba commented Oct 16, 2019

Uh oh!

mhvk commented Oct 16, 2019

Uh oh!

aarchiba commented Oct 16, 2019

Uh oh!

aarchiba commented Oct 18, 2019

Uh oh!

aarchiba commented Oct 18, 2019

Uh oh!

taldcroft commented Oct 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aarchiba commented Oct 22, 2019

Uh oh!

taldcroft left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bsipocz commented Oct 23, 2019

Uh oh!

aarchiba commented Oct 23, 2019

Uh oh!

taldcroft left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

aarchiba commented Oct 24, 2019

Uh oh!

mhvk commented Oct 24, 2019

Uh oh!

pllim commented Oct 24, 2019

Uh oh!

aarchiba commented Oct 24, 2019

Uh oh!

bsipocz commented Oct 24, 2019

Uh oh!

taldcroft left a comment

Choose a reason for hiding this comment

Uh oh!

aarchiba commented Oct 26, 2019

Uh oh!

taldcroft commented Oct 22, 2019 •

edited

Loading