Skip to content

Validate inputs and outputs from custom time formats#9375

Merged
taldcroft merged 6 commits intoastropy:masterfrom
aarchiba:validate-jd1-jd2
Oct 25, 2019
Merged

Validate inputs and outputs from custom time formats#9375
taldcroft merged 6 commits intoastropy:masterfrom
aarchiba:validate-jd1-jd2

Conversation

@aarchiba
Copy link
Contributor

Currently implementing custom time formats is difficult because the jd1 and jd2 values they produce are not validated; and the values they return when asked to export jd1 and jd2 are constrained by peculiar restrictions. This code enforces that jd1 and jd2 are floats or arrays of floats with matching shapes. It also permits, and tests, a variety of peculiar return types.

The main headache on the return types is the issue of "boxing" scalars in zero-dimensional arrays and unboxing them.

@mhvk
Copy link
Contributor

mhvk commented Oct 16, 2019

I'm not sure I like this - it means slicing, converting, and everything else will slow down because of the extra validation - much of the code counts on from_jd=True to be fast.

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

@aarchiba
Copy link
Contributor Author

I'm not sure I like this - it means slicing, converting, and everything else will slow down because of the extra validation - much of the code counts on from_jd=True to be fast.

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

If speed is a concern, do you have benchmarks I can test to see if this makes a significant difference?

@aarchiba
Copy link
Contributor Author

That said, I think there is room for improvement: the TimeFormat.set_jds() method could actually check that jd1 and jd2 are plain arrays with dtype=float64 and then do the setting; that way, super().set_jds() becomes useful.

Similarly, the base to_value() could check that whatever is passed through is an actual array (there are no other constraints on it).

The problem is that both set_jds and to_value are designed to be overridden by subclasses, so I can't use anything I put in there to validate values produced by custom formats. And, further unfortunately, the way set_jds is specified to work is by setting the jd1 and jd2 attributes on its object, so there's no way short of intercepting those to ensure that bogus values do not get set.

I will also point out that Time.jd1 and Time.jd2 are already properties that retrieve TimeFormat.jd1 and TimeFormat.jd2. As for the validation, np.broadcast is designed to be fast; it doesn't even copy data to create an apparently larger object, and I'm pretty sure it's implemented in C. The data type validation is all just inspecting metadata, and for whole arrays.

The messiest of the lot, the .value, doesn't occur at all when shoveling jd1 and jd2 around, it's only run when you extract some other format. And again, its a series of metadata checks followed by at most one unboxing and at most one type conversion - which are all there because we were already requiring those formats in our tests.

@mhvk
Copy link
Contributor

mhvk commented Oct 16, 2019

No, speed is not my only concern; it is also that this becomes too complicated for my taste. E.g., I'm not sure there is a big benefit to allowing python float for jd1 and jd2.

Note that I don't quite see the problem with subclasses overriding set_jds - this code is mostly for new subclasses, so if we change how we use it ourselves and document that one should call super().set_jds(...), I think we're OK (though it will be annoying that the super call fails on older astropy... an argument to make sure anything we do is in place for 4.0, which is LTS).

p.s. I may have to take a bit of a break of reviewing Time issues - I really need to focus on Quantity issues...

@aarchiba
Copy link
Contributor Author

No, speed is not my only concern; it is also that this becomes too complicated for my taste. E.g., I'm not sure there is a big benefit to allowing python float for jd1 and jd2.

I don't like it and didn't want to allow it but a number of tests started to fail when I tried making everything a float64, let alone enforcing array-ness. So our existing test suite (who knows about users?) relies on the existing ability for jd1 and jd2 to be python floats. That's not something that I introduced. Indeed I had planned not to.

Note that I don't quite see the problem with subclasses overriding set_jds - this code is mostly for new subclasses, so if we change how we use it ourselves and document that one should call super().set_jds(...), I think we're OK (though it will be annoying that the super call fails on older astropy... an argument to make sure anything we do is in place for 4.0, which is LTS).

There's quite a lot of difference between asking users to make a non-enforced code change (calling super, and presumably at the end?) and adding checks that the data is already in the right format that are transparent to users. In particular, asking users to call super() will not necessarily make writing custom time formats easier; it will provide another way for them to get it silently wrong. If you're asking them to change all their set_jd methods, why not ask them to call a new method that does the validation I currently have in properties? That's an easy change from my end.

Similarly, if you're going to change how they produce values, why not require them to use to_value instead of value, and have value always have the same implementation (call to_value with no arguments)? Well, okay, this will cause existing unchanged code to fail in mysterious ways, but so does adding a new requirement that they go through super().

p.s. I may have to take a bit of a break of reviewing Time issues - I really need to focus on Quantity issues...

No worries. I'll keep filing bugs and submitting PRs as long as I feel like I'm accomplishing something. I should probably get back to PINT and try to do some actual science with it anyway.

@mhvk
Copy link
Contributor

mhvk commented Oct 16, 2019

t._time.jd1 ever being a python float now is (yet another?) bug...

@aarchiba
Copy link
Contributor Author

t._time.jd1 ever being a python float now is (yet another?) bug...

Ah! In that case I can go squash it everywhere, including the tests that require it. That will simplify this code on both ends.

@aarchiba
Copy link
Contributor Author

t._time.jd1 ever being a python float now is (yet another?) bug...

There is one test that checks that (without setting the writeable flag to false) Time objects made from arrays are writeable but those made from scalars are not. If I always box scalars into arrays, this property changes. Should I change the test, set the writeable flag automatically for zero-dimensional arrays, or, uh, maybe just for scalars that aren't arrays?

For now I think I'll just change the test. "All Time objects are writeable or not as reflected by their flag" is easy to explain.

@aarchiba
Copy link
Contributor Author

Oh dear. Writability and broadcasting is a challenge. This may affect code without this PR too. I'd better write a test or two.

@bsipocz bsipocz added this to the v4.0 milestone Oct 21, 2019
@bsipocz bsipocz requested a review from taldcroft October 21, 2019 21:19
@taldcroft
Copy link
Member

taldcroft commented Oct 22, 2019

I just looked through this. In general the design philosophy has been that we test our own format classes to ensure compliance with internal requirements on jd1 and jd2, and expect developers of custom formats to do the same.

Having said that I can see an argument for being a little more helpful, but I do share the concerns of @mhvk about performance and complexity. The new _shaped_like_input is noticeably more difficult to parse (for me at least) in terms of functional work because of being intertwined with validation code.

Maybe my main question is whether it is possible to leave _shaped_like_input alone (at least for now) and only include the new validation in the jd1/jd2 setters. It seems that once the TimeFormat jd1, jd2 are validated and created, then most of the potential issues are caught. A custom format class has to really mess up to make an output value that would get caught by the new tests in _shape_like_input. In particular passing a function by forgetting to make value a property seems too far into the weeds of obscure problems, and is immediately obvious in testing.

The boxing seems reasonable to me, and I like that the scalar case behaves consistent with arrays for being writable. The new input validation in the jd1/2 setters looks good as well.


t = Time.now()
with pytest.raises(AttributeError):
with pytest.raises(TypeError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the raising of AttributeError was a result of the existing implementation and came with an impenetrable message and backtrace. The new code is able to detect the problem and raise whichever exception seems suitable; in this case TypeError seemed appropriate. Only mis-implemented non-functional custom formats - like those I worked through trying to get one working - trigger this exception, so I do not expect to break any code by changing the exception; the previous exception was never chosen, just an artifact of the previous implementation.

else:
stat = os.statvfs(path)
return stat.f_bavail * stat.f_frsize
return shutil.disk_usage(path).free
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how that got in there - it's from PR #9182 . I'll clear it out.

@aarchiba
Copy link
Contributor Author

Having said that I can see an argument for being a little more helpful, but I do share the concerns of @mhvk about performance and complexity. The new _shaped_like_input is noticeably more difficult to parse (for me at least) in terms of functional work because of being intertwined with validation code.

I don't really see how there is interleaving? The first part confirms that it's an array, and unboxes it if it's safe to do so. (Only safe to unbox if it's zero-dimensional and not masked.)

The second part de-numpifies various types, so that users get string types and floats instead of the numpy equivalents.

All of its functionality is driven by the needs of existing test cases, with the exception of the explicit checks that jd1/jd2 are arrays.

Maybe my main question is whether it is possible to leave _shaped_like_input alone (at least for now) and only include the new validation in the jd1/jd2 setters. It seems that once the TimeFormat jd1, jd2 are validated and created, then most of the potential issues are caught. A custom format class has to really mess up to make an output value that would get caught by the new tests in _shape_like_input. In particular passing a function by forgetting to make value a property seems too far into the weeds of obscure problems, and is immediately obvious in testing.

That test is in there because I made precisely that mistake and debugging it was fairly difficult, particularly in light of the paucity of documentation. The other validation is because I made several other errors while trying to implement a custom format that were also very hard to debug, not least because it was hard to tell whether the problem was upon construction and creation of jd1/jd2 or construction of the return values. I was tempted to put in another stage of validation that confirmed that the return value matched the shape of the jd1/jd2 arrays but considering possible return arrays (for example ymdhms) it was pretty awkward.

Of course most of the things I pass as return values from custom formats are preposterous; the point is that we can handle whatever type the custom format wants to use.

The old version is much less consistent about unboxing - it fails with a mysterious error unless the returned value is an array type, it unboxes datetimes whether they are masked or not, and it returns float64s and strings rather than numpy versions - unless the custom format provides the non-numpy versions. If you tried to document its behaviour in terms of what values it deals with and what types come out, it would be a complicated business. The new version is actually much simpler in the sense that it has many fewer special cases and it is much less dependent on exactly what the custom format returns as a value.

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aarchiba - I took another pass through this, mostly looking at the changes in _shaped_like_input. Overall I agree the new version is more straightforward and looks generally good. Since this is a very core function in Time, I'm quite conservative with changes since we cannot put 100% faith in our test coverage (despite best efforts). This is especially the case since it is designed to be user-extendable. So please have patience with me. 😄

if not self._time.jd1.shape and not np.ma.is_masked(value):
out = value[()] if value.dtype.fields else value.item()
return out
if not isinstance(self._time.jd1, np.ndarray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand how can this test ever succeed given the new jd1 setter which enforces that it must be ndarray? I'm assuming code is not messing with self._time._jd1 (where basically all bets are off).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. I have resisted the temptation to leave an assertion.

else:
out = value
# De-numpify types for the convenience of callers
if isinstance(out, np.float64):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are generally no restrictions on the type of value produced by a time format. The existing Time API is that scalar values are always converted to a Python-native type where possible via item() and we should maintain that. For test cases consider:

    @property
    def value(self):
        return self.jd1.astype(int)

    @property
    def value(self):
        out = self.jd1 + 1j * self.jd2
        return out

    @property
    def value(self):
        out = np.empty(shape=self.jd1.shape, dtype='S10')
        out[()] = b'junk'
        return out

    @property
    def value(self):
        out = np.empty(shape=self.jd1.shape, dtype='O')
        out[()] = {'a': 1}
        return out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered by test_custom_format_can_return_any_scalar.

Should this include datetime64? The pre-existing test test_datetime64_no_format demands that datetime64 be able emerge from a Time object, but .item() turns that into a python datetime. There's also record dtypes, which lost their field names and accessors when .item()ized. (Both of these lead to test failures, so I have special-cased them.)

return float(out)
elif isinstance(out, np.str_):
return str(out)
elif callable(out) and not hasattr(self._time.__class__.value, "fget"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation is quite clear that value needs to be a property, so I think this test is not appropriate here and ask that you remove it. A better place to do this validation is when a new format is added to the registry. Maybe you can use your experience to add other tests at that point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a cleaner check there, thank you.

CHANGES.rst Outdated
- Time formats that do not use ``val2`` now raise ValueError instead of
silently ignoring a provided value. [#9373]

- Custom time formats (implemented in subclasses of ``TimeFormat``) now have
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to all Time formats, not only "custom" ones. I usually think of "custom" being user-defined. The PR title should be updated as well.

@bsipocz
Copy link
Member

bsipocz commented Oct 23, 2019

@aarchiba - this needs now a rebase.

jd1 and jd2 must be arrays of floats, and if one is an array they
will be broadcast together so that they are the same shape. If any of
this is not true an exception will be raised.

value can be more or less anything and if jd1/jd2 are zero-dimensional
then value will be unboxed and converted to a conventional python type
with a few exceptions. Masking prevents unboxing, and neither datetime64
nor record dtypes are converted to python equivalents.

Finally the otherwise hard-to-debug error of failing to mark value as a
property is caught at the time of construction of a new custom format
class.
@aarchiba
Copy link
Contributor Author

gerebaseerd

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aarchiba - Just teeny comments now. Overall this looks wonderful now, thanks for the work!!

I see that it seems there are real test fails on Windows. I didn't look into that in detail.

"to specified scale '{}', got error:\n{}"
.format(self.name, self.epoch_scale,
self.scale, err))
self.scale, err)) from err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about this syntax, but I like it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's new. The tracebacks can be long but they can be lifesavers especially when the problem shows up in CI. I haven't systematically gone through and applied it everywhere, but I couldn't resist adding it when I saw an opportunity.

Not relevant to astropy but six has a function raise_from to let you do this even on python2.

@aarchiba
Copy link
Contributor Author

I'm really not sure what's going on with the Windows failures. The reason Windows fails in ways Linux doesn't is because it doesn't have a long double type, so np.longdouble==np.float64. I don't know how that translates into the error we're seeing though. I'll keep looking. It may take several cycles through CI, though, as I don't really have a Windows machine I can try this on. I'll post something when it's sorted.

@mhvk
Copy link
Contributor

mhvk commented Oct 24, 2019

Since longdouble will be allowed in #9361, do you want to just remove those tests? Alternatively, I'm playing with

    @pytest.mark.skipif(np.finfo(np.longdouble).eps >= np.finfo(float).eps,
                        reason="long double is the same as float")

p.s. The matplotlib errors seem unrelated. A rebase might get rid of those...

@pllim
Copy link
Member

pllim commented Oct 24, 2019

MPL tests are broken right now, just ignore them. A rebase won't fix it (yet).

@aarchiba
Copy link
Contributor Author

Victory! Happy to squash and rebase if desired (sorry about all the microcommits).

@bsipocz
Copy link
Member

bsipocz commented Oct 24, 2019

@aarchiba - 6 commits for 200+ lines is great, so no need to squash.

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aarchiba - this is great improvement and I'm happy for your new perspective and inputs to the Time class!

@taldcroft taldcroft merged commit 9d4af76 into astropy:master Oct 25, 2019
@aarchiba
Copy link
Contributor Author

No problem. Thanks for putting up with my stubbornness!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants