-
Notifications
You must be signed in to change notification settings - Fork 100
DNS: Prototype for how to implement inherited FP accuracy tests #1402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I have re-expressed our fundamental accuracies, i.e. correctly rounded, absolute, and ULP, in terms of intervals. Then built up a framework to express other functions' accuracies in terms of these building blocks. This allows for composition of various blocks to express complex accuracies like inherited accuracies. I have shown how cos, sin, and / can be rewritten using this framework, and implemented tan which our current framework does not support. As is, all of the cases I have implemented are montonic, I believe this framework can be extended to support non-monotic functions via custome impl/range implementations in a straight forward manner. There is a bunch of duplicated patterns that I could refactor, and the specific usage of inheritence can be revisited ina productionized version of this. A production version of this would probably involve a signicant revamp of the existing test runner/compartor code, since this supplants a lot of the functionality in there and I just kinda bodge them together. This does not handle accepting both rounding modes for quantization, I have just forced quantization before sending it off, but our current implementation doesn't actually handle this consistently either as-is. I think adding this handling into my prototype would be easier, since it would just be expanding out how normal vs subnormal is handled.
dneto0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this hangs together really well. It's a nice framing of the problem: separating concerns and making good abstractions.
I was slowed down by lack of comments but I think I pieced it together.
I'm confused by the Abs builder's impl implementation.... Seems wrong. I think the others look good/ reasonable.
@ben-clayton should comment on how this replaces existing float comparator infra. I'm less familiar (i think it's evolved a lot).
src/webgpu/util/fp_interval.ts
Outdated
| private readonly builder; | ||
|
|
||
| public constructor() { | ||
| this.builder = new ULPFPIntervalBuilder(2.5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a detail, but the ULP only applies in a certain range. (I think it's the range of normal values?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The WGSL definition of ulp I don't think restricts to normal values. It just says floating point numbers.
zoddicus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank for the comments @dneto0
Given that overall approach appears to be sound, I will start working on a more formal design doc.
src/webgpu/util/fp_interval.ts
Outdated
| private readonly builder; | ||
|
|
||
| public constructor() { | ||
| this.builder = new ULPFPIntervalBuilder(2.5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The WGSL definition of ulp I don't think restricts to normal values. It just says floating point numbers.
- Renamed FPInterval to F32Interval - Integrated functional changes from Ben - Implemented handling of undefined accuracy per discussion with David - Added a bit of documentation
|
I have done a major rewrite of this, integrating in Ben's change to functional coding as well as other things that we discussed like handling inputs where accuracy isn't defined. PTAL |
src/webgpu/shader/execution/expression/binary/f32_arithmetic.spec.ts
Outdated
Show resolved
Hide resolved
ben-clayton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Design is looking much nicer! Thank you for the changes!
|
Previews, as seen when this build job started (3843173): |
|
PTAL, I have done a major update of this prototype, porting over all of the existing FP tests to use the interval mechanism and refactoring/reorganizing the code. I am going to start working on converting the test runner/float match code to be orientated around intervals, so we can get rid of the 'cute' interval comparator hacks. That I expect to be the last update for this prototype, other than maybe responses to review comments. For actually productionizing/landing this giant blob, I plan to break this up into a series of PRs and have them review separately. Specifically I am expecting to first land a patch with the core framework code + unit tests. Then a series of patches that convert tests and add unittests to the helpers, then finally a patch that removes the old framework code and replaces it with the interval-specific code. Then I can work on adding new tests. |
|
Previews, as seen when this build job started (b631343): |
This removes a bunch of dead code/logic related to floating point tests that is not needed when using intervals.
|
I removed FloatMatch and related complexity, still need to still promote intervals as a direct expectation, so that the IntervalComparator can be removed. |
dneto0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked in detail at f32_interval.ts, compare.ts, and spot checked a bunch of other things.
Overall this looks very good to me.
+1
src/webgpu/util/f32_interval.ts
Outdated
| } | ||
|
|
||
| /** Convert a point to an interval of absolute error around the point */ | ||
| export function absoluteErrorInterval(n: number, rng: number): F32Interval { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please explain the 'rng'. I don't understand the mnemonic.
It's the absolute error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do when productionizing
src/webgpu/util/f32_interval.ts
Outdated
| } | ||
|
|
||
| /** For a point operation run its impl over the input that may be an interval */ | ||
| function runPointImpl(x: F32Interval, impl: PointToInterval): F32Interval { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth noting that using this function on an interval only makes sense when the impl function is monotonic over the input interval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do when productionizing
|
Previews, as seen when this build job started (8e367da): |
This allows for specifying test cases using a bare interval instead of having to wrap it with a comparator. This also reduces the number of comparators implemented to just anyOf and implicit ones used for packing test cases.
|
PTA(final)L, I have moved intervals up to being a type of expectation, so that they don't need to be wrapped in a comparator. This I think is more or less the final form that I expect the production version of this to look like before I start adding in new tests and handling of non-monotonic functions. I am going to start breaking this up into reviewable/landable PRs, unless there is any significant objections. |
|
Previews, as seen when this build job started (298f3c4): |
ben-clayton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done a few skims over this, and I like it. Good job!
One question about quantizing.
|
|
||
| const makeCase = (x: number): Case => { | ||
| return makeUnaryF32Case(x, Math.ceil); | ||
| return makeUnaryF32IntervalCase(x, ceilInterval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to quantize to f32 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, makeUnaryF32IntervalCase is responsible for quantizing to f32. I moved the quantization to the builders, since it needs to be done for every case for all of the test types, and made it explicit in the framework where/how the conversion is done.
This is also done right before calculating the acceptance interval, so we get rid of situations where we were actually calculating the interval for the non-quanitized input, which implementation will never actually see and getting subtle bugs.
Adds in machinary for specifying where extrema occur for non-monotonic functions. This is achieved via seperating the API that is called by tests and the implementation to allow for an additional optional extrema function to be defined. This allows moving the atan2 complexity related to multiple return values @ y = 0 from the test into the framework implementation, which will help when using atan2 to declare other tests. I also cleaned up a bunch of little naming things that my IDE was bugging me about.
|
Added one last change that adds support for non-monotonic functions |
|
Previews, as seen when this build job started (2e732de): |
| function runBinaryImpl(x: F32Interval, y: F32Interval, impl: BinaryToInterval): F32Interval { | ||
| function runBinaryOp(x: F32Interval, y: F32Interval, op: BinaryToIntervalOp): F32Interval { | ||
| if (op.extrema !== undefined) { | ||
| [x, y] = op.extrema(x, y); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm confused. For atan2, when the passed-in y interval contains 0, the resulting y interval is just [0,0].
I'm not sure how that percolates through the rest of the calculation. Do we get the span over undefined values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{[0,0], x} will be passed into atan2, which will hit the special case for y === 0, which will create a span of {4096 * ulp(π), 4096 * ulp(0), 4096 * ulp(π)} (The zero term could be dropped since the other terms are on either side)
So the acceptance interval will be [-π, π].
Honestly atan2 is a hot mess of a trig function.
Taking a look at what Vulkan does. It looks like they are differentiating between the different quadrants based on the signs of y & x, and considering [0, 0] as undefined.
I think I need to rewrite this a bit given what Vulkan is doing and looking at the definition of atan2, I will put up another patch with corrected calculations that we can discuss in detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have this cleaned up now. I have ended up implementing:
y != 0 && x != 0 -> 4096 * ulp(Math.atan2(y, x))
y == 0 && x == 0 -> Infinite (undefined)
y == 0 && x != 0 -> [-π, π]
I think in theory we should be able to use x > 0 vs x < 0 to distinguish between positive/negative π cases, but everytime I try that either SwiftShader or Nvidia will start failing tests due to having the other sign for some x values.
|
Previews, as seen when this build job started (05cde31): |
This implements the F32Interval class and fundamental accuracy functions described in my prototype for reworking, gpuweb#1402, how floating point tests are written. This also adds in testing that did not appear in my prototype. There are a couple of functions that are not currently used that I have included in this patch that have suppressions for unused-vars on them. I included them in this patch since they are logically grouped with this code. These supressions will be removed in future patches. Issue gpuweb#792
This implements the F32Interval class and fundamental accuracy functions described in my prototype, #1402, for reworking how floating point tests are written. This also adds in testing that did not appear in my prototype. There are a couple of functions that are not currently used that I have included in this patch that have suppressions for unused-vars on them. I included them in this patch since they are logically grouped with this code. These surpressions will be removed in future patches. Review discovered a bug in oneULP's implementation, which is fixed in this PR. Issue #792
|
This has been fully implemented in |
I have re-expressed our fundamental accuracies, i.e. correctly
rounded, absolute, and ULP, in terms of intervals. Then built up a
framework to express other functions' accuracies in terms of these
building blocks.
This allows for composition of various blocks to express complex
accuracies like inherited accuracies.
I have shown how
cos,sin, and/tests can be rewritten usingthis framework, and implemented
tantests which our currentframework does not support.
As is, all of the cases I have implemented are monotonic. I believe
this framework can be extended to support non-monotonic functions via
custom impl/range implementations in a straightforward manner.
There is a bunch of duplicated patterns that I could refactor, and the
specific usage of inheritance can be revisited in a productionized
version of this.
A production version of this would probably involve a significant revamp
of the existing test runner/comparator code, since this supplants a lot
of the functionality in there and I just kinda bodge them together.
This does not handle accepting both rounding modes for quantization, I
have just forced quantization before sending it off, but our current
implementation doesn't actually handle this consistently either
as-is. I think adding this handling into my prototype would be easier,
since it would just be expanding out how normal vs subnormal is
handled.