Skip to content

Fidelity testing proposal#305

Merged
cdata merged 3 commits intomasterfrom
fidelity-test
Jan 25, 2019
Merged

Fidelity testing proposal#305
cdata merged 3 commits intomasterfrom
fidelity-test

Conversation

@cdata
Copy link
Contributor

@cdata cdata commented Jan 23, 2019

<model-viewer> fidelity testing proposal

This change proposes a regime for measuring, analyzing and comparing the fidelity of renders produced by <model-viewer> and renders produced by selected reference model viewer implementations, as well as aggregating analyses in order to track significant changes over time.

Highlights of the change include:

  • Fidelity checks are configured with JSON (related types)
  • References recorded as manually staged / saved screenshots (example)
    • As of this change, reference renderers include: Filament and iOS Quick Look
    • Filament's GLTF viewer was adapted for staging consistency, see this fork for details
    • An approximation of Quick Look's skybox has been created for staging <model-viewer>
  • I created a new model to be used as a litmus test for alpha blending behavior
  • A metric for computing perceived color distance is used to compare pixels between renders
    • The core implementation is adapted from a third-party library called pixelmatch
  • npm run check-fidelity reports the current working tree's fidelity relative to configured references
  • npm run compare-fidelity $GIT_REF reports the fidelity of a Git ref, announcing notable changes or regressions compared to the current working tree
  • When a CI build completes, npm run check-fidelity is now run at the end
    • Results also "compared" to most recent tagged release (no additional build step required)
    • The threshold for alerting of a fidelity regression is when there is >=10% increased dissonance in any test at any threshold compared to the last release
  • When a release tag is created, fidelity artifacts are checked in to gh-pages branch as it is deployed
  • A dev-time tool is introduced that visualizes render results next to references, and supports interactive, visual analysis of how any given render compares to another

TODO

  • Document procedure for capturing goldens
  • Document steps to create a new test scenario
  • Add brief description to each scenario describing what it tests

Potential future work

Examples

npm run check-fidelity
check-fidelity
npm run compare-fidelity $SOME_REF
compare-fidelity
Artifacts produced by check-fidelity
artifacts
Visual comparison tool
comparison-tool

Fixes #289

jsantell
jsantell previously approved these changes Jan 24, 2019
Copy link
Contributor

@jsantell jsantell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! This will be valuable to have as glTF matures and as the renderers we are interested in change subjectively, and as we improve our own rendering, it'll be great to have a log of the deltas.

Other note, I think we could use something like a small README in a few dirs (src/test/fidelity, test/fidelity or maybe just test/) as the project grows. Something that I'm curious about, for a contributor, are these any actionable steps they should take when submitting a PR? What do I do with the information of a delta? Or is this more only for our infrastructure, and we will take care of interpreting what these deltas mean?

},
plugins,
onwarn,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build times are already longer than I'd prefer, I don't think we need to rebuild the image comparison on every change during development.

Current build times:

time npm run dev

> @google/model-viewer@0.0.7 build /home/jsantell/Dev/model-viewer
> tsc && rollup -c

./lib/model-viewer.js → ./dist/model-viewer.js...
(!) Circular dependency: lib/model-viewer-base.js -> lib/three-components/Renderer.js -> lib/model-viewer-base.js
created ./dist/model-viewer.js in 2.5s

./lib/test/index.js → ./dist/unit-tests.js...
(!) Circular dependency: lib/model-viewer-base.js -> lib/three-components/Renderer.js -> lib/model-viewer-base.js
created ./dist/unit-tests.js in 1.9s

./examples/dependencies/index.js → ./examples/built/dependencies.js...
created ./examples/built/dependencies.js in 694ms

real	0m10.352s
user	0m18.311s
sys	0m0.663s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unintentional for it to rebuild on every change. Will fix if I see it doing that locally. But, the watch files are configured as:

include: '{lib/test/fidelity/**,lib/third_party/**}'

So it shouldn't be doing that 🤷‍♂️

As far as build times are concerned, we should always strive to reduce them. But, this item should only build once unless it is being actively dev'd on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed, it only builds these new targets when I change files that correspond to the configuration.

rollup.config.js Outdated
output: {
file: './dist/image-comparison-app.js',
sourcemap: true,
format: 'umd',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we gain anything from UMDifying these builds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

"resize-observer-polyfill": "^1.5.0",
"rimraf": "^2.6.2",
"rollup": "^0.66.0",
"rollup-plugin-cleanup": "^3.0.0-beta.1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With these new dependencies, I can definitely see the image comparison being a separate (in repo?) module in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably. But, I think it should probably bake a bit before we consider splitting it out into its own thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

for (let y = 0; y < height; ++y) {
for (let x = 0; x < width; ++x) {
const index = y * width + x;
const position = index * 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think COMPONENTS_PER_PIXEL should be a constant (IMO one of the few things that can get away with being a magic number in some contexts, e.g. 255) but should be consistent

Copy link
Contributor Author

@cdata cdata Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to think about, but implicit meaning is almost always worse for readability. 255 at least has the context that a Uint8ClampedArray only holds uints in the range 0..255. 4 can't even say that much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smalls
smalls previously approved these changes Jan 24, 2019
`"${slug}/${key}" @ threshold ${threshold}`;
const percentage = `${(delta * 100).toFixed(2)}%`;

if (delta > ALERT_THRESHOLD) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be Math.abs(delta)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and "decreased" on the next line changed)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because it's only bad if the delta is positive. Unless I flipped things around somewhere by accident... either way, the intended behavior is sign-dependent.

Copy link
Contributor

@smalls smalls Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right.. a decrease is good?

I might restructure so the code is (very bad pseudo-code and wording):

if (delta > ALERT) alert
else if (delta > 0) warn('the sources are more different');
else "great! the two sources are less different"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 WIll fix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right.. a decrease is good?

Negative deltas = more similarity than before

comparisonConstraints} decreased by ${percentage}!`);
} else if (Math.abs(delta) > 0) {
const changed =
delta > 0 ? 'decreased' : delta < 0 ? 'increased' : 'changed';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The third case seemed odd in code, although as I wrote it up it makes sense: delta==0, so changed='changed', so the output text is ".. changed by 0.00%".

The code would be clearer if instead of 'changed' this was 'didn''t change', maybe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, this is vestigial actually. Re-reading, the third condition can never be reached (a pre-condition of this path is that delta is not 0). Originally, it was logging every delta (even 0%), and I thought it was too noisy. I'll just remove the third condition.

const position = index * 4;
const delta =
colorDelta(candidateImage, goldenImage, position, position);
const bool = (delta < thresholdSquared ? 1 : 0) * 255;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bool short for Boolean, or something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is short for boolean. Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(originally shortened because it overlaps w/ the TypeScript type of the same name)

this.candidateContext = candidateCanvas.getContext('2d');
this.goldenCanvas = goldenCanvas;
this.goldenContext = goldenCanvas.getContext('2d');
this.booleanCanvas = booleanCanvas;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boolean (as in booleanCanvas) feels like a type, rather than a comparison algorithm - although I think you are referring to the comparison algorithm. Maybe a different name that'd be more descriptive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Open to suggestions. I'll try to think of a better name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming's hard.

The goal is to mark pixels that differ, without shading according to the amount of difference (which the deltaCanvas does), right?

Making it longer might help.. booleanComparisonCanvas, or comparisonAsBooleanCanvas (vs comparisonAsGradient, for the currently delta, maybe).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just toss a comment on the variable declaration, that'd work too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to change this to blackWhite* throughout

are intend to compare to include:

- [iOS Quick Look](https://developer.apple.com/arkit/gallery/)
- [Filament GLTF Viewer](https://github.com/google/filament/blob/master/samples/gltf_viewer.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think the preferred capitalization is "glTF" (even when in a title): https://www.khronos.org/gltf/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will fix

<li>Top left: metallic 1, specular 1, roughness 0</li>
<li>Top right: metallic 1, specular 1, roughness 1</li>
<li>Bottom right: metallic 0, specular 0, roughness 0</li>
<li>Bottom left: metaalic 0, specular 0, roughness 0</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metallic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 will fix

smalls
smalls previously approved these changes Jan 25, 2019
@smalls
Copy link
Contributor

smalls commented Jan 25, 2019

Nice, I dig the README.md!

@cdata
Copy link
Contributor Author

cdata commented Jan 25, 2019

@jsantell in response to your questions, I have added a brief blurb on analyzing Travis build results to our wiki: https://github.com/GoogleWebComponents/model-viewer/wiki/Understanding-Travis-builds

@cdata cdata merged commit 1426922 into master Jan 25, 2019
@cdata cdata deleted the fidelity-test branch January 25, 2019 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants