-
Notifications
You must be signed in to change notification settings - Fork 29.8k
Description
Picking up from the conversation that began in #48598...
The above mentioned pull request was a manual engine roll to check in golden file changes originating from Skia. The changes were minuscule, affecting two golden files with a 0.08% pixel difference detected by Gold.
As discussed in the PR, manual engine rolls are undesirable, so I'd like to discuss potential changes that could solve the need for a manual engine roll.
The current pre-submit construct that led to the manual engine roll was introduced in October (#43371). With that change, golden files became associated with commits on the master branch of flutter/flutter, and as such, files can only be added/modified via changes applied to the flutter/flutter repository.
Tryjobs were enabled in December (#44474), adding the ability to generate and approve changes on the dashboard from open pull requests prior to landing on master.
Regarding flutter/engine, if a golden file has changed, the pre-submit check of the engine roll PR will fail and automatically close, stopping the engine roll until the current engine sheriff resolves the failure.
This is 😄 good when there is an unintentional change, which the engine sheriff can then revert and restore auto-rolling.
This is 😞 bad when an expected or trivial change is made that should be approved, but the bot closes the pull request - resulting in the need for a manual engine roll.
I've thought a lot about different options here and wanted to lay them out for feedback and input from better-informed parties. Some of theses are current non-solutions, along with why, but may be possible to change to suit our needs.
1. flutter/engine Gold instance
- While looking into solutions for web testing, @yjbanov had a flutter/engine instance of Gold spun up. It does not appear to be in use currently as web testing has been enabled in flutter/flutter using its associated Gold instance. @GaryQian suggested keeping multiple sets of golden files for both the engine and the repo.
- Drawbacks:
- This would not provide a means of checking in golden changes from the engine to the framework
- A method of reconciliation across instances would be needed to make this work, which I don't believe Gold is currently designed for/would probably be a lot of work on the Gold-side.
2. Testing individual flutter/engine PRs against flutter/flutter golden files
- Individual PRs in flutter/engine do not currently run golden file tests, making the engine roll the gatekeeper.
- These tests could be enabled using Gold's
checkfeature, which does not need to be associated with a commit or pull request in flutter/flutter. This would then require adding or retrieving thegoldctlbinary for engine testing. - Drawbacks
- Intended golden file changes would fail for a given PR with no means to check it in to the flutter/flutter instance. PR could not land.
- I think this would effectively kick the can down the road, introducing the same issue to the Skia and Dart rolls if they were to affect golden files.
3. Setting up the flutter/engine auto-roller bot to detect golden file changes
- Add a feature to the bot that detects when it has failed checks due to a golden file change. This could take the form of preventing the bot from automatically closing the PR, and instead alerting the engine roll sheriff. The sheriff could then approve/revert the staged golden file changes under Changelists on the dashboard and the roller could proceed.
- Drawbacks:
- I am not sure where the bot code lives, or what it would entail to have to make this type of change.
4. Setting Gold to pass changes below a certain tolerance level
- The change that necessitated a manual engine roll above was tiny. It may be possible to set Gold such that a change < 1% (or whatever value deemed appropriate) is passable and automatically approved.
- Drawbacks:
- This could leave us vulnerable to undesirable engine changes landing and not being aware of them. Regressions a la a thousand tiny cuts et cetera..
- Setting a tolerance limit could be unreliable.
- This does nothing to help larger intended changes land without a manual engine roll.
- Would apply to all repositories, not just flutter/engine
5. Expand Gold's --upload-only feature to pre-empt golden file changes
- The
goldctltool currently has an--upload-onlyflag that may be able to be used to add golden files from the engine. - Drawbacks:
- Currently, it only works with a valid commit on master for flutter/flutter
- In order to preempt the golden file change, the contributor would have to generate the images on all supported platforms. Gold usually takes care of this for us, and is starting to expand further to cover more platforms and configurations, making this harder to do effectively in a manual fashion. cc/ @blasten who is looking into expanding Gold further to device-specific golden files, and @chingjun + @yjbanov who have been working on various browsers' support.
- Any contributor that sought to upload golden files ahead of time, if made possible, would also need the
goldctltool and a method of authentication.
6. Gold's notification system can alert the engine roller sheriff
- @kjlubick is currently working on a notification system for Gold, wherein comments will be made on related changes when Gold receives them. (ref: https://bugs.chromium.org/p/skia/issues/detail?id=9006). This could be utilized such that the engine sheriff is alerted to address the golden file change, in whatever way is appropriate per the method of handling changes.
- Drawbacks:
- This alone would not solve the need for a manual engine roll. Recommended for use in conjunction with another solution. :)
7. Engine sheriff can ignore failing tests and triage after auto roller lands
- Gold has an ignore feature that can be used to disregard a failing test. If an engine roll were to introduce a golden file change, the engine sheriff could put an ignore in place and then triage it after the engine change lands.
- Drawbacks:
- The triage-after-landing method was unreliable prior to tryjobs in flutter/flutter. If a contributor forgot to triage an image after it landed, which may take an hour or two to appear based on how we prioritize cirrus tasks, any contributor that picked up the un-triaged change would experience test failures.
- The sheriff would also have to remember to remove the ignore rule in Gold after triaging.
- Ignoring test results is sad.
8. Continue to manually roll changes
- AFAIK, the manual engine roll above is the first that has been needed since October when the current Gold implementation landed. This doesn't necessarily mean that golden file changes from the engine happen so infrequently, but could be a possibility. We could decide to maintain and observe for a bit to see how often this would be needed over time. I could also be unaware of other manual engine rolls, a quick search didn't come up with anything.
- Drawbacks:
- Manually rolling the engine.
Ultimately, at some point in the process, a new or updated golden file needs human approval. This is the same for flutter rolls when rolling into google, and was the model for the new Flutter Gold system. I think this system can be extended further for a better experience in the engine, and I welcome all thoughts and feedback! :)
cc-ing some folks that may have some thoughts about this. Thanks in advance & feel free to cc anyone else that may be interested.
@Hixie @goderbauer @dnfield @kjlubick @cbracken