-
Notifications
You must be signed in to change notification settings - Fork 100
Treat flutter-gold failures in flutter/engine as failures.
#3487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Treat flutter-gold failures in flutter/engine as failures.
#3487
Conversation
Piinks
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely check with @mdebbar, I am less familiar here.
I think if the expectation that the engine sheriff will triage images is not working, then this is a good option. It is how we treat the same situation in the framework.
Alternatively, I think Gold has an endpoint to see "are there unapproved images at head?" If we wanted to try making ownership by the sheriff work, we could have a separate check or alert that notifies them. IIRC that endpoint had rate limitations though. @kjlubick may be able to expand more on this. :)
|
https://flutter-engine-gold.skia.org/search is the best way to see all unapproved images at head. It should be reasonably performant (as well as the underlying RPC call, but please try to avoid calling those directly, since we may break you). I think the rate limitation thing @Piinks was thinking of was related to the changes for a particular PR - we use some extra caching for HEAD that isn't there for PRs. |
|
I'm not familiar with the recent developments in the engine golden story. But one thing that came to mind a couple days ago as I was looking into some golden problems was this: cocoon/app_dart/lib/src/request_handlers/push_gold_status_to_github.dart Lines 122 to 127 in f163e34
I think that |
|
I would like a quick write up on which RPCs would be affected on the Gold side before making a call about quotas. |
@kjlubick This doc lists all the RPCs we use, where and how. It is linked here in the wiki: https://github.com/flutter/flutter/wiki/Writing-a-golden-file-test-for-package:flutter and cane be used to write this up if we want to make changes.
@mdebbar That's super intentional (#672 for context).
If the engine has deviated from the expected pattern for naming test shards that execute golden file tests, then that's something we can easily fix. |
|
Just so I understand correctly @Piinks, the configuration pointed out by @mdebbar is |
|
Also while I appreciate the chat here, I'd love to still get this PR approved/landed unless I'm missing something? |
| } | ||
|
|
||
| // Treat any GitHub check run as stale if created over [Config.kGitHubCheckStaleThreshold] hours ago. | ||
| bool isStale(DateTime dateTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the removal of this here. @keyonghan what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly we are trying to flag PRs that have checkruns over 2 hours old so I believe this code needs to stay if you are only removing stuff related to flutter-gold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my bad, I had branched from a stale PR. Reverted.
| final String? name = status.context; | ||
|
|
||
| // If the account author is a roller account do not block merge on flutter-gold check. | ||
| if (isToEngineRoller(author, slug) && name == 'flutter-gold') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only code that needs to be removed here to avoid skipping and enforcing the failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thinks Ricardo, this is my bad - it looks like I had an older branch. Let me redo this fix and tag you.
|
Overall I think this is good but I would like to get @keyonghan to look over the stale checkRun flagging as I am not sure if he is tracking metrics there or we have alerting around that. |
Thanks @ricardoamador. I actually made a mistake and branched from an older The change to |
No. That whole file is the flutter-gold implementation intended to cover flutter/flutter and flutter/engine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @matanlurey !
|
A note about the 170 untriaged web goldens. They all come from Safari + HTML renderer. All of them exhibit bimodality. There are exactly 2 golden variants per test. One hypothesis from @eyebrowsoffire is that our macOS fleet is made of Arm and x64 machines, and our tests get one of them randomly. If Safari is sensitive to Arm vs x64, that would lead to this behavior. Essentially, you send a PR, triage goldens, then submit. However, what you just did is triage a random combination of Arm and x64 goldens. After the PR is submitted, the post-submit instance of LUCI will again randomize the bots that run web tests, and report different golden outcomes in post-submit. |
|
Thanks for explaining! I'm happy to hold this PR until we have consensus on how to handle those diffs. |
|
Goldens are allowed to have multiple images per configuration, so if it was truly bimodal (only), then you'd eventually triage all of them. |
|
I don't have consensus on how to land this yet, so holding off until I chat more with the teams that will be impacted (I'll loop in infra as well). |
Closes flutter/flutter#143352.
In practice this reverts the functionality in flutter/flutter#123979 (#2585). Sorry for the back-and-forth, we tried this model but ended up with (currently) ~170 untriaged failures, so we want to go back to treating these as failures and improve our tools/process if needed instead.
/cc @jonahwilliams and @Piinks