MainThreadMonitor: fixed flakiness in CI #2517

Merged
NachoSoto merged 1 commit into main from main-thread-monitor-threshold on May 19, 2023
NachoSoto (Contributor)

Fixes https://app.circleci.com/pipelines/github/RevenueCat/purchases-ios/11230/workflows/78444bf6-22b8-40fc-ad92-1f29279377d0/jobs/69663
`MainThreadMonitor` is meant to detect deadlocks. A threshold of 1 second is unfortunately too low for CI machines with limited resources.

NachoSoto added the test label May 19, 2023
NachoSoto requested a review from a team May 19, 2023 20:05
codecov Bot commented May 19, 2023

Codecov Report

Merging #2517 (7ee5304) into main (55818e7) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #2517      +/-   ##
==========================================
- Coverage   87.83%   87.82%   -0.01%     
==========================================
  Files         199      199              
  Lines       13647    13647              
==========================================
- Hits        11987    11986       -1     
- Misses       1660     1661       +1     

see 1 file with indirect coverage changes

joshdholtz (Member) left a comment
LGTM 🚀

NachoSoto merged commit d5b9236 into main May 19, 2023
NachoSoto deleted the main-thread-monitor-threshold branch May 19, 2023 23:57
NachoSoto added a commit that referenced this pull request May 25, 2023
This was referenced May 31, 2023
NachoSoto added a commit that referenced this pull request Jun 15, 2023
See also #2517.

We keep getting flaky failures from this in CI, which is annoying because a failure marks the whole test run as failed and the run isn't retried.
That happens because we have to use `fatalError` and can't use `fail()`: the detection has to happen off the main thread.
If the main thread is blocked, well, we can't rely on `XCTest` to run on the main thread.

Ultimately, we can't guarantee that CI machines will be fast enough to keep the main thread from getting blocked for a while.
The purpose of this class was to detect deadlocks. Increasing the threshold to 30 seconds pretty much avoids flaky failures on slow CI machines, while still letting us detect a main thread that is deadlocked due to a locking issue.
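
The conversation does not show the `MainThreadMonitor` source, so the following is only a minimal sketch of the kind of check described above, not the actual implementation. The names `queueResponds` and `startMonitoring`, the half-second polling interval, and the `queue` parameter are all hypothetical; only the use of `fatalError` and the 30-second threshold come from the commit message.

```swift
import Foundation

// Hypothetical helper: returns true if `queue` runs a trivial block within `threshold`.
// It must be called from a thread that `queue` does not depend on.
func queueResponds(_ queue: DispatchQueue, within threshold: DispatchTimeInterval) -> Bool {
    let semaphore = DispatchSemaphore(value: 0)
    // If the queue is deadlocked, this block never runs and the wait below times out.
    queue.async { semaphore.signal() }
    return semaphore.wait(timeout: .now() + threshold) == .success
}

// Hypothetical monitor loop: polls from a background queue and crashes on a timeout.
// `fatalError` is used because a blocked main thread cannot report an XCTest failure.
func startMonitoring(_ queue: DispatchQueue = .main,
                     threshold: DispatchTimeInterval = .seconds(30)) {
    DispatchQueue.global(qos: .utility).async {
        while true {
            guard queueResponds(queue, within: threshold) else {
                fatalError("Queue was blocked for longer than \(threshold)")
            }
            // Assumed polling interval, chosen only for illustration.
            Thread.sleep(forTimeInterval: 0.5)
        }
    }
}
```

With a short threshold, a slow but healthy CI machine can trip the `guard` and crash the test process. A threshold of 30 seconds tolerates that slowness, and a real deadlock still times out because the signalling block never runs.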