E2E: Add configurable AV sync tolerance for effect-heavy scenarios#93
Conversation
Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com>
0d9c049
into
fix/e2e-avsync-pop-detection
There was a problem hiding this comment.
Pull request overview
This PR adds configurable AV sync tolerance for E2E test scenarios experiencing false failures due to heavy CRT effects interfering with pop detection on CI. The solution introduces a "lenient" mode that treats sync failures as warnings instead of errors specifically for CI environments.
Changes:
- Added
tolerances.av_syncconfiguration to scenario YAML files supporting bothciandlocalenvironment-specific settings - Implemented loading logic in
e2e.pywith backward compatibility for the legacyav_sync_tolerance_msformat - Updated validation logic in
results.pyto support lenient mode, treating all-pops-out-of-sync as warnings on CI when lenient mode is enabled - Applied lenient CI tolerance to
ntsc_vintage_tvandntsc_classic_crtscenarios
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/e2e/scenarios/ntsc_vintage_tv/scenario.yaml | Added av_sync.ci: lenient tolerance for heavy CRT effects |
| tests/e2e/scenarios/ntsc_classic_crt/scenario.yaml | Added av_sync.ci: lenient tolerance for heavy CRT effects |
| tests/e2e/e2e.py | Implemented tolerance loading logic with CI/local detection and backward compatibility |
| tests/e2e/framework/orchestrator.py | Updated to pass av_sync_tolerance_mode to validator |
| tests/e2e/framework/validation/results.py | Added lenient validation logic that treats sync failures as warnings on CI |
| is_ci = os.environ.get('CI', '').lower() in ('1', 'true', 'yes') or \ | ||
| os.environ.get('GITHUB_ACTIONS', '').lower() in ('1', 'true', 'yes') |
There was a problem hiding this comment.
CI detection logic is duplicated here and at line 172. The Environment class already has a _detect_ci() method that checks multiple CI indicators comprehensively. Consider reusing the Environment's CI detection logic or instantiating Environment earlier to access its is_ci property for consistency.
|
|
||
| # Check for CI-specific or local-specific tolerance | ||
| if is_ci and 'ci' in av_sync_tol: | ||
| av_sync_tolerance_mode = av_sync_tol['ci'] | ||
| elif not is_ci and 'local' in av_sync_tol: | ||
| av_sync_tolerance_mode = av_sync_tol['local'] |
There was a problem hiding this comment.
When tolerances.av_sync is specified but neither 'ci' nor 'local' keys match the current environment, av_sync_tolerance_mode remains None. This could result in unexpected default behavior. Consider adding a fallback to handle cases where the YAML has tolerances.av_sync but doesn't specify the appropriate environment key, or at least add a comment explaining the intended behavior.
| # Check for CI-specific or local-specific tolerance | |
| if is_ci and 'ci' in av_sync_tol: | |
| av_sync_tolerance_mode = av_sync_tol['ci'] | |
| elif not is_ci and 'local' in av_sync_tol: | |
| av_sync_tolerance_mode = av_sync_tol['local'] | |
| # Allow both simple scalar and environment-specific mappings for av_sync tolerance | |
| if isinstance(av_sync_tol, (int, float, str)): | |
| # Single value applies to all environments | |
| av_sync_tolerance_mode = av_sync_tol | |
| elif isinstance(av_sync_tol, dict): | |
| # Check for CI-specific or local-specific tolerance first | |
| if is_ci and 'ci' in av_sync_tol: | |
| av_sync_tolerance_mode = av_sync_tol['ci'] | |
| elif not is_ci and 'local' in av_sync_tol: | |
| av_sync_tolerance_mode = av_sync_tol['local'] | |
| # Fallbacks when environment-specific key is missing | |
| elif 'default' in av_sync_tol: | |
| av_sync_tolerance_mode = av_sync_tol['default'] | |
| elif 'ci' in av_sync_tol: | |
| # Use CI value for all environments if it's the only one provided | |
| av_sync_tolerance_mode = av_sync_tol['ci'] | |
| elif 'local' in av_sync_tol: | |
| # Use local value for all environments if it's the only one provided | |
| av_sync_tolerance_mode = av_sync_tol['local'] | |
| else: | |
| logging.warning( | |
| "Scenario YAML 'tolerances.av_sync' is present but does not define " | |
| "'ci', 'local', or 'default'; using orchestrator default tolerance." | |
| ) |
| network_simulation: Optional[Dict] = None, | ||
| full_frame_pop: bool = False, | ||
| av_sync_tolerance_ms: int = 40): | ||
| av_sync_tolerance_mode = None): |
There was a problem hiding this comment.
The av_sync_tolerance_mode parameter is missing a type hint. Based on the usage in the code (checking for 'lenient' string and numeric values), it should be typed as Optional[Union[str, int, float]] for clarity and type safety.
| verbose: bool = False, | ||
| full_frame_pop: bool = False, | ||
| av_sync_tolerance_ms: int = 40): | ||
| av_sync_tolerance_mode = None): |
There was a problem hiding this comment.
The av_sync_tolerance_mode parameter is missing a type hint. Based on the usage in the code (checking for 'lenient' string and numeric values), it should be typed as Optional[Union[str, int, float]] for clarity and type safety.
| results['av_sync'] = {'status': status_str, 'details': msg} | ||
| errors.append(f"AV Sync: All {analyzed_count} analyzed pops out of sync - {msg}") | ||
| # All pops are out of sync | ||
| # If lenient mode is enabled and we're on CI, treat as warning instead of error |
There was a problem hiding this comment.
The lenient mode currently only takes effect when both use_lenient_validation is True AND env.is_ci is True. This means if a scenario YAML specifies tolerances.av_sync.local: lenient, it will have no effect since local runs always remain strict. Consider either: (1) documenting this limitation clearly in a comment, or (2) removing support for tolerances.av_sync.local: lenient to avoid confusion, since lenient mode only applies to CI runs.
| # If lenient mode is enabled and we're on CI, treat as warning instead of error | |
| # NOTE: Lenient AV sync validation is intentionally CI-only. | |
| # - We only downgrade "all pops out of sync" from error to warning when BOTH | |
| # `use_lenient_validation` is True AND the environment is CI (`env.is_ci`). | |
| # - Local runs always remain strict, even if a scenario YAML specifies | |
| # `tolerances.av_sync.local: lenient`; that setting has no effect on local runs. | |
| # This is by design to keep local debugging stricter than CI. |
Heavy CRT effects (afterglow, bloom, scanlines) in scenarios like
ntsc_vintage_tvandntsc_classic_crtcan cause pop detection issues on CI, leading to false AV sync failures. The existing code treated all sync failures as hard errors.Changes:
Scenario configuration: Added
av_synctolerance entry under existingtolerancessection in scenario YAML filesLoading logic (
e2e.py): Parsetolerances.av_sync.{ci,local}and pass mode to validation pipeline. Maintains backward compatibility with numericav_sync_tolerance_ms.Validation (
results.py): Whenav_sync_tolerance_mode == 'lenient'andenv.is_ci, treat sync failures as warnings instead of errors. Local runs remain strict.Applied lenient mode to
ntsc_vintage_tvandntsc_classic_crtscenarios. Other scenarios unaffected.💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.