Update E2E Test Results #101

Merged: chrisgleissner merged 6 commits into main from test/update-e2e-results on Jan 23, 2026

Conversation

@chrisgleissner (Owner)

No description provided.

- Fix result deletion: prevent --e2e=foo from deleting other test results
- Add 3s preamble to media source MP4s to account for recording delay
- Adjust orchestrator wait time for media source playback
- Configure ntsc_effects_green_monitor tolerances for tint/preamble effects
- Add green_monitor afterglow threshold to preset assertions

Fixes ntsc_effects_green_monitor test - all assertions now pass.
Copilot AI review requested due to automatic review settings January 23, 2026 14:05

Copilot AI (Contributor) left a comment

Pull request overview

Updates the E2E baselines and adjusts media-mode timing/thresholds so recorded output aligns better with expected playback behavior (especially for the Green Monitor effects scenario).

Changes:

  • Add a media-file preroll (black video + silence) and extend media-mode wait time to cover preroll + content.
  • Tune Green Monitor assertion thresholds/tolerances to reduce flakiness in local runs.
  • Refresh stored E2E result artifacts (logs/metrics/READMEs) across multiple scenarios.

Reviewed changes

Copilot reviewed 35 out of 329 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/e2e/util/preset_assertions.py | Adds Green Monitor afterglow threshold override to reduce false failures. |
| tests/e2e/util/generate_media_source.py | Prepends a preroll to generated media files to better match OBS recording delay characteristics. |
| tests/e2e/scenarios/ntsc_effects_green_monitor/scenario.yaml | Adjusts tolerances for media-source preroll bleed and green-tint afterglow variability. |
| tests/e2e/results/pal_default/resource_usage.csv | Updates stored resource metrics for the PAL Default scenario run. |
| tests/e2e/results/pal_default/resource.json | Updates aggregated resource stats for the PAL Default scenario run. |
| tests/e2e/results/pal_default/obs_stdout.log | Updates captured OBS stdout log for the PAL Default scenario run. |
| tests/e2e/results/pal_default/network.json | Updates stored network timing stats for the PAL Default scenario run. |
| tests/e2e/results/pal_default/README.md | Updates rendered results summary for the PAL Default scenario run. |
| tests/e2e/results/ntsc_vintage_tv/resource_usage.csv | Updates stored resource metrics for the NTSC Vintage TV scenario run. |
| tests/e2e/results/ntsc_vintage_tv/resource.json | Updates aggregated resource stats for the NTSC Vintage TV scenario run. |
| tests/e2e/results/ntsc_vintage_tv/obs_stdout.log | Updates captured OBS stdout log for the NTSC Vintage TV scenario run. |
| tests/e2e/results/ntsc_vintage_tv/network.json | Updates stored network timing stats for the NTSC Vintage TV scenario run. |
| tests/e2e/results/ntsc_vintage_tv/README.md | Updates rendered results summary for the NTSC Vintage TV scenario run. |
| tests/e2e/results/ntsc_green_monitor/validation_results.json | Updates stored validation output for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/resource_usage.csv | Updates stored resource metrics for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/resource.json | Updates aggregated resource stats for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/playback.csv | Updates playback timeline artifact for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/obs_stdout.log | Updates captured OBS stdout log for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/network.json | Updates stored network timing stats for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_green_monitor/README.md | Updates rendered results summary for the NTSC Green Monitor scenario run. |
| tests/e2e/results/ntsc_default_avsync/validation_results.json | Updates stored validation output for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_avsync/resource_usage.csv | Updates stored resource metrics for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_avsync/resource.json | Updates aggregated resource stats for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_avsync/network.json | Updates stored network timing stats for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_avsync/av-sync.csv | Updates stored A/V sync CSV artifact for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_avsync/README.md | Updates rendered results summary for the NTSC Default A/V Sync scenario run. |
| tests/e2e/results/ntsc_default_720p/resource_usage.csv | Updates stored resource metrics for the NTSC Default 720p scenario run. |
| tests/e2e/results/ntsc_default_720p/resource.json | Updates aggregated resource stats for the NTSC Default 720p scenario run. |
| tests/e2e/results/ntsc_default_720p/network.json | Updates stored network timing stats for the NTSC Default 720p scenario run. |
| tests/e2e/results/ntsc_default_720p/README.md | Updates rendered results summary for the NTSC Default 720p scenario run. |
| tests/e2e/results/ntsc_default/validation_results.json | Updates stored validation output for the NTSC Default scenario run. |
| tests/e2e/results/ntsc_default/resource_usage.csv | Updates stored resource metrics for the NTSC Default scenario run. |
| tests/e2e/results/ntsc_default/resource.json | Updates aggregated resource stats for the NTSC Default scenario run. |
| tests/e2e/results/ntsc_default/playback.csv | Updates playback timeline artifact for the NTSC Default scenario run. |
| tests/e2e/results/ntsc_default/network.json | Updates stored network timing stats for the NTSC Default scenario run. |
| tests/e2e/results/ntsc_default/README.md | Updates rendered results summary for the NTSC Default scenario run. |
| tests/e2e/framework/orchestrator.py | Extends media-mode wait time to include the added preroll. |
| local-build.sh | Changes result archiving behavior when copying E2E outputs into scenario result directories. |

Comment thread tests/e2e/util/generate_media_source.py Outdated
Comment on lines +155 to +160
"""Add black video frames and silent audio to the beginning to match UDP preamble.

In UDP mode, there's a ~9-10s preamble showing the C64 logo while waiting for packets.
For media mode, OBS starts recording ~3-4s after playback starts (natural delay).
Use 3s preamble so the natural recording delay skips most black frames.
"""

Copilot AI Jan 23, 2026

✅ Fixed: Updated docstring to clarify this is a black preroll to account for OBS start/recording delays, not matching the UDP logo screen.

Comment thread tests/e2e/util/generate_media_source.py Outdated
Comment on lines +161 to +162
preamble_frames = int(preamble_duration_s * fps)
height, width, channels = frames_rgb.shape[1], frames_rgb.shape[2], frames_rgb.shape[3]

Copilot AI Jan 23, 2026

✅ Fixed: Changed to use math.ceil() to ensure the full preamble duration is met (e.g., NTSC 59.826 Hz now produces 180 frames = 3.0087s instead of 179 frames = 2.992s).
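
To make the arithmetic in this reply concrete, here is a minimal sketch of the truncation versus round-up behaviour. The helper name `preamble_frame_count` is hypothetical and is not taken from `generate_media_source.py`; the frame rates are the ones quoted elsewhere in this thread.

```python
import math

NTSC_FPS = 59.826  # frame rates as quoted in the orchestrator diff
PAL_FPS = 50.125


def preamble_frame_count(duration_s: float, fps: float) -> int:
    """Smallest whole number of frames lasting at least duration_s.

    Hypothetical helper for illustration only.
    """
    return math.ceil(duration_s * fps)


# int() truncates toward zero, so the preroll ends up slightly short.
truncated = int(3.0 * NTSC_FPS)
# math.ceil() rounds up, so the preroll covers the full requested duration.
rounded_up = preamble_frame_count(3.0, NTSC_FPS)
```

For NTSC, `truncated` is 179 frames (about 2.992 s) and `rounded_up` is 180 frames (about 3.0087 s), matching the figures in the reply.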

Comment thread tests/e2e/framework/orchestrator.py Outdated
Comment on lines +149 to +153
  logger.info("🎞️ Media mode: Waiting for OBS media playback...")
  frame_rate = 50.125 if self.format == 'PAL' else 59.826
  duration = self.frames / frame_rate
- time.sleep(duration + 2)
+ # Media files have a 3s preamble, so wait for preamble + content + buffer
+ time.sleep(3.0 + duration + 2)

Copilot AI Jan 23, 2026

✅ Fixed: Created shared constant MEDIA_PREAMBLE_DURATION_S in tests/e2e/util/constants.py. Both generate_media_source.py and orchestrator.py now import and use this constant to prevent drift.
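
A rough sketch of what sharing the constant could look like. Only the name `MEDIA_PREAMBLE_DURATION_S` comes from the reply, and its value of 3.0 is assumed from the literal in the diff. `FRAME_RATES_HZ`, `PLAYBACK_BUFFER_S`, and `media_wait_seconds` are hypothetical names introduced for illustration, not the actual contents of `constants.py` or `orchestrator.py`.

```python
# Would live in a shared module such as tests/e2e/util/constants.py.
MEDIA_PREAMBLE_DURATION_S = 3.0  # value assumed from the 3.0 literal in the diff

# Hypothetical names for values that appear as literals in the diff.
FRAME_RATES_HZ = {"PAL": 50.125, "NTSC": 59.826}
PLAYBACK_BUFFER_S = 2.0


def media_wait_seconds(frames: int, video_format: str) -> float:
    """Seconds to wait in media mode: preamble + content + buffer."""
    # Mirrors the diff: anything other than 'PAL' is treated as NTSC.
    key = "PAL" if video_format == "PAL" else "NTSC"
    content_duration_s = frames / FRAME_RATES_HZ[key]
    return MEDIA_PREAMBLE_DURATION_S + content_duration_s + PLAYBACK_BUFFER_S
```

With this shape, the generator and the orchestrator both read the preroll length from one place, so changing it in the generator cannot silently shorten the wait.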

Comment thread local-build.sh
Comment on lines 1268 to 1270
  if [[ "$has_test_output" == "true" ]]; then
- rm -rf "$results_root_dir"
+ # Create results directory if it doesn't exist (don't delete to preserve other test results)
+ mkdir -p "$results_root_dir"

Copilot AI Jan 23, 2026

✅ Fixed: Changed to use rsync -a --delete to properly sync test output and remove stale files from previous runs, ensuring consistent and clean results.

Comment thread local-build.sh Outdated
Comment on lines 1278 to 1279
# Copy test output, overwriting only the files that were generated
cp -a "$test_output_dir/." "$results_root_dir/"

Copilot AI Jan 23, 2026

✅ Fixed: Replaced cp -a with rsync -a --delete to ensure obsolete files from previous runs are removed, making committed/archived results internally consistent.
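
For readers less familiar with the difference between overlay copying and mirroring, the sketch below approximates the mirroring behaviour in Python. It is an illustration only: `local-build.sh` uses `rsync`, the function name `mirror_directory` is hypothetical, and this version does not reproduce everything `rsync -a` preserves (ownership, for instance). How the script scopes the source and destination directories is not shown in this thread.

```python
import shutil
from pathlib import Path


def mirror_directory(source: Path, destination: Path) -> None:
    """Make destination hold the files in source and nothing else.

    Hypothetical helper: copy everything over, then delete whatever
    exists in destination but not in source (the stale files).
    """
    destination.mkdir(parents=True, exist_ok=True)
    # Overlay copy, similar in spirit to `cp -a src/. dst/` (Python 3.8+).
    shutil.copytree(source, destination, dirs_exist_ok=True)

    # Reverse-sorted paths list children before their parent directory,
    # so a stale directory is already empty when it is removed.
    for path in sorted(destination.rglob("*"), reverse=True):
        if (source / path.relative_to(destination)).exists():
            continue
        if path.is_dir() and not path.is_symlink():
            path.rmdir()
        else:
            path.unlink()
```

An overlay copy alone would leave a file such as an old log from a previous run in place, which is the inconsistency the review comment was about.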

@chrisgleissner chrisgleissner merged commit 88dd9e8 into main Jan 23, 2026
41 checks passed
@chrisgleissner chrisgleissner deleted the test/update-e2e-results branch January 23, 2026 15:08