Improve A/V sync instrumentation#77
Merged
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR introduces comprehensive A/V synchronization testing and instrumentation for the C64 Stream plugin. It adds two C64 test programs (manual and automatic modes) that generate deterministic A/V "pops" (white screen flashes with audio tones), along with debug-only detection logic in the OBS plugin to measure and log sync timing deltas.
Key changes:
- New C64 assembly programs for generating synchronized A/V test patterns with PAL/NTSC detection
- Debug-mode edge detection for video (all-white frames) and audio (non-silent samples) in the plugin
- CSV instrumentation extensions that append
is_all_whiteandhas_signalcolumns when debug logging is enabled - Cross-thread A/V sync delta logging at OBS submission time
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/c64/av-sync.asm | Updated manual A/V sync test with improved color scheme (white instead of blue), PAL/NTSC frequency detection, and consistent pulse waveform |
| tools/c64/av-sync-auto.asm | New automatic A/V sync test that generates one-frame pops every 48 frames using raster IRQs |
| tools/c64/README.md | Added documentation for the new av-sync-auto.asm program |
| src/c64-types.h | Added fields for A/V sync state tracking: timestamps, counters, and edge detection flags |
| src/c64-video.c | Implemented all-white frame detection and video pop edge detection with delta logging |
| src/c64-audio.c | Implemented audio signal detection and audio pop edge detection with delta logging, plus OBS submission timing |
| src/c64-protocol.c | Added packet-level payload inspection for white video data (0x11 bytes) and non-silent audio samples |
| src/c64-record.c | Added csv_debug_enabled snapshot during recording session initialization |
| src/c64-record-video.c | Updated video recording call to pass is_all_white parameter |
| src/c64-record-obs.h | Updated function signatures to include is_all_white and has_signal parameters |
| src/c64-record-obs.c | Modified CSV output to conditionally append debug columns based on csv_debug_enabled flag |
| src/c64-record-network.h | Updated function signatures to include is_all_white and has_signal parameters |
| src/c64-record-network.c | Modified network CSV output to conditionally append debug columns |
| doc/av-sync.md | Comprehensive new documentation explaining A/V sync testing methodology, program behavior, and CSV interpretation |
| doc/e2e.md | Added documentation about optional debug columns in obs.csv |
… tests - Introduced a new document detailing the frame progression marker detection algorithm, including marker design, detection algorithm, adaptive thresholds, and testing procedures. - Removed outdated acceptance test scripts and README related to the C64 Stream plugin, which are no longer relevant to the current testing framework. - Deleted scanline test script and properties ini defaults test, as they are superseded by new testing methodologies.
8 tasks
- Rename av_pop_delta → av_pop_offset assertion across codebase - Fix UDP generator timing to use correct offset from get_sync_timing_info() - Implement MP4 pop extraction (video + audio detection) - Add obs.log parsing for A/V offset validation (optional) - Implement strict per-pop offset validation (≤40ms for CSV, ≤1000ms for MP4) - CSV sources show excellent alignment: obs.csv 2.35ms max, network.csv 16.80ms max - All local tests passing Resolves full-frame pop detection issues in ntsc_default_debug scenario
… tests - Adjusted resource.csv to reflect new performance metrics, including CPU, RAM, and GPU usage. - Updated resource.json with revised duration, sample counts, and statistical values for CPU, RAM, and GPU metrics. - Modified validation_results.json to include updated packet reception, frame processing, video recording details, and network timing statistics. - Changed AV pop delta to AV pop offset with new metrics indicating a failure due to excessive pop offset in MP4.
- Updated video pop detection: >80% of pixels with RGB >= 0xE0 (was: all pixels >= 0xF0) - Allows for VPL color variations and CRT effects - More robust detection for test scenarios - Updated audio pop detection: 100+ samples exceed 0xFF (was: 8+ samples > ±512) - More reliable signal detection, ignores background noise - Consistent with test data generation - Added unified 'AV SYNC:' logging when both video and audio pops detected - Logs offset in milliseconds (1 decimal place) - Includes pop counts, wall clock timestamp, frame numbers, and timestamps - Provides all technical info needed for debugging A/V sync issues - Only active when debug logging enabled (zero performance impact otherwise) - Added AvSyncLogValidationAssertion to verify plugin logging - Validates AV SYNC log entries are internally consistent - Checks for duplicate pop detections from video/audio handlers - Verifies pop counts are reasonable - Cross-validates with CSV data where timing aligns - Added ntsc_default_debug scenario overrides - properties.ini with debug_logging=true - Enables real-time A/V sync logging for test validation All local tests passing, ready for CI validation.
chrisgleissner
commented
Jan 8, 2026
chrisgleissner
left a comment
Owner
Author
There was a problem hiding this comment.
@copilot Address all review comments
Contributor
|
@chrisgleissner I've opened a new pull request, #79, to work on those changes. Once the pull request is ready, I'll request review from you. |
Verifies that the debug_logging checkbox in plugin properties actually works as expected - when enabled, debug logs appear; when disabled, they don't. Changes: - Created DebugLogPresenceAssertion to validate debug logs are present - Performs spot checks on network, socket, and protocol debug log patterns - Validates that a minimum number of debug log lines appear (default: 10) - Added debug_log_presence to ntsc_default_debug scenario assertions The assertion verifies the checkbox mechanism works without requiring: - Special OBS startup flags - Manual OBS configuration changes - Restarting OBS This ensures users can simply tick the 'Debug logs' checkbox in plugin properties and instantly see debug logs appear in OBS logs. Testing: - Local E2E test passes with 3/3 assertions (av_pop_offset, av_sync_log_validation, debug_log_presence) - Validates 277 debug log lines appear during test run - Confirms network operations, socket creation, and protocol commands are logged when debug_logging=true
…low (#79) * Initial plan * Consolidate A/V sync logs and remove duplicate documentation Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com> * Extend real-device-av-sync.sh with configurable OBS/MP4 analysis and cross-platform documentation Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com> * Update tests/e2e/real-device-av-sync.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com> Co-authored-by: Christian Gleissner <chrisgleissner@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Adjusted CPU, RAM, and GPU metrics in resource.csv and resource.json to reflect new performance data. - Updated duration and sample statistics in resource.json. - Corrected packet reception details in validation_results.json. - Changed video recording size and packet integrity duration in validation_results.json. - Updated AV pop offset status from fail to pass with new metrics in validation_results.json. - Revised per-pop offsets for OBS, network, and MP4 sources in validation_results.json.
- Fix audio frame calculation to use int() for proper frame alignment - Rename av_pop_offset to av_sync_offset for consistency - Add debug_logging: true to ntsc_default_debug scenario - Fix UDP buffer prompt to check for existing config file - A/V offsets now 14-21ms (under 20ms target)
- Change audio packet frame assignment from int() truncation to using packet midpoint - This ensures audio packets are aligned with the frame they overlap most - Rename av_pop_offset assertion to av_sync_offset for consistency - Add debug_logging: true to ntsc_default_debug scenario - This addresses the A/V sync offset detection alignment
- Replace os_gettime_ns() (monotonic) with c64_get_millis() (CLOCK_REALTIME) for wall clock timestamps - Fixes timestamps showing 1970-01-01 epoch time instead of actual date - Add verbose debug logging to c64_debug_handle_audio_pop() for state machine tracing - Update PLANS.md with comprehensive A/V sync fix plan and progress tracking Wall clock timestamps now show actual date (e.g., 2026-01-08_12:47:56) instead of epoch time. Debug logging added to help diagnose off-by-one audio pop detection issue.
- Use rising edge timestamp for audio pop correlation with video pop start - Implement smart pairing: pops within 500ms (24 frames) are related - Increase audio threshold to 16384 (half of 16-bit max) for full-volume square waves - Add 4ms minimum duration validation to filter glitches - Log 'unpaired' keyword for pops from different cycles - Update assertion parser to handle unpaired pops This fixes false A/V sync offset reporting in OBS logs where multiple audio pops were incorrectly paired with single video pops, showing ~1000ms offsets instead of the actual <20ms sync. Real recordings already had excellent sync (p50=6.6ms).
Root cause: Packet filtering logic added after commit 8be672c was checking expected_peer_ip and dropping all packets that didn't match the configured IP address. This caused: - Synthetic E2E tests to fail (mock C64U sends from localhost but properties may expect different IP) - Real device tests to fail (real C64U at 192.168.1.13 but properties configured for 127.0.0.1 from previous E2E run) Evidence from logs: - tests/e2e/results/real_c64u_av_sync/session_20260109_072136/obs_log.txt: 'Dropping video UDP packet from 192.168.1.13 (expected 127.0.0.1)' Impact: - Before fix: 28% packet reception (5376/19252 packets) - After fix: 84% packet reception (16195/19252 packets) The packet filtering was too strict and caused more problems than it solved. UDP streams don't need source IP filtering - the binding to specific ports provides sufficient isolation. Remaining 16% packet loss appears to be test harness timing issues, not actual plugin problems. CI builds show 100% reception, suggesting the loss is related to longer test durations or local environment factors.
CRITICAL BUG FIXED: Synthetic E2E tests were receiving packets from real C64 device instead of mock C64U, causing all tests to fail incorrectly. Root cause: - UDP sockets bound to 0.0.0.0:port receive packets from ANY source IP - Real C64U at 192.168.1.13 was streaming to ports 21000/21001 - Mock C64U also sends to ports 21000/21001 - Plugin received BOTH streams, mixing real and synthetic packets - Sequence numbers proved pollution (real packets start at >1000, mock at 0) Changes: 1. Added packet source verification in e2e.py validate_test_results() - Checks first video/audio sequence numbers - Fails test with clear error if sequence >1000 (real device packets) - Provides actionable error message about cross-pollution 2. Added C64U reset call in e2e.sh before synthetic tests - Attempts to stop real C64 from streaming - Warns user if C64U configured for wrong ports 3. Changed real-device-av-sync.sh default ports to 11000/11001 - Real device now uses C64U hardware default ports (11000/11001) - Synthetic tests use ports 21000/21001 - Port isolation prevents cross-pollution even if both active Remaining work: - Manual workaround: Reconfigure C64U to use ports 11000/11001 via web UI - Future: Add C64U port reconfiguration via REST API if available - Test sequence: synth→real→synth→real to verify isolation works This is a critical fix that makes test results valid and reliable.
Added detailed documentation explaining: - Root cause of cross-pollution (UDP port binding behavior) - Evidence from sequence number analysis - Port isolation solution (synthetic=21000/21001, real=11000/11001) - Manual C64U reconfiguration steps (web UI, REST API, config file) - Verification test procedures - Technical details about UDP binding and sequence numbers This documents the critical blocker preventing synthetic E2E tests from working. Next step: Manually reconfigure C64U to use ports 11000/11001, then retest.
…ce tests Root cause: OBS stores plugin state in scene JSON files, which takes precedence over properties.ini. The template scenes have c64_host=127.0.0.1 hardcoded for synthetic E2E tests. When real device tests copied this template, the hardcoded value overrode the properties.ini setting (c64_host=c64u), causing the plugin to connect to 127.0.0.1 instead of the real C64 Ultimate at 192.168.1.13. Fix: RealDeviceE2E.create_obs_profile() now calls _clear_scene_plugin_state() which removes the hardcoded c64_host from the scene JSON after copying the template. This allows properties.ini to take precedence, enabling real device tests to connect to the correct host. Also fixed syntax error in e2e.sh stop_real_c64_streaming() function. Verified: - Real device test now receives packets from 192.168.1.13 (16248 packets) - Synthetic test still works with mock C64U (sequence numbers 0,1,2...) - No cross-pollution: synth→real→synth sequence properly isolated - Real device test resolves 'c64u' to 192.168.1.13 correctly
…or real device tests
…cket handling in E2E tests
…ved E2E test reliability
…acket loss analysis
Move heavy buffering/logging off socket threads using a lock-free SPSC ingest ring, drained by the processor thread. Also avoid 1ms EAGAIN sleeps to reduce receive-time quantization.
…lt scenario; update streaming stop logic in e2e.sh to enhance clarity and prevent cross-pollution warnings.
…tput verification - Introduced `test_network_timing_validation.py` for unit tests on network timing validation logic. - Created `verify_output.py` to verify recorded OBS output against expected patterns, including frame count, order, and A/V sync. - Added `verify_stills.py` to check extracted still images for sync markers. - Implemented `verify_tint.py` to validate green tint in OBS recordings as a preliminary check for visual filters.
- Updated resource.csv with new performance metrics, reflecting changes in CPU, RAM, and GPU usage. - Adjusted resource.json to match the updated metrics, including duration, sample count, and statistical values for CPU, RAM, and GPU. - Modified validation_results.json to correct packet reception details and update metrics related to audio/video synchronization. - Enhanced scenario_loader.py to improve path handling for loading presets and templates, ensuring compatibility with repository structure.
- Update workflow to run pytest against tests/e2e/util - Make mock TCP control parsing stream-safe (coalesced/fragmented recv) - Track PLANS.md/INVESTIGATIONS.md and add CI hardening TODOs
… tests - Adjusted CPU, RAM, and GPU metrics in resource.csv and resource.json to reflect new performance data. - Updated duration and sample statistics in resource.json. - Revised validation results in validation_results.json, including packet reception details and AV sync offsets. - Ensured consistency across metrics and validation outputs for accurate performance assessment.
… for improved AV sync handling
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Features: