Skip to content

Improve A/V sync instrumentation#77

Merged
chrisgleissner merged 61 commits into
mainfrom
test/av-sync-instrumentation
Jan 10, 2026
Merged

Improve A/V sync instrumentation#77
chrisgleissner merged 61 commits into
mainfrom
test/av-sync-instrumentation

Conversation

@chrisgleissner

@chrisgleissner chrisgleissner commented Jan 7, 2026

Copy link
Copy Markdown
Owner

Features:

  • Automated A/V pop creation/detection in real C64U/OBS to facilitate cross-platform and end user testing
  • Extend OBS logs with A/V sync details

Copilot AI review requested due to automatic review settings January 7, 2026 08:39

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces comprehensive A/V synchronization testing and instrumentation for the C64 Stream plugin. It adds two C64 test programs (manual and automatic modes) that generate deterministic A/V "pops" (white screen flashes with audio tones), along with debug-only detection logic in the OBS plugin to measure and log sync timing deltas.

Key changes:

  • New C64 assembly programs for generating synchronized A/V test patterns with PAL/NTSC detection
  • Debug-mode edge detection for video (all-white frames) and audio (non-silent samples) in the plugin
  • CSV instrumentation extensions that append is_all_white and has_signal columns when debug logging is enabled
  • Cross-thread A/V sync delta logging at OBS submission time

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tools/c64/av-sync.asm Updated manual A/V sync test with improved color scheme (white instead of blue), PAL/NTSC frequency detection, and consistent pulse waveform
tools/c64/av-sync-auto.asm New automatic A/V sync test that generates one-frame pops every 48 frames using raster IRQs
tools/c64/README.md Added documentation for the new av-sync-auto.asm program
src/c64-types.h Added fields for A/V sync state tracking: timestamps, counters, and edge detection flags
src/c64-video.c Implemented all-white frame detection and video pop edge detection with delta logging
src/c64-audio.c Implemented audio signal detection and audio pop edge detection with delta logging, plus OBS submission timing
src/c64-protocol.c Added packet-level payload inspection for white video data (0x11 bytes) and non-silent audio samples
src/c64-record.c Added csv_debug_enabled snapshot during recording session initialization
src/c64-record-video.c Updated video recording call to pass is_all_white parameter
src/c64-record-obs.h Updated function signatures to include is_all_white and has_signal parameters
src/c64-record-obs.c Modified CSV output to conditionally append debug columns based on csv_debug_enabled flag
src/c64-record-network.h Updated function signatures to include is_all_white and has_signal parameters
src/c64-record-network.c Modified network CSV output to conditionally append debug columns
doc/av-sync.md Comprehensive new documentation explaining A/V sync testing methodology, program behavior, and CSV interpretation
doc/e2e.md Added documentation about optional debug columns in obs.csv

Comment thread src/c64-audio.c Outdated
Comment thread src/c64-video.c
… tests

- Introduced a new document detailing the frame progression marker detection algorithm, including marker design, detection algorithm, adaptive thresholds, and testing procedures.
- Removed outdated acceptance test scripts and README related to the C64 Stream plugin, which are no longer relevant to the current testing framework.
- Deleted scanline test script and properties ini defaults test, as they are superseded by new testing methodologies.
Repository owner deleted a comment from Copilot AI Jan 7, 2026
- Rename av_pop_delta → av_pop_offset assertion across codebase
- Fix UDP generator timing to use correct offset from get_sync_timing_info()
- Implement MP4 pop extraction (video + audio detection)
- Add obs.log parsing for A/V offset validation (optional)
- Implement strict per-pop offset validation (≤40ms for CSV, ≤1000ms for MP4)
- CSV sources show excellent alignment: obs.csv 2.35ms max, network.csv 16.80ms max
- All local tests passing

Resolves full-frame pop detection issues in ntsc_default_debug scenario
… tests

- Adjusted resource.csv to reflect new performance metrics, including CPU, RAM, and GPU usage.
- Updated resource.json with revised duration, sample counts, and statistical values for CPU, RAM, and GPU metrics.
- Modified validation_results.json to include updated packet reception, frame processing, video recording details, and network timing statistics.
- Changed AV pop delta to AV pop offset with new metrics indicating a failure due to excessive pop offset in MP4.
- Updated video pop detection: >80% of pixels with RGB >= 0xE0 (was: all pixels >= 0xF0)
  - Allows for VPL color variations and CRT effects
  - More robust detection for test scenarios

- Updated audio pop detection: 100+ samples exceed 0xFF (was: 8+ samples > ±512)
  - More reliable signal detection, ignores background noise
  - Consistent with test data generation

- Added unified 'AV SYNC:' logging when both video and audio pops detected
  - Logs offset in milliseconds (1 decimal place)
  - Includes pop counts, wall clock timestamp, frame numbers, and timestamps
  - Provides all technical info needed for debugging A/V sync issues
  - Only active when debug logging enabled (zero performance impact otherwise)

- Added AvSyncLogValidationAssertion to verify plugin logging
  - Validates AV SYNC log entries are internally consistent
  - Checks for duplicate pop detections from video/audio handlers
  - Verifies pop counts are reasonable
  - Cross-validates with CSV data where timing aligns

- Added ntsc_default_debug scenario overrides
  - properties.ini with debug_logging=true
  - Enables real-time A/V sync logging for test validation

All local tests passing, ready for CI validation.

@chrisgleissner chrisgleissner left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Address all review comments

Comment thread src/c64-video.c Outdated
Comment thread doc/c64u-stream-spec.md Outdated
Comment thread tests/e2e/real-device-av-sync.sh

Copilot AI commented Jan 8, 2026

Copy link
Copy Markdown
Contributor

@chrisgleissner I've opened a new pull request, #79, to work on those changes. Once the pull request is ready, I'll request review from you.

chrisgleissner and others added 7 commits January 8, 2026 09:40
Verifies that the debug_logging checkbox in plugin properties actually
works as expected - when enabled, debug logs appear; when disabled, they don't.

Changes:
- Created DebugLogPresenceAssertion to validate debug logs are present
- Performs spot checks on network, socket, and protocol debug log patterns
- Validates that a minimum number of debug log lines appear (default: 10)
- Added debug_log_presence to ntsc_default_debug scenario assertions

The assertion verifies the checkbox mechanism works without requiring:
- Special OBS startup flags
- Manual OBS configuration changes
- Restarting OBS

This ensures users can simply tick the 'Debug logs' checkbox in plugin
properties and instantly see debug logs appear in OBS logs.

Testing:
- Local E2E test passes with 3/3 assertions (av_pop_offset,
  av_sync_log_validation, debug_log_presence)
- Validates 277 debug log lines appear during test run
- Confirms network operations, socket creation, and protocol commands
  are logged when debug_logging=true
…low (#79)

* Initial plan

* Consolidate A/V sync logs and remove duplicate documentation

Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com>

* Extend real-device-av-sync.sh with configurable OBS/MP4 analysis and cross-platform documentation

Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com>

* Update tests/e2e/real-device-av-sync.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: chrisgleissner <3969147+chrisgleissner@users.noreply.github.com>
Co-authored-by: Christian Gleissner <chrisgleissner@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Adjusted CPU, RAM, and GPU metrics in resource.csv and resource.json to reflect new performance data.
- Updated duration and sample statistics in resource.json.
- Corrected packet reception details in validation_results.json.
- Changed video recording size and packet integrity duration in validation_results.json.
- Updated AV pop offset status from fail to pass with new metrics in validation_results.json.
- Revised per-pop offsets for OBS, network, and MP4 sources in validation_results.json.
- Fix audio frame calculation to use int() for proper frame alignment
- Rename av_pop_offset to av_sync_offset for consistency
- Add debug_logging: true to ntsc_default_debug scenario
- Fix UDP buffer prompt to check for existing config file
- A/V offsets now 14-21ms (under 20ms target)
- Change audio packet frame assignment from int() truncation to using packet midpoint
- This ensures audio packets are aligned with the frame they overlap most
- Rename av_pop_offset assertion to av_sync_offset for consistency
- Add debug_logging: true to ntsc_default_debug scenario
- This addresses the A/V sync offset detection alignment
- Replace os_gettime_ns() (monotonic) with c64_get_millis() (CLOCK_REALTIME) for wall clock timestamps
- Fixes timestamps showing 1970-01-01 epoch time instead of actual date
- Add verbose debug logging to c64_debug_handle_audio_pop() for state machine tracing
- Update PLANS.md with comprehensive A/V sync fix plan and progress tracking

Wall clock timestamps now show actual date (e.g., 2026-01-08_12:47:56) instead of epoch time.
Debug logging added to help diagnose off-by-one audio pop detection issue.
- Use rising edge timestamp for audio pop correlation with video pop start
- Implement smart pairing: pops within 500ms (24 frames) are related
- Increase audio threshold to 16384 (half of 16-bit max) for full-volume square waves
- Add 4ms minimum duration validation to filter glitches
- Log 'unpaired' keyword for pops from different cycles
- Update assertion parser to handle unpaired pops

This fixes false A/V sync offset reporting in OBS logs where multiple audio pops
were incorrectly paired with single video pops, showing ~1000ms offsets instead
of the actual <20ms sync. Real recordings already had excellent sync (p50=6.6ms).
Root cause: Packet filtering logic added after commit 8be672c was checking
expected_peer_ip and dropping all packets that didn't match the configured
IP address. This caused:
- Synthetic E2E tests to fail (mock C64U sends from localhost but properties
  may expect different IP)
- Real device tests to fail (real C64U at 192.168.1.13 but properties
  configured for 127.0.0.1 from previous E2E run)

Evidence from logs:
- tests/e2e/results/real_c64u_av_sync/session_20260109_072136/obs_log.txt:
  'Dropping video UDP packet from 192.168.1.13 (expected 127.0.0.1)'

Impact:
- Before fix: 28% packet reception (5376/19252 packets)
- After fix: 84% packet reception (16195/19252 packets)

The packet filtering was too strict and caused more problems than it solved.
UDP streams don't need source IP filtering - the binding to specific ports
provides sufficient isolation.

Remaining 16% packet loss appears to be test harness timing issues, not
actual plugin problems. CI builds show 100% reception, suggesting the
loss is related to longer test durations or local environment factors.
CRITICAL BUG FIXED: Synthetic E2E tests were receiving packets from real C64
device instead of mock C64U, causing all tests to fail incorrectly.

Root cause:
- UDP sockets bound to 0.0.0.0:port receive packets from ANY source IP
- Real C64U at 192.168.1.13 was streaming to ports 21000/21001
- Mock C64U also sends to ports 21000/21001
- Plugin received BOTH streams, mixing real and synthetic packets
- Sequence numbers proved pollution (real packets start at >1000, mock at 0)

Changes:
1. Added packet source verification in e2e.py validate_test_results()
   - Checks first video/audio sequence numbers
   - Fails test with clear error if sequence >1000 (real device packets)
   - Provides actionable error message about cross-pollution

2. Added C64U reset call in e2e.sh before synthetic tests
   - Attempts to stop real C64 from streaming
   - Warns user if C64U configured for wrong ports

3. Changed real-device-av-sync.sh default ports to 11000/11001
   - Real device now uses C64U hardware default ports (11000/11001)
   - Synthetic tests use ports 21000/21001
   - Port isolation prevents cross-pollution even if both active

Remaining work:
- Manual workaround: Reconfigure C64U to use ports 11000/11001 via web UI
- Future: Add C64U port reconfiguration via REST API if available
- Test sequence: synth→real→synth→real to verify isolation works

This is a critical fix that makes test results valid and reliable.
Added detailed documentation explaining:
- Root cause of cross-pollution (UDP port binding behavior)
- Evidence from sequence number analysis
- Port isolation solution (synthetic=21000/21001, real=11000/11001)
- Manual C64U reconfiguration steps (web UI, REST API, config file)
- Verification test procedures
- Technical details about UDP binding and sequence numbers

This documents the critical blocker preventing synthetic E2E tests from working.
Next step: Manually reconfigure C64U to use ports 11000/11001, then retest.
…ce tests

Root cause: OBS stores plugin state in scene JSON files, which takes
precedence over properties.ini. The template scenes have c64_host=127.0.0.1
hardcoded for synthetic E2E tests. When real device tests copied this
template, the hardcoded value overrode the properties.ini setting
(c64_host=c64u), causing the plugin to connect to 127.0.0.1 instead of
the real C64 Ultimate at 192.168.1.13.

Fix: RealDeviceE2E.create_obs_profile() now calls _clear_scene_plugin_state()
which removes the hardcoded c64_host from the scene JSON after copying the
template. This allows properties.ini to take precedence, enabling real
device tests to connect to the correct host.

Also fixed syntax error in e2e.sh stop_real_c64_streaming() function.

Verified:
- Real device test now receives packets from 192.168.1.13 (16248 packets)
- Synthetic test still works with mock C64U (sequence numbers 0,1,2...)
- No cross-pollution: synth→real→synth sequence properly isolated
- Real device test resolves 'c64u' to 192.168.1.13 correctly
Move heavy buffering/logging off socket threads using a lock-free SPSC ingest ring, drained by the processor thread. Also avoid 1ms EAGAIN sleeps to reduce receive-time quantization.
…lt scenario; update streaming stop logic in e2e.sh to enhance clarity and prevent cross-pollution warnings.
…tput verification

- Introduced `test_network_timing_validation.py` for unit tests on network timing validation logic.
- Created `verify_output.py` to verify recorded OBS output against expected patterns, including frame count, order, and A/V sync.
- Added `verify_stills.py` to check extracted still images for sync markers.
- Implemented `verify_tint.py` to validate green tint in OBS recordings as a preliminary check for visual filters.
- Updated resource.csv with new performance metrics, reflecting changes in CPU, RAM, and GPU usage.
- Adjusted resource.json to match the updated metrics, including duration, sample count, and statistical values for CPU, RAM, and GPU.
- Modified validation_results.json to correct packet reception details and update metrics related to audio/video synchronization.
- Enhanced scenario_loader.py to improve path handling for loading presets and templates, ensuring compatibility with repository structure.
- Update workflow to run pytest against tests/e2e/util

- Make mock TCP control parsing stream-safe (coalesced/fragmented recv)

- Track PLANS.md/INVESTIGATIONS.md and add CI hardening TODOs
… tests

- Adjusted CPU, RAM, and GPU metrics in resource.csv and resource.json to reflect new performance data.
- Updated duration and sample statistics in resource.json.
- Revised validation results in validation_results.json, including packet reception details and AV sync offsets.
- Ensured consistency across metrics and validation outputs for accurate performance assessment.
@chrisgleissner chrisgleissner merged commit 1fc110e into main Jan 10, 2026
36 checks passed
@chrisgleissner chrisgleissner deleted the test/av-sync-instrumentation branch January 10, 2026 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants