refactor(processor): decompose analyzer into focused modules by flexiondotorg · Pull Request #49 · linuxmatters/jivetalking

flexiondotorg · 2026-03-13T16:06:20Z

Summary

Decomposed the 4316-line monolithic analyzer.go into five focused modules across a 5-phase refactor. Improves maintainability, reduces cognitive load, and establishes clear separation of concerns. Output is bit-for-bit identical to pre-refactor code.

Changes

Phase 0: Type safety and speculative generality removal

Removed unused centroidP50 parameter from speechScore
Introduced PassNumber type with constants (PassAnalysis, PassProcessing, PassMeasuring, PassNormalising)

Phase 1: Frame loop consolidation (6 migrations)

Created runFilterGraph abstraction to eliminate frame loop duplication
Migrated all 6 call sites (AnalyzeAudio, processWithFilters, measureWithLoudnorm, applyLoudnormAndMeasure, measureOutputSilenceRegionFromReader, measureOutputSpeechRegionFromReader)

Phase 2: Spectral accumulator consolidation (3 tasks)

Consolidated 13 flat spectral fields into spectralMetrics struct in both SilenceCandidateMetrics and SpeechCandidateMetrics
Extracted generic measureOutputRegionFromReader implementation; collapsed two ~190-line functions into thin wrappers

Phase 3: Data consistency (2 tasks)

Added explicit json tags to SilenceCandidateMetrics (consistency with SpeechCandidateMetrics)
Embedded SpectralMetrics in candidate types (breaking JSON change: flat keys → nested structure)

Phase 4: File splitting (4 extractions)

analyzer_metrics.go - spectral/interval accumulators, extraction helpers
analyzer_candidates.go - silence/speech candidate detection and scoring
analyzer_output.go - output region measurement functions
frame_processor.go - frame processing abstraction and filter graph setup

Result: analyzer.go reduced from 4316 to 871 lines (80% reduction).

Phase 5: Validation

All 10 acceptance criteria pass
`go vet ./...` clean
Output-stable: every measurement identical to pre-refactor code

Testing

All existing tests pass (no new tests required; output is byte-identical)
`go build ./...` clean
`go vet ./...` clean
20 commits across phases, each producing output-stable intermediate state
Acceptance criteria: output measurements verified identical pre/post refactor

Breaking Changes

JSON output structure for candidate metrics: spectral fields (Mean, Median, Stdev, P5, P95) are now nested under `Spectral` field instead of flat at candidate level.

Before:
```json
{
"Mean": 0.5,
"Median": 0.4
}
```

After:
```json
{
"Spectral": {
"Mean": 0.5,
"Median": 0.4
}
}
```

Consumers of candidate metrics JSON must update parsing logic to access nested structure.

Related Issues

Internal code quality improvement; no external issues.

…Number type safety - Remove unused centroidP50 parameter from speechScore function and centroidValues calculation from findSpeechCandidatesFromIntervals, eliminating speculative generality that added no value to the scoring logic - Introduce PassNumber type with constants (PassAnalysis, PassProcessing, PassMeasuring, PassNormalising) to replace bare int for pass numbers, making invalid pass values a compile error across processor, analyzer, normalise, and UI components Signed-off-by: Martin Wimpress <code@wimpress.io>

…traction - Add FrameAction type with FrameDiscard/FrameKeep constants for frame handling semantics - Add FrameLoopConfig struct with optional error callbacks (OnReadError, OnPushError, OnPullError) for per-site error handling configuration - Add runFilterGraph function encapsulating push-pull-flush frame processing loop pattern currently duplicated across six call sites - Implement lenient read error handling (break loop) and strict push/pull error handling (return error) matching existing behaviour - Add TestRunFilterGraph and TestRunFilterGraphLenientErrors to verify frame counting and error callback invocation This foundation eliminates frame loop duplication across AnalyzeAudio, processWithFilters, measureWithLoudnorm, applyLoudnormAndMeasure, and silence/speech measurement functions, enabling Phase 1.2+ refactoring to decompose analyzer.go. Signed-off-by: Martin Wimpress <code@wimpress.io>

- Replace inline frame loop (~30 lines) with runFilterGraph call (~12 lines) - Configure lenient OnPushError and OnPullError handlers instead of strict error propagation - Move progress tracking into OnInputFrame callback - Remove manual filteredFrame allocation (now managed by runFilterGraph) - Preserve explicit AVFilterGraphFree timing before stopLoudnormCapture (ensures loudnorm JSON output emission) Reduces duplication by using runFilterGraph abstraction introduced previously. This is the first of six frame loop migrations planned across the codebase. Signed-off-by: Martin Wimpress <code@wimpress.io>

- Replace inline frame loop with runFilterGraph call - Configure lenient error handlers (OnPushError, OnPullError return nil) - Move progress reporting into OnFrame callback - Preserve AVFilterGraphFree timing before stats capture - Eliminate frame loop duplication (2 of 6 migrations complete) Signed-off-by: Martin Wimpress <code@wimpress.io>

- Replace inline frame loop with runFilterGraph call - Move level calculation and output measurement into OnFrame callback - Move frame counting and progress reporting into OnInputFrame callback - Remove manual filteredFrame allocation (now managed by runFilterGraph) - Remove unused errors import - Configure strict error handling with read error propagation Eliminates ~100 lines of duplicated frame loop code. Third migration of six across the codebase. Maintains exact error handling semantics and callback ordering. Signed-off-by: Martin Wimpress <code@wimpress.io>

- Replace inline frame loop with runFilterGraph abstraction - Move pre-filter work (level calculation, RMS/peak accumulation, interval sampling, progress updates) into OnInputFrame callback - Move spectral extraction into OnFrame callback returning FrameDiscard - Configure strict error handlers for read/push/pull paths - Remove manual filteredFrame allocation (managed by runFilterGraph) - Preserve filterFreed guard pattern for deferred cleanup Reduces function complexity by centralising frame loop mechanics whilst maintaining closure capture pattern for interval/spectral accumulators and progress state. Signed-off-by: Martin Wimpress <code@wimpress.io>

…runFilterGraph - Replace inline frame loop and flush logic with runFilterGraph call - Eliminate ~190 lines of duplicated frame processing code - Configure lenient error handlers following existing pattern - Move metadata extraction into OnFrame callback - Retain defer cleanup for AVFilterGraphFree This is the fifth of six frame loop migrations. Signed-off-by: Martin Wimpress <code@wimpress.io>

…eOutputSpeechRegionFromReader - Replace inline frame loop and flush block with extractMeasurements closure - Delegate frame processing to runFilterGraph abstraction - Remove manual filteredFrame allocation, now managed by runFilterGraph - Configure lenient error handlers following established pattern All frame loop duplication consolidated into runFilterGraph. Only remaining AVBuffersrcAddFrameFlags references are within runFilterGraph implementation and comments, not at call sites. Signed-off-by: Martin Wimpress <code@wimpress.io>

…lence region measurement Replace 13 individual spectralSum variables with spectralMetrics struct in measureOutputSilenceRegionFromReader. This eliminates duplication and aligns with the refactoring pattern for application across other measurement functions. - Consolidate 13 individual var spectral*Sum declarations into single spectralMetrics struct - Replace 13 getFloatMetadata calls with spectralAcc.add(extractSpectralMetrics(metadata)) - Replace manual 13-line averaging loop with spectralAcc.average() call - Update all field references to use avg.* naming convention - Reduces function from ~260 lines of duplicated accumulation logic to reusable struct methods Signed-off-by: Martin Wimpress <code@wimpress.io>

…h region measurement - Replace 13 individual spectral accumulator variables with single spectralMetrics struct - Consolidate getFloatMetadata calls and accumulation into spectralAcc.add() pattern - Replace manual averaging block with spectralAcc.average() call - Update field references to use averaged metric names - Both output region measurement functions now use consolidated accumulators Signed-off-by: Martin Wimpress <code@wimpress.io>

…generic implementation - Add regionMeasurements value type for shared measurement results - Create measureOutputRegionFromReader generic function with unified logic: - Filter graph setup and frame processing - Spectral measurement extraction and averaging - Crest factor and RMS fallbacks - dB conversion - Replace measureOutputSilenceRegionFromReader with thin wrapper mapping to SilenceCandidateMetrics - Replace measureOutputSpeechRegionFromReader with thin wrapper mapping to SpeechCandidateMetrics - Move region-type-specific debug strings into wrappers - Eliminates ~145 lines of duplicated measurement logic Signed-off-by: Martin Wimpress <code@wimpress.io>

- Add explicit `json:"snake_case"` tags to all 22 untagged fields - Aligns JSON key naming with SpeechCandidateMetrics (snake_case convention) - Prevents silent marshalling inconsistency if types are restructured - Ensures consistent JSON serialisation across both metrics types Signed-off-by: Martin Wimpress <code@wimpress.io>

- Consolidate 13 flat Spectral* fields into single SpectralMetrics named field - Apply embedding to both SilenceCandidateMetrics and SpeechCandidateMetrics - Export spectralMetrics to SpectralMetrics for cross-package use - Update 4 assignment sites in analyzer.go - Update 3 read sites in scoring functions - Update all external references in internal/logging package - Modernise test struct literals to nested SpectralMetrics pattern BREAKING CHANGE: JSON output structure changes from flat spectral keys (e.g., spectral_mean, spectral_median) to nested structure under Spectral field (e.g., Spectral.mean, Spectral.median). Update JSON consumers accordingly. Signed-off-by: Martin Wimpress <code@wimpress.io>

…metrics.go Move metrics-related concerns from analyzer.go into a focused sub-file to reduce cognitive load and improve code organisation. This extraction includes all spectral and interval metrics types, accumulator structures, and their associated extraction and helper functions. - Extract SpectralMetrics, baseMetadataAccumulators, metadataAccumulators, outputMetadataAccumulators, IntervalSample, intervalAccumulator, and intervalFrameMetrics types to analyzer_metrics.go - Move extraction functions: extractSpectralMetrics(), extractFrameMetadata(), extractIntervalFrameMetrics(), extractOutputFrameMetadata(), finalizeOutputMeasurements() - Move helper functions: getFloatMetadata(), linearRatioToDB(), linearSampleToDBFS(), frameSumSquaresAndPeak() - Relocate all metaKey* constants to new file Signed-off-by: Martin Wimpress <code@wimpress.io>

…ate file Move candidate-related functionality from analyzer.go to focused analyzer_candidates.go file. Extracted ~1701 lines including: - Candidate detection functions (silence/speech finding) - Candidate scoring and measurement functions - Candidate-related constants (minimumSilenceIntervals, roomToneAmplitudeDecayDB, silenceThresholdHeadroomDB, etc.) Reduces analyzer.go by 60% reduction, leaving it focused on Pass 1 analysis and interval management. No logic changes; pure code reorganisation as part of Phase 4 file-splitting refactor. Signed-off-by: Martin Wimpress <code@wimpress.io>

…output.go - Move regionMeasurements type and measurement functions to focused sub-file - Extract measureOutputRegionFromReader generic implementation - Extract silence and speech region measurement wrappers - Extract MeasureOutput* and MeasureOutputRegions public functions - Reduce analyzer.go from 1158 to 871 lines - Consolidate output measurement concerns in dedicated file Signed-off-by: Martin Wimpress <code@wimpress.io>

- Create frame_processor.go with frame processing abstraction - Move FrameAction type with FrameDiscard/FrameKeep constants - Move FrameLoopConfig struct and runFilterGraph function - Move setupFilterGraph, createBufferSource, createBufferSink from filters.go - Reduce analyzer.go from 872 to 729 lines - Reduce filters.go from 918 to 791 lines Completes Phase 4 decomposition. Original 4316-line analyzer.go now split across five focused modules: analyzer.go, analyzer_metrics.go, analyzer_candidates.go, analyzer_output.go, and frame_processor.go. Signed-off-by: Martin Wimpress <code@wimpress.io>

The Seek method signature conflicted with Go's standard io.Seeker interface, which expects Seek(int64, int) (int64, error). This method uses different semantics (timestamp-based seeking rather than byte-offset seeking), making the name conflict confusing. Renaming to SeekTo clarifies the intent and eliminates the interface signature mismatch caught by go vet. - Rename Seek(int64) error to SeekTo(int64) error in reader.go - Update call site in analyzer_output.go Signed-off-by: Martin Wimpress <code@wimpress.io>

flexiondotorg · 2026-03-13T16:10:02Z

The refactor is output-stable. Every audio measurement - loudness, spectral, dynamic, noise floor - is bit-for-bit identical between before and after the refactor.

cubic-dev-ai

9 issues found across 21 files

Confidence score: 2/5

There is a concrete regression risk in internal/processor/analyzer_metrics.go: planar multi-channel formats can read/slice channel data incorrectly, which can corrupt metrics or trigger runtime failures for common audio layouts.
internal/processor/analyzer_output.go and internal/processor/normalise.go both weaken error handling (swallowed runFilterGraph failures and early drain-loop exit), so partial processing may be reported as valid output.
Test coverage concerns in internal/processor/analyzer_test.go (frame-count assumptions and unexercised error-path callbacks) increase the chance these behaviors slip through CI, so this is not yet a low-risk merge.
Pay close attention to internal/processor/analyzer_metrics.go, internal/processor/analyzer_output.go, internal/processor/normalise.go, and internal/processor/analyzer_test.go - they carry the highest user-impact risk around correctness and silent partial results.

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/processor/analyzer_metrics.go">

<violation number="1" location="internal/processor/analyzer_metrics.go:162">
P1: For planar sample formats (S16P, FLTP, S32P, DBLP), `Data().Get(0)` returns only plane 0 containing `nbSamples` elements for channel 0, but the code slices `nbSamples * nbChannels` elements from it. With multi-channel planar audio this is an out-of-bounds read via `unsafe.Slice`. Currently safe because the pipeline uses mono, but this is a latent memory-safety bug in a general-purpose function.

Consider either (a) iterating each plane separately for planar formats, or (b) adding a guard that returns `false` when `nbChannels > 1` and the format is planar.</violation>
</file>

<file name="internal/processor/analyzer_output.go">

<violation number="1" location="internal/processor/analyzer_output.go:120">
P2: Don't swallow `runFilterGraph` errors here; partial decode/filter failures can return truncated region metrics as if they were valid.</violation>
</file>

<file name="internal/processor/analyzer_test.go">

<violation number="1" location="internal/processor/analyzer_test.go:2025">
P2: Don't assert a 1:1 input/output frame count here; the filter graph may re-chunk audio frames and make this test fail spuriously.</violation>

<violation number="2" location="internal/processor/analyzer_test.go:2063">
P2: This test never triggers the read/push error paths it claims to validate, so it won't catch regressions in the callback contract.</violation>
</file>

<file name="internal/processor/processor.go">

<violation number="1" location="internal/processor/processor.go:217">
P3: Add push/pull error wrappers here; Pass 2 now bubbles up raw FFmpeg errors without saying whether the failure happened while feeding or draining the filter graph.</violation>
</file>

<file name="internal/processor/frame_processor.go">

<violation number="1" location="internal/processor/frame_processor.go:19">
P2: `FrameKeep` is unused across all call sites and its documented semantics are unsafe: since `filteredFrame` is a single reused allocation (line 68), the next `AVBuffersinkGetFrame` call will unref the frame's data internally before writing new data. A callback returning `FrameKeep` cannot safely hold the pointer as the doc suggests—it would need to clone the frame first. Consider removing `FrameKeep` (consistent with the PR's own "speculative generality removal" goal) or, if it's needed later, fixing the doc to require `av_frame_clone`.</violation>
</file>

<file name="internal/processor/normalise.go">

<violation number="1" location="internal/processor/normalise.go:551">
P2: This changes unexpected sink errors from retrying to breaking out of the drain loop, so normalisation can continue after only partially draining the current output batch.</violation>
</file>

<file name="internal/processor/analyzer_candidates.go">

<violation number="1" location="internal/processor/analyzer_candidates.go:169">
P3: Comment says "stability (0.30) and quality (0.70)" but actual weights sum to stability=0.40 and quality=0.60.</violation>

<violation number="2" location="internal/processor/analyzer_candidates.go:855">
P2: End-of-loop silence region boundary includes trailing non-silent interruptions. The mid-loop termination correctly uses `lastSilentIdx = i - interruptionCount` to exclude trailing interruptions, but the end-of-loop case unconditionally uses the last interval at `searchLimit-1`. This inflates the region's end time and duration by up to `interruptionToleranceIntervals * 250ms` (750ms).</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

internal/processor/analyzer_metrics.go

internal/processor/analyzer_output.go

internal/processor/analyzer_test.go

internal/processor/frame_processor.go

internal/processor/normalise.go

internal/processor/analyzer_candidates.go

internal/processor/processor.go

internal/processor/analyzer_candidates.go

…review - Add memory safety guard in frameSumSquaresAndPeak for planar multi-channel access to prevent out-of-bounds read - Remove unsafe unused FrameKeep constant and document removal rationale - Fix stability/quality weight comment (0.40/0.60 instead of 0.30/0.70) - Fix end-of-loop silence boundary detection to use correct limit, preventing up to 750ms duration overcount - Wrap OnPushError and OnPullError callback errors with context for improved debugging Signed-off-by: Martin Wimpress <code@wimpress.io>

cubic-dev-ai

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/processor/analyzer_metrics.go">

<violation number="1" location="internal/processor/analyzer_metrics.go:168">
P1: Returning `false` for planar multichannel frames silently breaks interval RMS/peak analysis for stereo inputs processed before downmix.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

internal/processor/analyzer_metrics.go

Previously, frameSumSquaresAndPeak returned false for planar formats when nbChannels > 1, silently dropping all RMS and peak data. Since most audio codecs (MP3, AAC, Vorbis, OPUS) decode to stereo FLTP, all stereo input files produced zero RMS/peak interval data. Rewrote frameSumSquaresAndPeak to iterate each plane separately using frame.Data().Get(uintptr(plane)) for planar formats. Single-plane (interleaved) path remains unchanged. Now correctly handles S16P, S32P, DBLP and other planar sample formats. Signed-off-by: Martin Wimpress <code@wimpress.io>

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/processor/analyzer_metrics.go">

<violation number="1" location="internal/processor/analyzer_metrics.go:174">
P2: The planar-channel loop reads every channel from `frame.Data()`, which only covers the first 8 FFmpeg data pointers. Planar inputs with more than 8 channels will fail RMS/peak accumulation instead of using `extended_data`.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

internal/processor/analyzer_metrics.go

…review - Add memory safety guard in frameSumSquaresAndPeak for planar multi-channel access to prevent out-of-bounds read - Remove unsafe unused FrameKeep constant and document removal rationale - Fix stability/quality weight comment (0.40/0.60 instead of 0.30/0.70) - Fix end-of-loop silence boundary detection to use correct limit, preventing up to 750ms duration overcount - Wrap OnPushError and OnPullError callback errors with context for improved debugging Signed-off-by: Martin Wimpress <code@wimpress.io>

flexiondotorg added 18 commits March 13, 2026 10:42

cubic-dev-ai bot reviewed Mar 13, 2026

View reviewed changes

internal/processor/analyzer_metrics.go Outdated Show resolved Hide resolved

cubic-dev-ai bot reviewed Mar 13, 2026

View reviewed changes

internal/processor/analyzer_metrics.go Show resolved Hide resolved

flexiondotorg merged commit dbab9e3 into main Mar 13, 2026
7 checks passed

flexiondotorg deleted the refactor branch March 13, 2026 16:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(processor): decompose analyzer into focused modules#49

refactor(processor): decompose analyzer into focused modules#49
flexiondotorg merged 20 commits intomainfrom
refactor

flexiondotorg commented Mar 13, 2026

Uh oh!

flexiondotorg commented Mar 13, 2026

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

flexiondotorg commented Mar 13, 2026

Summary

Changes

Testing

Breaking Changes

Related Issues

Uh oh!

flexiondotorg commented Mar 13, 2026

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant