Overview
The file pkg/parser/import_processor.go has grown to 1106 lines, significantly exceeding the 800-line healthy threshold. This monolithic file handles too many concerns in a single place: queue-based BFS traversal, field extraction for 15+ frontmatter keys, cycle detection, topological sorting, and remote origin resolution. Splitting it into focused modules will improve readability, testability, and maintainability.
Current State
- File: pkg/parser/import_processor.go
- Size: 1106 lines
- Test Coverage: Related tests exist across multiple files (~789 lines in import_topological_test.go + import_cycle_test.go), but there is no dedicated import_processor_test.go
- Complexity: One mega-function, processImportsFromFrontmatterWithManifestAndSource, spans lines 182–849 (~667 lines) and handles BFS traversal, field extraction for 15+ frontmatter keys, deduplication, and result assembly inline
Full File Analysis
Functions
| Function | Lines (approx) | Responsibility |
|---|---|---|
| ProcessImportsFromFrontmatter | 82–89 | Public thin wrapper |
| parseRemoteOrigin | 117–167 | Parse workflowspec into remote origin struct |
| ProcessImportsFromFrontmatterWithManifest | 171–179 | Public wrapper |
| ProcessImportsFromFrontmatterWithSource | 177–180 | Public wrapper |
| processImportsFromFrontmatterWithManifestAndSource | 182–849 | 667-line mega-function: BFS traversal + extraction of 15+ fields |
| findCyclePath | 853–874 | DFS cycle path construction |
| dfsForCycle | 879–914 | DFS helper for cycle detection |
| topologicalSortImports | 920–1072 | Kahn's algorithm for topological sort |
| extractImportPaths | 1074–1106 | Extract import paths from frontmatter |
Complexity Hotspots
- Lines 182–360: Import spec parsing and BFS queue seeding (complex spec parsing, remote origin detection, lock file validation, duplicate checking)
- Lines 360–590: BFS traversal loop — agent files, YAML workflow files, nested import discovery, remote path resolution
- Lines 590–807: Field extraction block — 15+ sequential blocks each extracting a different frontmatter key (tools, engines, mcp-servers, safe-outputs, safe-inputs, steps, runtimes, services, network, permissions, secret-masking, bots, skip-roles, skip-bots, plugins, post-steps, labels, cache, features)
- Lines 920–1072: Topological sort — already self-contained but lives in the same file
Duplicate Patterns
The field extraction block (lines 590–807) repeats the same pattern 15+ times:
    fooContent, err := extractXFromContent(string(content))
    if err == nil && fooContent != "" && fooContent != "{}" {
        fooBuilder.WriteString(fooContent + "\n")
    }
This repetitive code is a strong signal for extraction.
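One way to collapse the repeated block is a table-driven loop. The sketch below is illustrative only: fieldExtractor, fieldBuilders, and appendExtracted are hypothetical names, and the stub extractors in main stand in for the real extract*FromContent helpers, whose signatures are assumed from the pattern above.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldExtractor pairs a frontmatter key with the function that extracts it.
// Hypothetical type: the real helpers are assumed to share this signature.
type fieldExtractor struct {
	key     string
	extract func(content string) (string, error)
}

// fieldBuilders holds one string builder per frontmatter key (hypothetical).
type fieldBuilders map[string]*strings.Builder

// text returns the accumulated text for key, or "" if nothing was appended.
func (fb fieldBuilders) text(key string) string {
	if b, ok := fb[key]; ok {
		return b.String()
	}
	return ""
}

// appendExtracted runs every extractor against content and appends each
// usable result to the builder for its key, using the same skip conditions
// as the repeated inline block (error, empty string, or "{}").
func appendExtracted(content string, extractors []fieldExtractor, builders fieldBuilders) {
	for _, fe := range extractors {
		value, err := fe.extract(content)
		if err != nil || value == "" || value == "{}" {
			continue
		}
		b, ok := builders[fe.key]
		if !ok {
			b = &strings.Builder{}
			builders[fe.key] = b
		}
		b.WriteString(value + "\n")
	}
}

func main() {
	// Stub extractors; real code would list the actual extraction helpers.
	extractors := []fieldExtractor{
		{key: "tools", extract: func(string) (string, error) { return `{"bash":true}`, nil }},
		{key: "steps", extract: func(string) (string, error) { return "{}", nil }},
		{key: "network", extract: func(string) (string, error) { return "", fmt.Errorf("missing") }},
	}
	builders := fieldBuilders{}
	appendExtracted("content is ignored by the stubs", extractors, builders)
	fmt.Printf("keys with content: %d, tools: %q\n", len(builders), builders.text("tools"))
}
```

With this shape, adding a new frontmatter key means adding one entry to the extractor list rather than another copy of the block.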
Refactoring Strategy
Proposed File Splits
- import_bfs.go (~300 lines)
  - Functions: processImportsFromFrontmatterWithManifestAndSource (the BFS traversal core: queue seeding, BFS loop, queue item dispatch, result assembly)
  - Responsibility: Orchestration of the BFS import traversal
  - Estimated LOC: ~300
- import_field_extractor.go (~200 lines)
  - Functions: new extractAllImportFields(content string, item importQueueItem, result *importAccumulator) that consolidates the 15+ field extraction blocks from lines 590–807
  - Responsibility: Extracting all frontmatter fields from a single imported file and accumulating results
  - Estimated LOC: ~200
- import_cycle.go (~120 lines)
  - Functions: findCyclePath, dfsForCycle (already cohesive, lines 851–914)
  - Responsibility: Cycle detection in the import dependency graph
  - Estimated LOC: ~120
- import_topological.go (~160 lines)
  - Functions: topologicalSortImports, extractImportPaths (already cohesive, lines 916–1106)
  - Responsibility: Topological ordering of imports using Kahn's algorithm
  - Estimated LOC: ~160
- import_remote.go (~60 lines)
  - Functions: parseRemoteOrigin, remoteImportOrigin type, importQueueItem type
  - Responsibility: Remote origin parsing and queue item types
  - Estimated LOC: ~60
- import_processor.go (keep, reduced to ~200 lines)
  - Remaining: ImportsResult, ImportSpec, ImportInputDefinition types, public API functions (ProcessImportsFromFrontmatter, ProcessImportsFromFrontmatterWithManifest, ProcessImportsFromFrontmatterWithSource)
  - Responsibility: Public API surface and core types
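For reference while reviewing the import_topological.go split, here is a minimal, generic sketch of Kahn's algorithm. It is not the project's topologicalSortImports: the topoSort name and the deps map shape (file to its direct imports) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// topoSort orders files so that every file appears after the files it
// imports. deps maps a file to its direct imports (assumed shape).
func topoSort(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)        // unprocessed imports per file
	dependents := make(map[string][]string) // import -> files importing it
	for file, imports := range deps {
		inDegree[file] += len(imports)
		for _, imp := range imports {
			inDegree[imp] += 0 // make sure leaf files are registered
			dependents[imp] = append(dependents[imp], file)
		}
	}

	// Seed the queue with files that import nothing.
	var queue []string
	for file, d := range inDegree {
		if d == 0 {
			queue = append(queue, file)
		}
	}
	sort.Strings(queue) // map iteration is random; sort for stable output

	var order []string
	for len(queue) > 0 {
		file := queue[0]
		queue = queue[1:]
		order = append(order, file)

		next := dependents[file]
		sort.Strings(next)
		for _, dep := range next {
			inDegree[dep]--
			if inDegree[dep] == 0 {
				queue = append(queue, dep)
			}
		}
	}

	// Files left with a non-zero in-degree are part of a cycle.
	if len(order) != len(inDegree) {
		return nil, fmt.Errorf("import cycle: %d file(s) could not be ordered", len(inDegree)-len(order))
	}
	return order, nil
}

func main() {
	// Diamond dependency: a imports b and c, which both import d.
	order, err := topoSort(map[string][]string{
		"a.md": {"b.md", "c.md"},
		"b.md": {"d.md"},
		"c.md": {"d.md"},
	})
	fmt.Println(order, err)
}
```

Because the algorithm only needs the graph, a function of this shape can be tested without touching the BFS traversal, which is the main argument for moving it to its own file.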
Shared Utilities
An importAccumulator struct could centralize the ~25 builder/slice/set variables currently declared inline in the mega-function. It would be passed to extractAllImportFields as a pointer argument, so that each call mutates the same accumulated state.
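A rough sketch of what such an accumulator might look like follows. The field and method names are hypothetical, and only two of the roughly 25 variables are modeled: one string builder and one deduplicated slice backed by a set.

```go
package main

import (
	"fmt"
	"strings"
)

// importAccumulator gathers results across all imported files.
// Hypothetical subset: the real struct would carry every builder, slice,
// and set that the mega-function currently declares inline.
type importAccumulator struct {
	tools   strings.Builder
	bots    []string
	botsSet map[string]bool
}

// newImportAccumulator initializes the maps so callers cannot forget to.
func newImportAccumulator() *importAccumulator {
	return &importAccumulator{botsSet: make(map[string]bool)}
}

// addTools appends one file's tools content, skipping empty values.
func (acc *importAccumulator) addTools(content string) {
	if content == "" || content == "{}" {
		return
	}
	acc.tools.WriteString(content + "\n")
}

// addBots appends names not seen before, preserving first-seen order.
func (acc *importAccumulator) addBots(names []string) {
	for _, name := range names {
		if name == "" || acc.botsSet[name] {
			continue
		}
		acc.botsSet[name] = true
		acc.bots = append(acc.bots, name)
	}
}

func main() {
	acc := newImportAccumulator()

	// Two imported files contribute overlapping values.
	acc.addTools(`{"bash":true}`)
	acc.addBots([]string{"dependabot", "renovate"})
	acc.addTools("{}")
	acc.addBots([]string{"renovate", "copilot"})

	fmt.Printf("tools:\n%sbots: %v\n", acc.tools.String(), acc.bots)
}
```

Keeping the deduplication sets next to the slices they guard means the extraction code no longer has to thread both through every call.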
Interface Abstractions
- Consider an importFieldExtractor interface if the extraction logic needs to be tested independently or swapped
Test Coverage Plan
Add or augment tests for each new file:
- import_cycle_test.go (already partially exists)
  - Test cases: simple cycle (A→B→A), longer cycle (A→B→C→A), no cycle, self-import
  - Target coverage: >80%
- import_topological_test.go (already partially exists)
  - Test cases: linear chain, diamond dependency, independent files, circular (should error)
  - Target coverage: >80%
- import_field_extractor_test.go (new)
  - Test cases: extraction of each field type, empty file, missing fields, deduplication of bots/labels/plugins
  - Target coverage: >80%
- import_remote_test.go (new)
  - Test cases: valid workflowspec parsing, missing ref defaults to "main", invalid specs return nil
  - Target coverage: >80%
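The cycle test cases listed above can be exercised with a table of small graphs. The sketch below uses a generic DFS with in-progress marking as a stand-in; findCycle is a hypothetical helper, not the project's findCyclePath/dfsForCycle pair, whose signatures are not shown in this issue.

```go
package main

import (
	"fmt"
	"sort"
)

// findCycle returns one cycle as a path whose first and last elements are
// the same file, or nil when the graph is acyclic. Hypothetical stand-in
// for the project's cycle detection functions.
func findCycle(graph map[string][]string) []string {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var stack []string

	var visit func(node string) []string
	visit = func(node string) []string {
		state[node] = inProgress
		stack = append(stack, node)
		for _, next := range graph[node] {
			switch state[next] {
			case inProgress:
				// Back edge: the cycle is the stack suffix starting at next.
				for i, n := range stack {
					if n == next {
						cycle := append([]string{}, stack[i:]...)
						return append(cycle, next)
					}
				}
			case unvisited:
				if cycle := visit(next); cycle != nil {
					return cycle
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[node] = done
		return nil
	}

	// Visit roots in sorted order so the reported cycle is deterministic.
	nodes := make([]string, 0, len(graph))
	for n := range graph {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)
	for _, n := range nodes {
		if state[n] == unvisited {
			if cycle := visit(n); cycle != nil {
				return cycle
			}
		}
	}
	return nil
}

func main() {
	cases := []struct {
		name  string
		graph map[string][]string
	}{
		{"simple cycle", map[string][]string{"a": {"b"}, "b": {"a"}}},
		{"longer cycle", map[string][]string{"a": {"b"}, "b": {"c"}, "c": {"a"}}},
		{"no cycle", map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}},
		{"self-import", map[string][]string{"a": {"a"}}},
	}
	for _, tc := range cases {
		fmt.Printf("%s: %v\n", tc.name, findCycle(tc.graph))
	}
}
```

In a real test file the same table would drive t.Run subtests against the extracted functions.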
Implementation Guidelines
- Preserve Behavior: The public API (ProcessImportsFromFrontmatter*) must remain unchanged
- Maintain Exports: All exported types and functions stay exported
- Introduce importAccumulator: Move the 25+ local variables into a struct to reduce function signature complexity
- Incremental Changes: Extract import_cycle.go first (already cohesive), then import_topological.go, then import_remote.go, then import_field_extractor.go
- Run Tests Frequently: Verify make test-unit passes after each extraction
- Document Changes: Add a package-level comment to each new file explaining its role
Acceptance Criteria
- import_processor.go reduced to ≤300 lines (types + public API only)
- import_cycle.go created with cycle detection functions (≤150 lines)
- import_topological.go created with topological sort functions (≤200 lines)
- import_remote.go created with remote origin types/parsing (≤100 lines)
- import_field_extractor.go created with field extraction logic (≤250 lines)
- import_bfs.go created with BFS traversal core (≤350 lines)
- Tests pass (make test-unit)
- Lint passes (make lint)
- Build succeeds (make build)
Additional Context
- Repository Guidelines: Follow patterns in AGENTS.md; prefer many smaller files grouped by functionality
- Validation Complexity: Target 100–200 lines per file, hard limit 300 lines
- Existing Test Files: import_topological_test.go, import_cycle_test.go, yaml_import_test.go, import_remote_nested_test.go already provide good coverage scaffolding
Priority: Medium
Effort: Medium (well-contained package, clear split boundaries, strong existing test coverage)
Expected Impact: Improved maintainability, easier onboarding, reduced complexity per file
References:
Generated by Daily File Diet