Comprehensive semantic analysis of all Go source files in the repository to identify refactoring opportunities through function clustering, duplicate detection, and code organization assessment.
Full Analysis Report
Executive Summary
The codebase demonstrates generally good organization with clear patterns:
- ✅ Well-organized validation files (
*_validation.go)
- ✅ Clear creation operation files (
create_*.go)
- ✅ Feature-specific grouping (MCP, logs, Docker, npm, pip)
- ⚠️ Some duplicate functions identified (3 cases)
- ⚠️ Many small files with 1-2 functions in
pkg/workflow/
- ⚠️ Scattered helper functions could benefit from consolidation
Detailed Findings
1. Duplicate Functions Detected
Priority 1: extractErrorMessage() - Near-Duplicate (90% similar)
Occurrence 1: pkg/workflow/metrics.go:462
func extractErrorMessage(line string) string {
// Uses pre-compiled regex patterns
cleanedLine := line
cleanedLine = timestampPattern1.ReplaceAllString(cleanedLine, "")
cleanedLine = timestampPattern2.ReplaceAllString(cleanedLine, "")
cleanedLine = timestampPattern3.ReplaceAllString(cleanedLine, "")
cleanedLine = logLevelPattern.ReplaceAllString(cleanedLine, "")
cleanedLine = strings.TrimSpace(cleanedLine)
return cleanedLine
}
Occurrence 2: pkg/cli/copilot_agent.go:297
func extractErrorMessage(line string) string {
// Compiles regex inline
line = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}[T\s]...`).ReplaceAllString(line, "")
line = regexp.MustCompile(`^\[\d{4}-\d{2}-\d{2}\s+...`).ReplaceAllString(line, "")
line = regexp.MustCompile(`^(ERROR|WARN|WARNING|...)...`).ReplaceAllString(line, "")
line = strings.TrimSpace(line)
return line
}
Analysis:
- Both functions clean error messages from log lines
- Metrics version uses pre-compiled patterns (more efficient)
- Copilot agent version compiles inline (less efficient)
- Impact: Code duplication, maintenance burden, performance inconsistency
Recommendation: Create shared utility function in pkg/logger/ or pkg/workflow/strings.go
Priority 2: isMCPType() - Near-Duplicate with Inconsistency
Occurrence 1: pkg/workflow/mcp-config.go:954
func isMCPType(typeStr string) bool {
switch typeStr {
case "stdio", "http", "local": // Supports 3 types
return true
default:
return false
}
}
Occurrence 2: pkg/parser/frontmatter.go:72
func isMCPType(typeStr string) bool {
switch typeStr {
case "stdio", "http": // Supports only 2 types
return true
default:
return false
}
}
Analysis:
- Critical inconsistency: Different valid type sets
mcp-config.go accepts "local" but frontmatter.go does not
- Impact: Validation inconsistency, potential bugs
Recommendation: Single source of truth in pkg/parser/mcp.go or as constant
Priority 3: parseRunURL() - Function Wrapping
Occurrence 1: pkg/cli/audit.go:103
func parseRunURL(input string) (RunURLInfo, error) {
runID, owner, repo, hostname, err := parser.ParseRunURL(input)
if err != nil {
return RunURLInfo{}, err
}
return RunURLInfo{
RunID: runID,
Owner: owner,
Repo: repo,
Hostname: hostname,
}, nil
}
Occurrence 2: pkg/parser/github_urls.go:177
func parseRunURL(host, owner, repo string, parts []string) (*GitHubURLComponents, error) {
// Different signature - internal parser
...
}
Analysis:
- Different signatures (not true duplicates)
- CLI version wraps parser's public function
- Impact: Acceptable wrapper pattern, but review if necessary
Recommendation: Review if wrapper adds value or just adds indirection
2. Function Clustering Analysis
Clustering by Naming Patterns
| Prefix |
Count |
Common Files |
Organization Status |
get* |
88 |
scripts.go (31), mcp_servers.go (9), js.go (6) |
✅ Good |
extract* |
62 |
frontmatter.go (11), logs_metrics.go (5), tools.go (4) |
⚠️ Review |
parse* |
60 |
Multiple files |
✅ Good |
is* |
46 |
Various validation files |
✅ Good |
generate* |
40 |
Various files |
✅ Good |
build* |
37 |
Compiler/builder files |
✅ Good |
render* |
34 |
Console/output files |
✅ Good |
validate* |
76 |
Dedicated *_validation.go files |
✅ Excellent |
Notable Observation: The 31 get* functions in scripts.go form a script registry pattern - this is well-organized ✅
3. Validation Functions - Excellent Organization ✅
The codebase demonstrates best-practice validation organization:
pkg/workflow/
├── agent_validation.go
├── bundler_validation.go
├── docker_validation.go
├── engine_validation.go
├── expression_validation.go
├── mcp_config_validation.go
├── npm_validation.go
├── pip_validation.go
├── repository_features_validation.go
├── runtime_validation.go
├── schema_validation.go
├── step_order_validation.go
├── strict_mode_validation.go
├── template_validation.go
└── validation.go (general)
Status: Each validation concern has a dedicated file following the *_validation.go pattern. No refactoring needed.
4. Small Files Analysis (Potential Consolidation)
Files with ≤2 functions and <100 lines in pkg/workflow/:
| File |
Lines |
Funcs |
Purpose |
temp_folder.go |
13 |
1 |
Temp folder management |
xpia.go |
13 |
1 |
XPIA detection |
safe_outputs_prompt.go |
14 |
1 |
Prompt text |
detection.go |
17 |
1 |
Detection logic |
github_context.go |
18 |
1 |
GitHub context |
cache_memory_prompt.go |
19 |
1 |
Prompt text |
git_patch.go |
19 |
1 |
Git patch handling |
edit_tool_prompt.go |
22 |
2 |
Prompt text |
playwright_prompt.go |
22 |
2 |
Prompt text |
safe_output_config.go |
25 |
1 |
Config struct |
yaml_generation.go |
25 |
2 |
YAML helpers |
domain_sanitization.go |
29 |
1 |
Sanitization |
| ... and 18 more |
|
|
|
Analysis:
- Many prompt-related files could be grouped:
*_prompt.go → prompts.go
- Small utility functions scattered
- Impact: Many files to navigate, harder to discover related functions
Recommendation: Consider consolidating related small files:
- Merge prompt files →
prompts.go or prompt_templates.go
- Group small helpers →
helpers.go or enhance existing config_helpers.go
5. Package Organization Assessment
pkg/cli/ (75 files) - ✅ Well-Organized
Patterns observed:
- ✅ Command pattern:
*_command.go (add_command, compile_command, etc.)
- ✅ Feature grouping:
mcp_*.go (11 MCP-related files)
- ✅ Feature grouping:
logs_*.go (5 logs-related files)
- ✅ Clear responsibilities per file
Status: Excellent organization, no refactoring needed
pkg/workflow/ (155 files) - ⚠️ Mixed
Good patterns:
- ✅ Creation operations:
create_*.go pattern
- ✅ Validation:
*_validation.go pattern
- ✅ Feature files:
docker.go, npm.go, pip.go, nodejs.go
- ✅ Safe outputs:
safe_*.go grouping
- ✅ Engine files:
*_engine.go pattern
Areas for improvement:
- ⚠️ Many small files with 1-2 functions
- ⚠️ 30+ files with <100 lines
- ⚠️ Some generic utilities not centralized
Status: Generally good, but could benefit from consolidation of small files
pkg/parser/ (10 files) - ✅ Excellent
Clear responsibilities:
frontmatter.go (11 extract functions - makes sense)
github_urls.go
import_cache.go
schema.go
yaml_error.go
- etc.
Status: Well-organized, no refactoring needed
pkg/console/ (5 files) - ✅ Excellent
console/
├── banner.go
├── console.go
├── format.go (all FormatXXXMessage functions here - good!)
├── render.go
└── spinner.go
Status: Perfect organization, no refactoring needed
6. Function Distribution Analysis
Extract Functions (62 total):
pkg/parser/frontmatter.go: 11 functions (makes sense - frontmatter parsing)
pkg/cli/logs_metrics.go: 5 functions (log processing)
pkg/workflow/tools.go: 4 functions
- Scattered across 20+ other files
Status: Frontmatter and logs metrics concentrations are justified by purpose
Refactoring Recommendations
Priority 1: High Impact, Low Effort
1. Consolidate extractErrorMessage() Duplicates
Action: Create single implementation
- Location:
pkg/logger/error_formatting.go or pkg/workflow/strings.go
- Implementation: Use pre-compiled regex patterns (metrics.go version)
- Effort: 1-2 hours
- Benefits:
- Single source of truth
- Better performance (pre-compiled patterns)
- Easier maintenance
Example:
// pkg/logger/error_formatting.go
package logger
var (
timestampPattern1 = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}[T\s]\d{2}:\d{2}:\d{2}(\.\d+)?([+-]\d{2}:\d{2}|Z)?\s*`)
timestampPattern2 = regexp.MustCompile(`^\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\]\s*`)
logLevelPattern = regexp.MustCompile(`^(ERROR|WARN|WARNING|INFO|DEBUG):\s*`)
)
func ExtractErrorMessage(line string) string {
cleaned := timestampPattern1.ReplaceAllString(line, "")
cleaned = timestampPattern2.ReplaceAllString(cleaned, "")
cleaned = logLevelPattern.ReplaceAllString(cleaned, "")
return strings.TrimSpace(cleaned)
}
2. Unify isMCPType() Implementations
Action: Create single source of truth with consistent validation
- Location:
pkg/parser/mcp.go or create pkg/workflow/mcp_types.go
- Critical: Determine correct set of valid types (
stdio, http, local?)
- Effort: 1 hour
- Benefits:
- Consistent validation across codebase
- Fix potential bugs from inconsistent type checking
- Easier to add new MCP types in future
Example:
// pkg/parser/mcp.go
package parser
// ValidMCPTypes defines all valid MCP server types
var ValidMCPTypes = []string{"stdio", "http", "local"}
// IsMCPType checks if a type string represents an MCP-compatible type
func IsMCPType(typeStr string) bool {
switch typeStr {
case "stdio", "http", "local":
return true
default:
return false
}
}
Then update both files to use:
import "github.com/githubnext/gh-aw/pkg/parser"
// Use parser.IsMCPType() instead of local function
Priority 2: Medium Impact, Medium Effort
3. Consolidate Small Prompt Files
Current state:
safe_outputs_prompt.go (14 lines, 1 func)
cache_memory_prompt.go (19 lines, 1 func)
edit_tool_prompt.go (22 lines, 2 funcs)
playwright_prompt.go (22 lines, 2 funcs)
pr_prompt.go (61 lines, 2 funcs)
Recommendation: Merge into pkg/workflow/prompts.go or prompt_templates.go
- Effort: 2-3 hours
- Benefits:
- Easier to find all prompts in one place
- Reduce file count
- Better overview of all prompt templates
4. Review and Consolidate Small Utility Files
Candidates:
temp_folder.go (13 lines, 1 func)
xpia.go (13 lines, 1 func)
detection.go (17 lines, 1 func)
github_context.go (18 lines, 1 func)
git_patch.go (19 lines, 1 func)
domain_sanitization.go (29 lines, 1 func)
Recommendation: Group related utilities
- Option A: Create
pkg/workflow/utilities.go for miscellaneous helpers
- Option B: Merge into existing related files (e.g.,
xpia.go → validation.go)
- Effort: 3-4 hours
- Benefits: Fewer files to navigate, related functions together
Priority 3: Long-term Improvements
5. Consider Extract Function Consolidation
Current state: 62 extract* functions across 20+ files
Recommendation: Review for patterns
- Many frontmatter extract functions (11) - justified ✅
- Many logs metrics extract functions (5) - justified ✅
- Others scattered - review if some could be consolidated
Effort: 4-6 hours (analysis + refactoring)
Benefits: Potentially discover more duplicates or near-duplicates
Implementation Checklist
Phase 1: Quick Wins (Priority 1)
Phase 2: File Consolidation (Priority 2)
Phase 3: Future Considerations (Priority 3)
Analysis Metadata
- Analysis Date: 2025-12-04
- Tool: Semantic function clustering + manual code review
- Files Analyzed: 259 Go files (excluding tests)
- Total Functions: 1,568
- Duplicates Found: 3 cases (2 high priority, 1 low priority)
- Small Files (<100 lines, ≤2 funcs): 30+ in
pkg/workflow/
- Validation Files: 15 (excellently organized)
- Creation Files: 6
create_*.go (well organized)
Success Criteria
This refactoring will be successful when:
- ✅ Zero duplicate
extractErrorMessage() implementations
- ✅ Single source of truth for
isMCPType() validation
- ✅ Consistent MCP type validation across codebase
- ✅ Reduced file count through consolidation of small files
- ✅ Maintained or improved test coverage
- ✅ No functionality broken or changed (refactoring only)
Positive Findings (Celebrate What Works!) 🎉
The codebase demonstrates many excellent patterns:
- ✅ Validation organization: Each concern has dedicated
*_validation.go file
- ✅ Create operations: Clear
create_*.go pattern
- ✅ Command pattern: CLI commands follow
*_command.go convention
- ✅ Feature grouping: MCP files grouped with
mcp_* prefix
- ✅ Console formatting: All format functions in
console/format.go
- ✅ Parser package: Clean, focused responsibilities
- ✅ Scripts registry: 31 get functions organized in
scripts.go
Overall Assessment: The codebase is well-organized with clear patterns. The identified issues are minor and easily addressable.
Analysis Summary
Comprehensive semantic analysis of all Go source files in the repository to identify refactoring opportunities through function clustering, duplicate detection, and code organization assessment.
Scope: 259 Go files (excluding tests) containing 1,568 functions across
pkg/cli,pkg/workflow,pkg/parser, and other packages.Full Analysis Report
Executive Summary
The codebase demonstrates generally good organization with clear patterns:
*_validation.go)create_*.go)pkg/workflow/Detailed Findings
1. Duplicate Functions Detected
Priority 1:
extractErrorMessage()- Near-Duplicate (90% similar)Occurrence 1: pkg/workflow/metrics.go:462
Occurrence 2: pkg/cli/copilot_agent.go:297
Analysis:
Recommendation: Create shared utility function in
pkg/logger/orpkg/workflow/strings.goPriority 2:
isMCPType()- Near-Duplicate with InconsistencyOccurrence 1: pkg/workflow/mcp-config.go:954
Occurrence 2: pkg/parser/frontmatter.go:72
Analysis:
mcp-config.goaccepts"local"butfrontmatter.godoes notRecommendation: Single source of truth in
pkg/parser/mcp.goor as constantPriority 3:
parseRunURL()- Function WrappingOccurrence 1: pkg/cli/audit.go:103
Occurrence 2: pkg/parser/github_urls.go:177
Analysis:
Recommendation: Review if wrapper adds value or just adds indirection
2. Function Clustering Analysis
Clustering by Naming Patterns
get*scripts.go(31),mcp_servers.go(9),js.go(6)extract*frontmatter.go(11),logs_metrics.go(5),tools.go(4)parse*is*generate*build*render*validate**_validation.gofilesNotable Observation: The 31
get*functions inscripts.goform a script registry pattern - this is well-organized ✅3. Validation Functions - Excellent Organization ✅
The codebase demonstrates best-practice validation organization:
Status: Each validation concern has a dedicated file following the
*_validation.gopattern. No refactoring needed.4. Small Files Analysis (Potential Consolidation)
Files with ≤2 functions and <100 lines in
pkg/workflow/:temp_folder.goxpia.gosafe_outputs_prompt.godetection.gogithub_context.gocache_memory_prompt.gogit_patch.goedit_tool_prompt.goplaywright_prompt.gosafe_output_config.goyaml_generation.godomain_sanitization.goAnalysis:
*_prompt.go→prompts.goRecommendation: Consider consolidating related small files:
prompts.goorprompt_templates.gohelpers.goor enhance existingconfig_helpers.go5. Package Organization Assessment
pkg/cli/ (75 files) - ✅ Well-Organized
Patterns observed:
*_command.go(add_command, compile_command, etc.)mcp_*.go(11 MCP-related files)logs_*.go(5 logs-related files)Status: Excellent organization, no refactoring needed
pkg/workflow/ (155 files) -⚠️ Mixed
Good patterns:
create_*.gopattern*_validation.gopatterndocker.go,npm.go,pip.go,nodejs.gosafe_*.gogrouping*_engine.gopatternAreas for improvement:
Status: Generally good, but could benefit from consolidation of small files
pkg/parser/ (10 files) - ✅ Excellent
Clear responsibilities:
frontmatter.go(11 extract functions - makes sense)github_urls.goimport_cache.goschema.goyaml_error.goStatus: Well-organized, no refactoring needed
pkg/console/ (5 files) - ✅ Excellent
Status: Perfect organization, no refactoring needed
6. Function Distribution Analysis
Extract Functions (62 total):
pkg/parser/frontmatter.go: 11 functions (makes sense - frontmatter parsing)pkg/cli/logs_metrics.go: 5 functions (log processing)pkg/workflow/tools.go: 4 functionsStatus: Frontmatter and logs metrics concentrations are justified by purpose
Refactoring Recommendations
Priority 1: High Impact, Low Effort
1. Consolidate
extractErrorMessage()DuplicatesAction: Create single implementation
pkg/logger/error_formatting.goorpkg/workflow/strings.goExample:
2. Unify
isMCPType()ImplementationsAction: Create single source of truth with consistent validation
pkg/parser/mcp.goor createpkg/workflow/mcp_types.gostdio,http,local?)Example:
Then update both files to use:
Priority 2: Medium Impact, Medium Effort
3. Consolidate Small Prompt Files
Current state:
safe_outputs_prompt.go(14 lines, 1 func)cache_memory_prompt.go(19 lines, 1 func)edit_tool_prompt.go(22 lines, 2 funcs)playwright_prompt.go(22 lines, 2 funcs)pr_prompt.go(61 lines, 2 funcs)Recommendation: Merge into
pkg/workflow/prompts.goorprompt_templates.go4. Review and Consolidate Small Utility Files
Candidates:
temp_folder.go(13 lines, 1 func)xpia.go(13 lines, 1 func)detection.go(17 lines, 1 func)github_context.go(18 lines, 1 func)git_patch.go(19 lines, 1 func)domain_sanitization.go(29 lines, 1 func)Recommendation: Group related utilities
pkg/workflow/utilities.gofor miscellaneous helpersxpia.go→validation.go)Priority 3: Long-term Improvements
5. Consider Extract Function Consolidation
Current state: 62
extract*functions across 20+ filesRecommendation: Review for patterns
Effort: 4-6 hours (analysis + refactoring)
Benefits: Potentially discover more duplicates or near-duplicates
Implementation Checklist
Phase 1: Quick Wins (Priority 1)
pkg/logger/error_formatting.gowith unifiedExtractErrorMessage()pkg/workflow/metrics.goto use new functionpkg/cli/copilot_agent.goto use new functionIsMCPType()unified function inpkg/parser/mcp.gomcp-config.goandfrontmatter.goto use itlocalincluded?)Phase 2: File Consolidation (Priority 2)
pkg/workflow/prompts.goorprompt_templates.goPhase 3: Future Considerations (Priority 3)
extract*functionsAnalysis Metadata
pkg/workflow/create_*.go(well organized)Success Criteria
This refactoring will be successful when:
extractErrorMessage()implementationsisMCPType()validationPositive Findings (Celebrate What Works!) 🎉
The codebase demonstrates many excellent patterns:
*_validation.gofilecreate_*.gopattern*_command.goconventionmcp_*prefixconsole/format.goscripts.goOverall Assessment: The codebase is well-organized with clear patterns. The identified issues are minor and easily addressable.
Quick Action Items
For immediate impact, focus on:
Consolidate
extractErrorMessage()- Two nearly identical functionspkg/workflow/metrics.go:462andpkg/cli/copilot_agent.go:297Unify
isMCPType()- Inconsistent validation (critical!)pkg/workflow/mcp-config.go:954andpkg/parser/frontmatter.go:72Consider consolidating small prompt files - 5+ files with single functions
*_prompt.gofiles inpkg/workflow/Estimated Total Effort:
Total: ~11-16 hours for full implementation