Skip to content

[refactor] Semantic Function Clustering Analysis - Code Organization & Refactoring Opportunities #1064

@github-actions

Description

@github-actions

Executive Summary

Analyzed 63 Go source files (excluding tests) across 16 packages containing ~450 functions to identify refactoring opportunities through semantic function clustering and duplicate detection.

Key Findings:

  • Overall code quality is excellent - well-organized with strong use of Go patterns
  • 🔧 8 high-priority refactoring opportunities identified (duplicate logic, misplaced functions)
  • 📊 Function clusters are generally well-organized by file purpose
  • ⚠️ Minor inconsistencies in naming patterns and file organization

Analysis covered:

  • 16 packages in internal/ directory
  • Pattern analysis using semantic clustering
  • Duplicate detection across files
  • File organization validation

Package Overview

Files by Package

Package Files Primary Purpose
logger 14 Debug logging framework, file/markdown/JSONL logging, RPC logging
server 10 HTTP server (routed/unified modes), handlers, middleware
config 8 Configuration parsing (TOML/JSON), validation, variable expansion
cmd 7 CLI (Cobra), flag management, completions
difc 5 Data Information Flow Control labels and evaluation
guard 4 Security guard framework and registry
testutil/mcptest 4 Test utilities for MCP integration
launcher 3 Backend process management and connection pooling
mcp 2 MCP protocol types and connections
tty 2 Terminal detection utilities
auth 1 Authentication header parsing
envutil 1 Environment variable utilities
middleware 1 HTTP middleware (jq schema processing)
sys 1 System utilities
timeutil 1 Time formatting utilities
version 1 Version management

Identified Issues

Priority 1: High-Impact Refactoring

1. Duplicate Auth Middleware Application (server package)

Issue: Auth middleware is applied with identical pattern in two places.

Location: internal/server/transport.go:120-122 and 137-139

Current Code Pattern:

// Pattern repeated twice
finalHandler := shutdownHandler
if apiKey != "" {
    finalHandler = authMiddleware(apiKey, shutdownHandler.ServeHTTP)
}

finalCloseHandler := closeHandler
if apiKey != "" {
    finalCloseHandler = authMiddleware(apiKey, closeHandler.ServeHTTP)
}

Recommendation: Extract to helper function

func applyAuthIfConfigured(apiKey string, handler http.HandlerFunc) http.HandlerFunc {
    if apiKey != "" {
        return authMiddleware(apiKey, handler)
    }
    return handler
}

// Usage
finalHandler := applyAuthIfConfigured(apiKey, shutdownHandler)
finalCloseHandler := applyAuthIfConfigured(apiKey, closeHandler)

Estimated Impact: Reduced duplication, easier maintenance
Effort: 30 minutes


2. Misplaced Generic Logging Function (server package)

Issue: Generic runtime error logging function is in auth-specific file.

Function: logRuntimeError() in internal/server/auth.go

Problem: This function logs generic runtime errors and panics, not authentication-specific errors.

Current Location: internal/server/auth.go (lines ~50-60)

Recommendation: Move to internal/server/server.go or create internal/server/errors.go for error handling utilities.

Estimated Impact: Improved code organization, clearer file purpose
Effort: 15 minutes


3. Large Schema Normalization Function in Types File (mcp package)

Issue: 86-line schema normalization logic mixed with type definitions.

Function: NormalizeInputSchema() in internal/mcp/types.go

Problem: types.go should contain simple type definitions, not complex transformation logic.

Current: 86 lines of schema transformation logic in types.go
Better: Extract to internal/mcp/schema.go or internal/mcp/normalize.go

Recommendation:

  1. Create internal/mcp/schema.go
  2. Move NormalizeInputSchema() to new file
  3. Keep only type definitions in types.go

Estimated Impact: Clearer separation of concerns, easier to test
Effort: 20 minutes


4. Unnecessary Validation Wrapper Function (config package)

Issue: Thin wrapper function that adds no value.

Function: validateServerConfig() in internal/config/validation.go:104-106

Current Code:

func validateServerConfig(name string, srv ServerConfig, cfg *Config) error {
    return validateServerConfigWithCustomSchemas(name, srv, cfg, nil)
}

Problem: This wrapper just calls the main function with nil as the last parameter.

Recommendation: Remove wrapper, call validateServerConfigWithCustomSchemas(name, srv, cfg, nil) directly at call sites.

Estimated Impact: Reduced indirection, clearer code
Effort: 20 minutes


Priority 2: Medium-Impact Improvements

5. Docker Inspect Template Duplication (config package)

Issue: Three functions wrap runDockerInspect() with different format templates.

Location: internal/config/validation_env.go

Functions:

  • checkPortMapping() → uses template \{\{range .NetworkSettings.Ports}}...
  • checkStdinInteractive() → uses template \{\{.Config.OpenStdin}}
  • checkLogDirMounted() → uses template \{\{range .Mounts}}...

Recommendation: Extract format templates as package constants to make patterns explicit.

const (
    dockerInspectPortsTemplate = `\{\{range .NetworkSettings.Ports}}...`
    dockerInspectStdinTemplate = `\{\{.Config.OpenStdin}}`
    dockerInspectMountsTemplate = `\{\{range .Mounts}}...`
)

Estimated Impact: Easier to maintain and modify templates
Effort: 30 minutes


6. Shared Launch Logic Between Stateless/Stateful (launcher package)

Issue: GetOrLaunch() and GetOrLaunchForSession() share ~70% logic.

Location: internal/launcher/launcher.go

Shared Logic:

  • Timeout handling
  • Connection creation
  • Error logging
  • Success logging

Recommendation: Extract common flow to private helper launchConnection() that both functions call with appropriate parameters.

Estimated Impact: Reduced duplication, easier to modify launch logic
Effort: 1-2 hours


7. DIFC Violation Error Formatting (difc package)

Issue: Large formatting function (36 lines) in evaluator file.

Function: FormatViolationError() in internal/difc/evaluator.go:204

Problem: String formatting logic mixed with evaluation logic.

Recommendation: Extract to internal/difc/errors.go or internal/difc/formatting.go.

Estimated Impact: Clearer separation between evaluation and presentation
Effort: 20 minutes


8. Flag Default Getters Pattern (cmd package)

Issue: Three similar getDefault*() functions with env var fallback pattern.

Location: internal/cmd/flags_logging.go

Functions:

  • getDefaultLogDir() - env var fallback
  • getDefaultPayloadDir() - env var fallback
  • getDefaultPayloadSizeThreshold() - env var fallback

Current Pattern (repeated 3 times):

func getDefaultLogDir() string {
    if dir := os.Getenv("MCP_GATEWAY_LOG_DIR"); dir != "" {
        return dir
    }
    return "/tmp/gh-aw/mcp-logs"
}

Recommendation: Create generic helper with callback

func getEnvOrDefault[T any](envVar string, defaultVal T, parser func(string) (T, error)) T {
    if val := os.Getenv(envVar); val != "" {
        if parsed, err := parser(val); err == nil {
            return parsed
        }
    }
    return defaultVal
}

Estimated Impact: DRY principle, easier to add new env-backed defaults
Effort: 45 minutes


Positive Patterns Identified

✅ Excellent Use of Helpers

Logger Package - Already well-refactored:

  • logWithLevel() helper consolidates LogInfo/Warn/Error/Debug variants
  • logWithMarkdownLevel() consolidates markdown logging
  • logWithLevelAndServer() consolidates per-server logging
  • Generic initGlobalLogger[T]() and closeGlobalLogger[T]() use generics effectively

Config Package - Good consolidation:

  • expandVariablesCore() centralizes variable expansion logic
  • intPtrOrDefault() helper for optional fields
  • newLabelWithTags() (difc) reduces label duplication

Guard Package - Clean architecture:

  • Clear interface separation (guard.go defines interface, noop.go provides implementation)
  • Well-organized context helpers in context.go

Function Clustering Analysis

By Naming Pattern

Pattern Count Packages Assessment
New*() 45+ All ✅ Consistent constructor naming
Log*() 30+ logger, launcher ✅ Clear logging hierarchy
validate*() 13 config ✅ Well-organized validation
Handle*() 7 server ⚠️ Inconsistent casing (HandleHealth vs handleHealth)
Get*() 20+ Multiple ✅ Clear getter convention
Init*() 8 logger, cmd ✅ Initialization pattern
Close*() 7 logger ✅ Cleanup pattern

Outliers (Functions in Unexpected Files)

Function Current File Issue Priority
logRuntimeError() auth.go Not auth-specific High
NormalizeInputSchema() types.go Complex logic in types file High
FormatViolationError() evaluator.go Formatting in logic file Medium

Detailed Cluster Analysis

Cluster: Configuration Functions

Pattern: Configuration loading, parsing, and validation
Files: internal/config/ (8 files)
Key Functions:

  • LoadFromFile() - TOML loading
  • LoadFromStdin() - JSON loading from stdin
  • validateServerConfigWithCustomSchemas() - Server validation
  • ExpandRawJSONVariables() - Variable expansion

Analysis: ✅ Well-organized by functionality. Files are appropriately named:

  • config_core.go - Core TOML loading
  • config_stdin.go - Stdin JSON loading
  • validation.go - Core validation logic
  • validation_env.go - Environment validation
  • validation_schema.go - JSON schema validation

Minor Issue: Thin wrapper validateServerConfig() adds unnecessary indirection.


Cluster: Server Functions

Pattern: HTTP server and routing
Files: internal/server/ (10 files)
Key Functions:

  • CreateHTTPServerForMCP() - Unified mode server factory
  • CreateHTTPServerForRoutedMode() - Routed mode server factory
  • HandleHealth() - Health check handler
  • authMiddleware() - Auth middleware

Analysis: ⚠️ Good separation by file, but has duplication issues:

  • Duplicate auth middleware application (Priority 1)
  • logRuntimeError() in wrong file (Priority 1)
  • Legacy Server struct appears unused

Files are well-organized:

  • transport.go - HTTP transport layer
  • routed.go - Routed mode
  • unified.go - Unified mode
  • handlers.go - Endpoint handlers
  • auth.go - Auth middleware
  • health.go - Health checks

Cluster: Logger Functions

Pattern: Debug logging, file logging, RPC logging
Files: internal/logger/ (14 files)
Key Functions: 30+ logging functions across text/markdown/JSONL/per-server formats

Analysis: ✅ Excellent architecture with strong use of Go generics and helper functions.

Strengths:

  • Generic initGlobalLogger[T]() eliminates init duplication
  • Helper functions (logWithLevel, logWithMarkdownLevel) consolidate variants
  • Clear file organization by output format
  • Good use of generics for type-safe logger management

No major issues identified - this package shows best practices.


Cluster: Launcher Functions

Pattern: Backend process management and connection pooling
Files: internal/launcher/ (3 files)
Key Functions:

  • GetOrLaunch() - Stateless connection
  • GetOrLaunchForSession() - Stateful pooled connection
  • NewSessionConnectionPool() - Pool management

Analysis: ✅ Clean architecture with room for minor improvement.

Strengths:

  • Clear separation: launcher.go (main), connection_pool.go (pool), log_helpers.go (logging)
  • Good helper function organization

Minor Issue: ~70% code overlap between GetOrLaunch() and GetOrLaunchForSession() (Priority 2).


Implementation Checklist

High Priority (Do First)

Medium Priority (Follow-up)

Future Considerations

  • Review legacy Server struct in server.go (appears unused)
  • Standardize Handle* function casing (HandleHealth vs handleHealth)
  • Consider extracting tool conversion logic from unified.go

Analysis Metadata

  • Total Go Files Analyzed: 63 (excluding test files)
  • Total Packages: 16
  • Function Clusters Identified: 8 major clusters
  • Outliers Found: 3 high-priority, 5 medium-priority
  • Duplicates Detected: 4 patterns with significant duplication
  • Detection Method: Semantic code analysis + naming pattern analysis
  • Analysis Date: 2026-02-18

Conclusion

The codebase demonstrates high-quality Go practices with excellent use of:

  • Generic functions for type-safe code reuse
  • Helper functions to eliminate duplication
  • Clear file organization by feature/purpose
  • Consistent naming conventions

The identified refactoring opportunities are minor improvements that will:

  • Reduce code duplication by ~200 lines
  • Improve file organization and code clarity
  • Make the codebase easier to maintain and extend

Recommended approach: Address high-priority issues first (estimated 2-3 hours total), then tackle medium-priority improvements in subsequent PRs.

Generated by Semantic Function Refactoring

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions