Skip to content

[refactor] Semantic Function Clustering - High-Priority Refactoring Opportunities #728

@github-actions

Description

@github-actions

🔧 Semantic Function Clustering Analysis

Analysis performed on repository: github/gh-aw-mcpg
Analysis date: 2026-02-05
Total Go files analyzed: 62 files (39 non-test files)

Executive Summary

The codebase demonstrates excellent overall organization with clear package boundaries and minimal code duplication. The analysis identified 2 high-priority refactoring opportunities and several minor organizational improvements. Most patterns follow Go idioms and domain-driven design principles.

Key Findings:

  • Strong package cohesion - Most files serve a single, clear purpose
  • Minimal duplication - Most "similar" functions serve different domains
  • ⚠️ 1 misplaced function - TruncateSessionID in auth package (used only by logger)
  • ⚠️ 1 outlier function - runDockerInspect in validation package (Docker operation)
  • Well-organized helpers - Domain-specific helpers properly placed

Function Inventory Summary

Packages Analyzed

Package Files Primary Purpose Organization Quality
internal/auth 1 Authentication header parsing ✅ Good (1 outlier)
internal/cmd 8 CLI commands and flags ✅ Excellent
internal/config 8 Configuration loading/validation ⚠️ Good (1 outlier)
internal/difc 5 DIFC security labels ✅ Excellent
internal/guard 4 Security guards ✅ Excellent
internal/launcher 3 Backend process management ✅ Excellent
internal/logger 13 Logging framework ✅ Excellent
internal/mcp 2 MCP protocol types ✅ Excellent
internal/middleware 1 Request middleware ✅ Excellent
internal/server 10 HTTP server implementation ✅ Excellent
internal/sys 1 System utilities ✅ Good
internal/testutil 4 Testing utilities ✅ Excellent
internal/timeutil 1 Time formatting ✅ Excellent
internal/tty 2 TTY detection ✅ Excellent
internal/version 1 Version management ✅ Excellent

Total Functions Cataloged: ~350 functions and methods
Semantic Clusters Identified: 9 major patterns
High-Priority Issues Found: 2
Duplicates Detected: 0 (all similar functions serve different domains)


Identified Issues

1. ⚠️ Misplaced Function: TruncateSessionID

Priority: 🔴 HIGH
Impact: Medium - Code organization and discoverability

Problem

TruncateSessionID() function is defined in internal/auth/header.go but is exclusively used by the logger package (internal/server/sdk_logging.go, 7 occurrences). This violates the principle of placing functions close to their usage.

Current Location

// File: internal/auth/header.go:172
func TruncateSessionID(sessionID string) string {
    if len(sessionID) <= 8 {
        return sessionID
    }
    return sessionID[:8] + "..."
}

Usage Analysis

All 7 calls are in internal/server/sdk_logging.go:

  • Line 54: Logging session comparison
  • Line 70: JSON-RPC request logging
  • Line 111: Error response logging
  • Lines 119-120: Session ID diagnostics
  • Lines 125-126: Session mismatch logging

Evidence

$ grep -r "TruncateSessionID" internal/ --include="*.go" | grep -v "_test.go"
internal/server/sdk_logging.go:54:     mode, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:70:     mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID))
internal/server/sdk_logging.go:111:    mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:119:    logSDK.Printf("    Session ID: %s", auth.TruncateSessionID(sessionID))
internal/server/sdk_logging.go:120:    logSDK.Printf("    MCP-Session-Id header: %s", auth.TruncateSessionID(mcpSessionID))
internal/server/sdk_logging.go:125:    mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:126:    auth.TruncateSessionID(mcpSessionID), errorMsg)
internal/auth/header.go:172:func TruncateSessionID(sessionID string) string {

Recommendation

Move TruncateSessionID to internal/logger/sanitize/sanitize.go

Benefits:

  1. ✅ Collocates with similar function TruncateSecret() (same purpose, different length)
  2. ✅ Places function near its only usage (logger package)
  3. ✅ Improves discoverability - all truncation helpers in one place
  4. ✅ Maintains consistency with existing sanitize package purpose

Implementation Steps:

  1. Move function from internal/auth/header.go to internal/logger/sanitize/sanitize.go
  2. Update imports in internal/server/sdk_logging.go: auth.TruncateSessionIDsanitize.TruncateSessionID
  3. Optionally: Parameterize truncation length for reusability
    func TruncateString(s string, maxLen int) string {
        if len(s) <= maxLen {
            return s
        }
        return s[:maxLen] + "..."
    }

Estimated Effort: 30-45 minutes
Risk: Low - Simple move with clear usage pattern


2. ⚠️ Outlier Function: runDockerInspect in validation_env.go

Priority: 🟡 MEDIUM
Impact: Low-Medium - Single-responsibility principle violation

Problem

runDockerInspect() is a Docker operation helper located in a validation package. This function executes docker inspect commands but doesn't perform validation logic itself.

Current Location

// File: internal/config/validation_env.go:156
func runDockerInspect(containerID, formatTemplate string) (string, error) {
    cmd := exec.Command("docker", "inspect", 
        "--format", formatTemplate, containerID)
    output, err := cmd.Output()
    // ... error handling ...
    return strings.TrimSpace(string(output)), nil
}

Analysis

Why it's an outlier:

  • ❌ Executes Docker commands (infrastructure operation)
  • ❌ Not validation logic (just a Docker client wrapper)
  • ❌ Breaks single-responsibility principle for validation_env.go
  • ✅ Used by validation functions (5 callers in same file)

Why current placement is acceptable:

  • ✅ Only used within validation_env.go (unexported helper)
  • ✅ Tightly coupled to containerized validation checks
  • ✅ Small function (~20 lines)

Recommendation

Option A (Recommended): Keep as-is with improved documentation

  • Add comment explaining it's a Docker operation helper for validation
  • Document that it's intentionally kept in-file due to tight coupling
  • Estimated Effort: 5 minutes

Option B (Future Enhancement): Create internal/docker/ package

  • Extract Docker operations if more Docker helpers are added
  • Consolidate Docker client interactions
  • Estimated Effort: 2-3 hours (requires careful dependency analysis)
  • Trigger: If 3+ Docker helper functions emerge

Current Recommendation: Option A - The function is an acceptable outlier given its tight coupling and single-file usage.


Detailed Function Clusters

Cluster 1: Validation Functions ✅

Pattern: validate*, check*, ensure*
Files: 8 files across config/, auth/, server/ packages
Organization:Excellent - Well-distributed by domain

Examples:

  • config/validation.go: Config structure validation
  • config/validation_env.go: Environment validation
  • config/validation_schema.go: JSON schema validation
  • config/rules/rules.go: Validation rule constructors
  • auth/header.go: API key validation
  • server/http_helpers.go: Session validation

Analysis: Each validation function is correctly placed in its domain package. No consolidation needed.


Cluster 2: Logging Functions ✅

Pattern: Log*, log*, format*, sanitize*
Files: 13 files in logger/ package + domain-specific helpers
Organization:Excellent - Centralized core with domain helpers

Core Logger Files:

  • logger/file_logger.go - File-based logging
  • logger/jsonl_logger.go - JSONL format logging
  • logger/markdown_logger.go - Markdown format logging
  • logger/server_file_logger.go - Per-server log files
  • logger/sanitize/sanitize.go - Secret sanitization

Domain-Specific Helpers:

  • launcher/log_helpers.go - Launcher-specific logging (✅ Correctly placed)
  • server/auth.go: logRuntimeError - Runtime error logging (✅ Correctly placed)
  • server/http_helpers.go: logHTTPRequestBody - HTTP debugging (✅ Correctly placed)

Analysis: The separation between core logging (logger package) and domain helpers (in domain packages) follows Go best practices. No changes needed.


Cluster 3: HTTP/Transport Functions ✅

Pattern: handle*, create*, setup*, http*, transport*
Files: 10 files in server/ and mcp/ packages
Organization:Excellent - Clear separation by mode

Files:

  • server/handlers.go - Generic HTTP handlers
  • server/health.go - Health endpoint
  • server/routed.go - Routed mode implementation
  • server/unified.go - Unified mode implementation
  • server/transport.go - Transport layer
  • mcp/connection.go - MCP protocol connections

Analysis: Clear functional separation. Each file serves a distinct purpose. No refactoring needed.


Cluster 4: Initialization Functions ✅

Pattern: New*, Init*, Create*, Register*
Files: Distributed across all packages
Organization:Excellent - Idiomatic Go constructors

Patterns Identified:

  • New* - Constructor for structs (e.g., NewLauncher, NewUnified)
  • Init* - Initialize global state (e.g., InitFileLogger, InitServerFileLogger)
  • Create* - Factory functions (e.g., CreateHTTPServerForMCP)
  • Register* - Registration pattern (e.g., RegisterGuardType, RegisterDefaults)

Analysis: Naming conventions are consistent and idiomatic. The generic helper logger/common.go:initLogger[T] demonstrates good use of generics to reduce duplication.


Cluster 5: Formatting Functions ✅

Pattern: format*, Format*, truncate*, Truncate*
Files: 7 files across logger/, config/, auth/, difc/ packages
Organization: ⚠️ Good (1 outlier: TruncateSessionID)

Core Formatting:

  • logger/rpc_formatter.go - RPC message formatting
  • logger/sanitize/sanitize.go - Secret truncation/sanitization
  • timeutil/format.go - Duration formatting

Domain-Specific Formatting:

  • config/validation_schema.go: formatSchemaError - JSON schema errors (✅ Correctly placed)
  • difc/evaluator.go: FormatViolationError - DIFC violations (✅ Correctly placed)
  • auth/header.go: TruncateSessionID - Session ID truncation (⚠️ Outlier - see Issue Configure as a Go CLI tool #1)

Analysis: Domain-specific formatters are correctly placed except for TruncateSessionID (see Issue #1).


Cluster 6: Parsing/Extraction Functions ✅

Pattern: parse*, extract*, Parse*, Extract*
Files: 5 files across auth/, mcp/, logger/, server/ packages
Organization:Excellent

Examples:

  • auth/header.go: ParseAuthHeader, ExtractSessionID, ExtractAgentID
  • mcp/connection.go: parseSSEResponse
  • logger/rpc_helpers.go: extractEssentialFields, ExtractErrorMessage
  • server/unified.go: parseToolArguments

Analysis: All parsing functions are correctly placed in their domain packages. No consolidation needed.


Cluster 7: Connection/Session Management ✅

Pattern: Get*, Set*, Delete*, connection*, session*
Files: 4 files in launcher/, mcp/, server/ packages
Organization:Excellent - Clear ownership boundaries

Key Files:

  • launcher/connection_pool.go - Connection pool CRUD operations
  • launcher/launcher.go - Connection lifecycle (GetOrLaunch, GetOrLaunchForSession)
  • mcp/connection.go - MCP protocol connections
  • server/unified.go - Session management

Analysis: Well-defined interfaces with clear responsibilities. No refactoring needed.


Cluster 8: Cleanup/Close Functions ✅

Pattern: Close, close*, Stop, cleanup*
Files: Distributed across multiple packages
Organization:Excellent - Following Go interface patterns

Examples:

  • launcher/connection_pool.go: Stop, cleanupIdleConnections
  • launcher/launcher.go: Close
  • logger/*: Close methods on all logger types
  • mcp/connection.go: Close
  • server/unified.go: Close

Analysis: Consistent use of Close() method for resource cleanup following Go's io.Closer pattern. No changes needed.


Cluster 9: DIFC Security Functions ✅

Pattern: DIFC-specific operations (labels, evaluation, guards)
Files: 5 files in difc/ package
Organization:Excellent - Cohesive security domain

Files:

  • difc/agent.go - Agent label management
  • difc/capabilities.go - Capabilities tracking
  • difc/evaluator.go - Access control evaluation
  • difc/labels.go - Label types and operations
  • difc/resource.go - Resource labeling

Analysis: DIFC implementation is well-isolated in a dedicated package with clear interfaces. Excellent domain-driven design.


Duplicate Analysis: No True Duplicates Found

Investigated Candidates

❌ Not Duplicates: Environment Variable Expansion

Functions:

  • config/validation.go: expandVariables - Runtime config value expansion
  • cmd/root.go: loadEnvFile - .env file parsing
  • mcp/connection.go: expandDockerEnvArgs - Docker argument expansion

Analysis: These serve different contexts and have different error handling requirements. Consolidation would create artificial coupling.


❌ Not Duplicates: Empty Checking

Functions:

  • logger/rpc_helpers.go: isEffectivelyEmpty - JSON-RPC data validation
  • difc/labels.go: IsEmpty - DIFC label checking

Analysis: Type-specific logic operating on different data structures. No duplication.


✅ Acceptable Pattern: Error Formatting

Functions:

  • config/validation_schema.go: formatSchemaError - JSON schema errors
  • difc/evaluator.go: FormatViolationError - DIFC violations
  • config/rules/rules.go: Error constructors - Config validation errors

Analysis: Domain-specific error formatting is idiomatic Go. Each formatter handles its domain's error context. Consolidation would lose domain specificity.


✅ Acceptable Pattern: Logging Initialization

Functions:

  • logger/file_logger.go: InitFileLogger
  • logger/jsonl_logger.go: InitJSONLLogger
  • logger/markdown_logger.go: InitMarkdownLogger
  • logger/server_file_logger.go: InitServerFileLogger

Analysis: All use the generic helper logger/common.go:initLogger[T]. Type-specific wrappers are needed for their return types. No further consolidation possible.


Summary of Recommendations

Priority 1: High Impact (Implement Soon)

# Issue Effort Risk Impact
1 Move TruncateSessionID to logger/sanitize/ 30-45 min Low Medium

Implementation Checklist:

  • Move TruncateSessionID from internal/auth/header.go to internal/logger/sanitize/sanitize.go
  • Update import in internal/server/sdk_logging.go
  • Run tests: make test
  • Verify no functionality broken

Priority 2: Medium Impact (Consider for Future)

# Issue Effort Risk Impact
2 Document runDockerInspect placement 5 min None Low
3 Consider internal/docker/ package if more Docker helpers emerge 2-3 hours Medium Medium

Architecture Insights

✅ Strengths Identified

  1. Excellent Package Cohesion

    • Each package has a clear, single purpose
    • Dependencies flow in correct direction (no circular dependencies detected)
    • Clear separation between core logic and domain helpers
  2. Minimal Code Duplication

    • Functions with similar names serve different domains
    • Generic helpers exist where appropriate (e.g., logger/common.go:initLogger[T])
    • Type-specific wrappers are kept minimal
  3. Idiomatic Go Patterns

    • Consistent constructor naming (New*, Init*, Create*)
    • Proper use of interfaces (io.Closer, Guard, BackendCaller)
    • Domain-driven design with clear boundaries
  4. Good Use of Generics

    • logger/common.go:initLogger[T] eliminates logger initialization duplication
    • Type constraints properly used (closableLogger constraint)
  5. Security-First Design

    • Dedicated logger/sanitize/ package for secret handling
    • DIFC implementation isolated in dedicated package
    • Consistent truncation for safe logging

⚠️ Minor Observations

  1. Naming Convention Inconsistency (Minor)

    • Mix of New* vs Create* for constructors
    • Mix of Init* vs New* for initialization
    • Recommendation: Document conventions in CONTRIBUTING.md
  2. Large Files

    • internal/server/unified.go - 1008 lines
    • internal/mcp/connection.go - 999 lines
    • Status: Acceptable - Each implements a cohesive feature set

Analysis Metadata

  • Total Go Files Analyzed: 62 files (39 non-test + 23 test files)
  • Total Functions/Methods Cataloged: ~350
  • Semantic Clusters Identified: 9 major patterns
  • High-Priority Issues: 1 (TruncateSessionID misplacement)
  • Medium-Priority Issues: 1 (runDockerInspect outlier)
  • True Duplicates Found: 0
  • Detection Method: Manual semantic analysis + grep pattern matching
  • Time to Analyze: ~45 minutes
  • Analysis Tool: GitHub Copilot CLI Agent

Conclusion

The gh-aw-mcpg codebase demonstrates excellent code organization with minimal refactoring needs. The single high-priority issue (TruncateSessionID misplacement) is straightforward to fix. Most functions are correctly placed following Go best practices and domain-driven design principles.

Recommended Action: Implement Priority 1 refactoring (TruncateSessionID move) in the next maintenance cycle. Priority 2 items are optional enhancements.

Overall Code Quality Rating: ⭐⭐⭐⭐⭐ (5/5) - Exceptionally well-organized codebase

AI generated by Semantic Function Refactoring

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions