[EPIC][TESTING]: Achieve 80%+ Code Coverage with CI/CD Enforcement #2625

@crivetimihai

🧪 Epic: Achieve 80%+ Code Coverage with CI/CD Enforcement

Goal

Implement a comprehensive code coverage quality gate that enforces 80% minimum coverage across the MCP Gateway codebase. This includes upgrading the existing pytest coverage infrastructure, configuring per-file coverage thresholds, implementing CI/CD enforcement that blocks PRs below the threshold, generating coverage reports and badges, and providing developer tooling to identify and close coverage gaps systematically.

Why Now?

With MCP Gateway approaching production maturity and increasing adoption, code quality assurance becomes critical:

  1. Quality Assurance: At 75% coverage, a quarter of the measured code is never executed by tests, so bugs in it can reach production undetected
  2. Regression Prevention: Higher coverage catches regressions early before they reach users
  3. Refactoring Safety: Coverage enables confident refactoring knowing tests will catch breaking changes
  4. CI/CD Enforcement: Automated gates prevent coverage regression as the codebase grows
  5. Documentation: Coverage reports serve as living documentation of tested vs untested code paths
  6. Contributor Onboarding: Clear coverage requirements guide new contributors on testing expectations

Current State:

  • Total coverage: 75% (48,627 statements, 10,580 missing)
  • Branch coverage: 82% (14,800 branches, 2,597 missing)
  • CI/CD threshold: 60% (too permissive, allows regression)
  • Per-file enforcement: None

Target State:

  • Total coverage: 80%+ with path to 90%
  • Branch coverage: 85%+
  • CI/CD threshold: 80% (blocks PRs below threshold)
  • Per-file enforcement: 70% minimum per file

📖 User Stories

US-1: CI/CD Engineer - Enforce Coverage Threshold in Pipeline

As a CI/CD Engineer
I want the GitHub Actions pipeline to fail if coverage drops below 80%
So that we prevent coverage regression and maintain quality standards

Acceptance Criteria:

Given a PR is submitted to the main branch
When pytest runs with coverage measurement
Then the build should:
  - Measure line and branch coverage for mcpgateway/
  - Compare against 80% threshold
  - FAIL the build if coverage < 80%
  - PASS the build if coverage >= 80%
  - Display coverage summary in GitHub Actions logs
  - Post coverage delta as PR comment (optional)
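
The pass/fail rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: `CoverageStats` and `passes_gate` are hypothetical names, the numbers are made up, and the sketch assumes the combined percentage is computed as covered statements plus covered branches over total statements plus total branches, which is how coverage.py is understood to report totals when branch measurement is enabled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageStats:
    """Hypothetical container for the four counts a coverage run produces."""

    statements: int
    missing_statements: int
    branches: int
    missing_branches: int

    def percent(self) -> float:
        """Combined line + branch coverage as a percentage."""
        total = self.statements + self.branches
        if total == 0:
            return 100.0  # nothing to measure counts as fully covered
        covered = (self.statements - self.missing_statements) + (
            self.branches - self.missing_branches
        )
        return 100.0 * covered / total


def passes_gate(stats: CoverageStats, threshold: float = 80.0) -> bool:
    """Return True when coverage meets the threshold (>= passes, < fails)."""
    return stats.percent() >= threshold


# Illustrative numbers only, not measurements from the repository.
example = CoverageStats(
    statements=1000, missing_statements=150, branches=200, missing_branches=60
)
verdict = "PASS" if passes_gate(example) else "FAIL"
print(f"{example.percent():.1f}% -> {verdict}")
```

In CI the same decision is made by `--cov-fail-under=80`; the sketch only makes the boundary behaviour explicit (exactly 80% passes).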

Technical Requirements:

  • Update .github/workflows/pytest.yml with --cov-fail-under=80
  • Configure branch coverage enforcement
  • Add coverage summary to GitHub job summary
  • Block PR merge on coverage failure

US-2: Developer - Identify Coverage Gaps Locally

As a Developer
I want to easily identify which files and lines lack test coverage
So that I can write targeted tests to close coverage gaps

Acceptance Criteria:

Given I run `make coverage-gaps`
Then I should see:
  - Files below 70% coverage sorted by coverage %
  - Missing line numbers for each file
  - Suggested test locations
  - Total uncovered statements count

Given I run `make htmlcov`
Then I should see:
  - Interactive HTML coverage report at docs/docs/coverage/index.html
  - Line-by-line coverage highlighting
  - Branch coverage indicators
  - Sortable file list by coverage %
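
One way to produce the `make coverage-gaps` listing is to read the JSON report written by `coverage json` rather than parsing terminal output. The sketch below is an assumption about how such a helper could look: `coverage_gaps` is a hypothetical function, the sample data is hand-written in the shape of a coverage.py JSON report, and the file names are placeholders.

```python
import json
from typing import Any


def coverage_gaps(
    report: dict[str, Any], threshold: float = 70.0
) -> list[tuple[str, float, list[int]]]:
    """Return (path, percent, missing_lines) for files below threshold, lowest first."""
    gaps = []
    for path, data in report.get("files", {}).items():
        percent = data["summary"]["percent_covered"]
        if percent < threshold:
            gaps.append((path, percent, data.get("missing_lines", [])))
    return sorted(gaps, key=lambda item: item[1])


# Hand-written sample; in practice this would be json.load() on coverage.json.
sample = json.loads(
    """
    {
      "files": {
        "mcpgateway/example_a.py": {
          "summary": {"percent_covered": 52.0},
          "missing_lines": [10, 11, 42]
        },
        "mcpgateway/example_b.py": {
          "summary": {"percent_covered": 91.5},
          "missing_lines": [7]
        },
        "mcpgateway/example_c.py": {
          "summary": {"percent_covered": 43.0},
          "missing_lines": [3, 4]
        }
      }
    }
    """
)

gaps = coverage_gaps(sample)
for path, percent, missing in gaps:
    print(f"{percent:5.1f}%  {path}  missing lines: {missing}")
print("total uncovered statements:", sum(len(m) for _, _, m in gaps))
```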

Technical Requirements:

  • Create make coverage-gaps target showing low-coverage files
  • Enhance make htmlcov to generate comprehensive HTML report
  • Add make coverage-annotate to generate annotated source copies (coverage.py names them with a ",cover" suffix, e.g. module.py,cover)
  • Document coverage workflow in tests/AGENTS.md

US-3: Release Manager - Coverage Badge and Reporting

As a Release Manager
I want automated coverage badges and reports in documentation
So that stakeholders can track coverage trends over time

Acceptance Criteria:

Given main branch receives a merged PR
When the pytest workflow completes successfully
Then the system should:
  - Generate coverage.xml for external tools
  - Generate HTML report at docs/docs/coverage/
  - Update coverage badge SVG at .github/badges/coverage.svg
  - Update coverage summary in docs/docs/test/coverage.md
  - Publish trend data (optional: Codecov/Coveralls integration)
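
The markdown summary for docs/docs/test/coverage.md could be generated along these lines. This is a sketch under stated assumptions: `module_of` and `markdown_summary` are hypothetical helpers, grouping by the first two path components is one possible definition of "module", and the input numbers are invented.

```python
from collections import defaultdict


def module_of(path: str) -> str:
    """Group files by their first two path components, e.g. mcpgateway/services."""
    parts = path.split("/")
    return "/".join(parts[:2]) if len(parts) > 2 else parts[0]


def markdown_summary(files: dict[str, tuple[int, int]]) -> str:
    """Build a markdown table from {path: (statements, missing)}."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for path, (statements, missing) in files.items():
        bucket = totals[module_of(path)]
        bucket[0] += statements
        bucket[1] += missing

    lines = [
        "| Module | Statements | Missing | Coverage |",
        "| --- | ---: | ---: | ---: |",
    ]
    for module in sorted(totals):
        statements, missing = totals[module]
        pct = 100.0 * (statements - missing) / statements if statements else 100.0
        lines.append(f"| {module} | {statements} | {missing} | {pct:.1f}% |")
    return "\n".join(lines)


# Invented per-file counts, for illustration only.
summary = markdown_summary(
    {
        "mcpgateway/services/example_a.py": (100, 10),
        "mcpgateway/services/example_b.py": (100, 30),
        "mcpgateway/example_top.py": (50, 5),
    }
)
print(summary)
```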

Technical Requirements:

  • Configure coverage-badge generation in CI/CD
  • Create markdown coverage summary generator
  • Document badge embedding in README.md
  • Optional: Integrate with Codecov or Coveralls for trending

US-4: Tech Lead - Per-File Coverage Enforcement

As a Tech Lead
I want per-file minimum coverage requirements
So that critical files maintain adequate test coverage

Acceptance Criteria:

Given per-file thresholds are configured:
  default_threshold: 70%
  critical_files:
    - mcpgateway/services/*.py: 85%
    - mcpgateway/routers/*.py: 80%
    - mcpgateway/middleware/*.py: 80%
When a PR modifies a critical file
And the file's coverage drops below its threshold
Then the build should fail with a clear error message
And suggest which lines need test coverage
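
A custom enforcement script for these thresholds might look like the following. It is a hedged sketch, not the final script: `threshold_for`, `find_violations`, and the file names in the sample are hypothetical, and the per-file percentages would come from the coverage report rather than a literal dict. Note that `fnmatchcase` lets `*` match `/`, so `mcpgateway/services/*.py` also covers nested subpackages.

```python
from fnmatch import fnmatchcase

# Thresholds from the acceptance criteria; the first matching pattern wins.
CRITICAL_THRESHOLDS = [
    ("mcpgateway/services/*.py", 85.0),
    ("mcpgateway/routers/*.py", 80.0),
    ("mcpgateway/middleware/*.py", 80.0),
]
DEFAULT_THRESHOLD = 70.0


def threshold_for(path: str) -> float:
    """Return the minimum coverage required for a file path."""
    for pattern, minimum in CRITICAL_THRESHOLDS:
        if fnmatchcase(path, pattern):
            return minimum
    return DEFAULT_THRESHOLD


def find_violations(per_file: dict[str, float]) -> list[str]:
    """Return one readable message per file that is below its threshold."""
    violations = []
    for path in sorted(per_file):
        required = threshold_for(path)
        actual = per_file[path]
        if actual < required:
            violations.append(
                f"{path}: {actual:.1f}% is below the required {required:.1f}%"
            )
    return violations


# Invented percentages for placeholder files.
violations = find_violations(
    {
        "mcpgateway/services/example_service.py": 82.0,
        "mcpgateway/routers/example_router.py": 81.0,
        "mcpgateway/utils/example_util.py": 69.5,
    }
)
for message in violations:
    print(message)
exit_code = 1 if violations else 0  # a CI wrapper would pass this to sys.exit()
```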

Technical Requirements:

  • Configure per-file thresholds in pyproject.toml or coverage config
  • Create custom coverage enforcement script
  • Integrate per-file checks into CI/CD pipeline
  • Allow threshold exemptions with inline comments

US-5: QA Engineer - Coverage Trend Analysis

As a QA Engineer
I want to track coverage trends across releases
So that I can identify coverage regression patterns

Acceptance Criteria:

Given coverage data is collected over multiple builds
When I view the coverage dashboard
Then I should see:
  - Coverage percentage over time graph
  - Per-module coverage breakdown
  - Files with declining coverage highlighted
  - Comparison between releases/milestones
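
For the historical data, a small SQLite table is probably enough. The sketch below assumes a hypothetical schema (`coverage_history`) and helper names (`open_history`, `record`, `significant_drops`), and uses an in-memory database with invented values; the 1.0 point alert threshold is likewise an assumption.

```python
import sqlite3


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database with a single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS coverage_history ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " commit_sha TEXT NOT NULL,"
        " percent REAL NOT NULL)"
    )
    return conn


def record(conn: sqlite3.Connection, commit_sha: str, percent: float) -> None:
    """Append one build's total coverage."""
    conn.execute(
        "INSERT INTO coverage_history (commit_sha, percent) VALUES (?, ?)",
        (commit_sha, percent),
    )
    conn.commit()


def significant_drops(
    conn: sqlite3.Connection, max_drop: float = 1.0
) -> list[tuple[str, float]]:
    """Return (commit_sha, drop) for builds that fell more than max_drop points."""
    rows = conn.execute(
        "SELECT commit_sha, percent FROM coverage_history ORDER BY id"
    ).fetchall()
    drops = []
    for (_, before), (sha, after) in zip(rows, rows[1:]):
        if before - after > max_drop:
            drops.append((sha, before - after))
    return drops


history = open_history()
for sha, pct in [("aaa", 78.0), ("bbb", 79.5), ("ccc", 77.0), ("ddd", 76.5)]:
    record(history, sha, pct)  # invented commits and percentages
drops = significant_drops(history)
print(drops)
```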

Technical Requirements:

  • Store historical coverage data (JSON/SQLite)
  • Generate trend reports in markdown
  • Optional: Integrate with external coverage tracking service
  • Alert on significant coverage drops

🏗 Architecture

Coverage Enforcement Flow

graph TD
    subgraph "Developer Workflow"
        D1[Write Code] --> D2[Run make test]
        D2 --> D3{Local Coverage OK?}
        D3 -->|No| D4[Write More Tests]
        D4 --> D2
        D3 -->|Yes| D5[Commit & Push]
    end
    
    subgraph "CI/CD Pipeline"
        D5 --> C1[GitHub Actions Triggered]
        C1 --> C2[pytest --cov=mcpgateway]
        C2 --> C3[Measure Coverage]
        C3 --> C4{Coverage >= 80%?}
        C4 -->|No| C5[FAIL Build]
        C5 --> C6[Block PR Merge]
        C4 -->|Yes| C7[PASS Build]
        C7 --> C8[Generate Reports]
    end
    
    subgraph "Reporting"
        C8 --> R1[HTML Report]
        C8 --> R2[Coverage Badge]
        C8 --> R3[Markdown Summary]
        C8 --> R4[XML for Tools]
    end

Coverage Configuration Architecture

graph LR
    subgraph "Configuration"
        CFG1[pyproject.toml] --> CFG2[tool.coverage.run]
        CFG1 --> CFG3[tool.coverage.report]
        CFG1 --> CFG4[tool.coverage.html]
    end
    
    subgraph "Execution"
        EX1[pytest] --> EX2[coverage.py]
        EX2 --> EX3[Measure Lines]
        EX2 --> EX4[Measure Branches]
    end
    
    subgraph "Output"
        EX3 --> OUT1[coverage.xml]
        EX4 --> OUT2[htmlcov/]
        EX3 --> OUT3[.coverage]
        EX4 --> OUT4[term-report]
    end

📋 Implementation Tasks

Phase 1: CI/CD Enforcement ✅

  • Update pytest workflow threshold

    • Modify .github/workflows/pytest.yml
    • Change --cov-fail-under=60 to --cov-fail-under=80
    • Add branch coverage enforcement
    • Test with intentionally low coverage PR
  • Add coverage summary to job output

    • Enable GitHub job summary for coverage
    • Show per-module coverage breakdown
    • Highlight files below threshold
    • Include trend comparison (if available)
  • Block PR merge on failure

    • Configure required status checks
    • Ensure coverage job is mandatory
    • Test enforcement with failing PR

Phase 2: Developer Tooling ✅

  • Create make coverage-gaps target

    • Script to list files below 70% coverage
    • Show missing line numbers
    • Sort by coverage percentage
    • Estimate effort to reach threshold
  • Enhance make htmlcov target

    • Generate HTML to docs/docs/coverage/
    • Include branch coverage visualization
    • Add sortable file list
    • Include timestamp and commit hash
  • Add make coverage-annotate target

    • Generate annotated ",cover" files (e.g. module.py,cover) for each source file
    • Indicate covered/uncovered lines inline
    • Support IDE integration (VSCode, PyCharm)
  • Update make test target

    • Run with coverage by default
    • Show summary after test run
    • Fail locally if below threshold (optional)
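
The "estimate effort to reach threshold" item reduces to simple arithmetic on statement counts. The helper below is a hypothetical sketch that considers statements only and ignores branches, so it gives a rough lower bound rather than an exact figure.

```python
import math


def statements_to_cover(total: int, missing: int, target_percent: float) -> int:
    """How many currently-missing statements must become covered to hit the target."""
    covered = total - missing
    required = math.ceil(total * target_percent / 100.0)
    return max(0, required - covered)


# Illustrative file: 1000 statements, 250 missing (75% covered), target 80%.
needed = statements_to_cover(total=1000, missing=250, target_percent=80.0)
print(f"cover {needed} more statements")
```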

Phase 3: Reporting Infrastructure ✅

  • Implement coverage badge generation

    • Generate SVG badge after successful build
    • Commit badge to .github/badges/coverage.svg
    • Document badge usage in README.md
  • Create markdown coverage report

    • Generate docs/docs/test/coverage.md
    • Include module-level breakdown
    • Show files needing attention
    • Update automatically in CI/CD
  • Configure XML export for tools

    • Generate coverage.xml in standard format
    • Support SonarQube integration
    • Support Codecov/Coveralls (optional)

Phase 4: Close Coverage Gaps ✅

  • Identify high-impact low-coverage files

    • Services layer (target: 85%)
    • Routers layer (target: 80%)
    • Middleware layer (target: 80%)
    • Utils layer (target: 75%)
  • Write tests for critical paths

    • Authentication flows
    • Tool invocation paths
    • Error handling branches
    • Edge cases and boundary conditions
  • Add tests for uncovered branches

    • Exception handlers
    • Validation error paths
    • Configuration edge cases
    • Async error handling

Phase 5: Per-File Enforcement ✅

  • Define per-file thresholds

    • Critical files: 85% minimum
    • Standard files: 70% minimum
    • Allow exemptions with justification
  • Create enforcement script

    • Parse coverage.json per-file data
    • Compare against configured thresholds
    • Report violations clearly
    • Exit non-zero on violations
  • Integrate into CI/CD

    • Run per-file check after pytest
    • Fail build on threshold violations
    • Generate per-file coverage report

Phase 6: Documentation ✅

  • Update tests/AGENTS.md

    • Document coverage workflow
    • Explain make targets
    • Describe threshold requirements
    • Provide troubleshooting guide
  • Create coverage contribution guide

    • How to check local coverage
    • How to write effective tests
    • How to request threshold exemptions
    • Best practices for high coverage
  • Document CI/CD enforcement

    • Explain what blocks PRs
    • How to fix coverage failures
    • How to interpret reports

⚙️ Configuration

pyproject.toml Coverage Configuration

[tool.coverage.run]
source = ["mcpgateway"]
branch = true
omit = [
    "*/tests/*",
    "*/test_*.py",
    "*/__init__.py",
    "*/alembic/*",
    "*/version.py",
    # Builder deployment files - require external tools
    "mcpgateway/tools/builder/common.py",
    "mcpgateway/tools/builder/dagger_deploy.py",
    "mcpgateway/tools/builder/python_deploy.py"
]

[tool.coverage.report]
show_missing = true
skip_covered = false
fail_under = 80
precision = 1
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "if self.debug:",
    "if settings.DEBUG",
    "raise AssertionError",
    "raise NotImplementedError",
    "if 0:",
    "if __name__ == .__main__.:",
    "class .*\\bProtocol\\):",
    "@(abc\\.)?abstractmethod",
]

[tool.coverage.html]
directory = "docs/docs/coverage"
title = "MCP Gateway Coverage Report"
show_contexts = true

GitHub Actions Workflow Update

# .github/workflows/pytest.yml
- name: 🧪 Run pytest with coverage
  run: |
    uv run pytest \
      --ignore=tests/fuzz \
      --cov=mcpgateway \
      --cov-report=xml \
      --cov-report=html \
      --cov-report=term \
      --cov-branch \
      --cov-fail-under=80

- name: 📊 Coverage Summary
  if: always()
  run: |
    echo "### Coverage Report" >> "$GITHUB_STEP_SUMMARY"
    uv run coverage report --format=markdown >> "$GITHUB_STEP_SUMMARY"

Makefile Targets

# Coverage targets
.PHONY: coverage coverage-gaps htmlcov coverage-annotate

coverage: ## Run tests with full coverage analysis
	$(UV_RUN) pytest --cov=mcpgateway --cov-report=term --cov-report=html --cov-branch --cov-fail-under=80

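# Cover is field 6 when branch = true (Name Stmts Miss Branch BrPart Cover Missing);
# strip the "%" so awk compares numbers instead of strings.
coverage-gaps: ## Show files below 70% coverage
	@$(UV_RUN) coverage report --sort=cover | awk '/^mcpgateway/ { pct = $$6; sub(/%/, "", pct); if (pct + 0 < 70) print }'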

htmlcov: ## Generate HTML coverage report
	$(UV_RUN) pytest --cov=mcpgateway --cov-report=html:$(COVERAGE_DIR) --cov-branch -q

coverage-annotate: ## Generate annotated source files
	$(UV_RUN) coverage annotate -d $(COVERAGE_DIR)/annotated

✅ Success Criteria

  • Threshold Enforcement: CI/CD fails builds with coverage < 80%
  • Current Coverage: Achieve 80%+ total coverage (from current 75%)
  • Branch Coverage: Achieve 85%+ branch coverage (from current 82%)
  • Developer Experience: make coverage-gaps shows actionable improvement areas
  • Reporting: HTML report generated and accessible at docs/docs/coverage/
  • Badge: Coverage badge displays current percentage in README
  • Documentation: Coverage workflow documented in tests/AGENTS.md
  • Per-File: Critical files meet their thresholds (services at 85%+, routers and middleware at 80%+)
  • Trend Tracking: Coverage data preserved for historical analysis
  • PR Blocking: PRs cannot merge if they regress coverage below threshold

🏁 Definition of Done

  • CI/CD threshold updated to 80% in pytest.yml
  • Coverage badge generated and committed automatically
  • HTML coverage report published to docs/docs/coverage/
  • Makefile targets created: coverage, coverage-gaps, htmlcov, coverage-annotate
  • Per-file coverage check script implemented
  • tests/AGENTS.md updated with coverage workflow
  • Critical service files at 85%+ coverage
  • Total coverage reaches 80%+ (verified in CI)
  • Branch coverage reaches 85%+ (verified in CI)
  • PR merge blocked on coverage failure (verified)
  • Coverage trend data preserved
  • All code quality checks pass (make verify)

📊 Current Coverage Analysis

Files Needing Attention (at or near the 70% per-file threshold)

| File | Coverage | Missing Lines |
| --- | --- | --- |
| mcpgateway/transports/local_transport.py | 52% | 196 |
| mcpgateway/utils/sso_bootstrap.py | 43% | 37 |
| mcpgateway/utils/psycopg3_optimizations.py | 57% | 42 |
| mcpgateway/services/chat_service.py | ~65% | TBD |
| mcpgateway/services/team_invitation_service.py | ~68% | TBD |
| mcpgateway/wrapper.py | 71% | 75 |

High-Value Test Targets

  1. Services Layer - Business logic with complex branching
  2. Error Handlers - Exception paths often untested
  3. Async Operations - Timeout and cancellation paths
  4. Validation Logic - Edge cases and malformed input
  5. Configuration Parsing - Environment variable handling

📝 Additional Notes

🔹 Incremental Approach: Start with 80% enforcement, then raise the threshold to 85% and later 90%

🔹 Exemption Policy: Files may be exempted from per-file requirements with documented justification in pyproject.toml

🔹 Coverage vs Quality: High coverage doesn't guarantee quality tests; focus on testing behavior, not just lines

🔹 Branch Coverage: Prioritize branch coverage for conditional logic; line coverage alone misses important paths

🔹 Performance: Coverage measurement adds ~10-15% overhead to test runtime; acceptable for quality assurance

🔹 Integration with #261: This epic is a stepping stone toward the 90% goal defined in issue #261


Labels: SHOULD, cicd, epic, python, testing
