[EPIC][RUNTIME]: Secure MCP runtime - Remote server deployment and catalog integration (Docker, Code Engine)

# Secure MCP Runtime - Remote Server Deployment & Catalog Integration

## Goal

Implement a **secure MCP Runtime** that deploys remote MCP servers on demand from a catalog and connects them to the gateway. The runtime provides sandboxed execution environments with configurable security guardrails. It accepts Docker images, Docker Compose files, and GitHub repositories as deployment sources, and initially targets Docker and IBM Code Engine as compute backends.

## Why Now?

As ContextForge adoption grows, operators need simplified server lifecycle management:

1. **Operational Complexity**: Deploying an MCP server currently requires manual container orchestration, network configuration, and gateway registration; each step adds friction and a chance of misconfiguration
2. **Catalog-Driven Deployment**: Users want to browse a catalog and deploy MCP servers with one click, similar to app stores or cloud marketplaces
3. **Security Isolation**: Untrusted MCP servers need sandboxed execution with resource limits, network policies, and capability restrictions
4. **Multi-Cloud Flexibility**: Different deployment targets (local Docker, IBM Code Engine, future: AWS Lambda, Azure Container Instances) require abstraction layers
5. **GitHub Integration**: Developers want to deploy MCP servers directly from GitHub repos with automatic builds and versioning
6. **Enterprise Controls**: Organizations require approval workflows, security scanning, and audit trails for deployed runtimes

By implementing the runtime as a first-class gateway capability, we enable self-service MCP server deployment while maintaining centralized security governance.

---

## Backend Capability Matrix

> **Important**: Docker and IBM Code Engine have different capabilities. The runtime service must handle these differences gracefully.

| Capability | Docker | IBM Code Engine | Notes |
|------------|--------|-----------------|-------|
| **Deployment Sources** |
| Docker image | ✅ | ✅ | |
| GitHub repo build | ✅ | ✅ | CE: no build cache, each build starts fresh |
| Docker Compose | ✅ | ❌ | CE: use separate app deployments |
| **Security Guardrails** |
| Network egress control | ✅ Configurable | ✅ On/Off | CE: project-level, not per-host filtering |
| Read-only filesystem | ✅ Configurable | ❌ N/A | CE: ephemeral FS tied to memory allocation |
| Custom capabilities | ✅ Configurable | ❌ Fixed | CE: enforces pod security standards |
| Resource limits | ✅ Configurable | ✅ Per-deployment | CE: set by ContextForge at deploy time |
| Seccomp/AppArmor | ✅ Configurable | ✅ Enforced | CE: automatically applied |
| **Scaling** |
| Min instances | ✅ | ✅ 0-250 | CE: 0 = scale-to-zero |
| Max instances | ✅ | ✅ Max 250 | |
| Cold start | N/A | ~17s avg | Set min=1 to avoid |
| **Networking** |
| Network isolation | Per container/network | Per CE project | |
| Private endpoints | ✅ | ✅ VPE | CE: `--visibility=private` |
| Internal DNS | Docker networks | `<app>.<project-id>.svc.cluster.local` | |
| **Monitoring** |
| Native metrics | Docker API | IBM Cloud Monitoring | |
| Native logs | Docker logs API | IBM Cloud Logs | |
| Health checks | ✅ | ✅ TCP/HTTP probes | |
| **Limits** |
| Max CPU | Host-limited | 12 vCPU | |
| Max memory | Host-limited | 48 GB | |
| App timeout | Unlimited | 600 seconds | |
| Apps per project | N/A | 40 | |

### Code Engine Specific Constraints

From [CE Pod Security Standards](https://github.ibm.com/coligo/readme/blob/main/architecture/projectisolation.md#pod-security-standards):

- **Fixed Security Context**: CE enforces multi-tenant isolation automatically; elevated capabilities (SYS_ADMIN, NET_ADMIN, etc.) cannot be granted
- **Ephemeral Storage**: Tied to memory allocation (e.g., 1GB memory = 1GB max ephemeral storage)
- **No Read-Only Filesystem Option**: Filesystem is always ephemeral and writable
- **Project-Level Network Isolation**: All apps within a project can communicate; isolation is between projects
- **Default Limits via Support Only**: Account/project default limits require IBM support case; ContextForge must set limits per-deployment
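Because limits have to be set on every Code Engine deployment, a pre-flight check can reject requests that exceed the platform limits from the matrix above. The following is a minimal sketch, not the actual implementation: the `CERequest` shape and the `validate_ce_request` name are hypothetical, and only the numeric limits come from this document.

```python
from dataclasses import dataclass

# Limits taken from the capability matrix above.
CE_MAX_CPU = 12.0
CE_MAX_MEMORY_GB = 48.0
CE_MAX_SCALE = 250


@dataclass
class CERequest:
    """Hypothetical per-deployment settings sent to Code Engine."""
    cpu: float
    memory_gb: float
    ephemeral_storage_gb: float
    min_scale: int = 0
    max_scale: int = 1


def validate_ce_request(req: CERequest) -> list[str]:
    """Return a list of human-readable violations (empty means valid)."""
    errors = []
    if not 0 < req.cpu <= CE_MAX_CPU:
        errors.append(f"cpu must be in (0, {CE_MAX_CPU}]")
    if not 0 < req.memory_gb <= CE_MAX_MEMORY_GB:
        errors.append(f"memory_gb must be in (0, {CE_MAX_MEMORY_GB}]")
    # Ephemeral storage is tied to the memory allocation.
    if req.ephemeral_storage_gb > req.memory_gb:
        errors.append("ephemeral storage cannot exceed memory")
    if not 0 <= req.min_scale <= req.max_scale <= CE_MAX_SCALE:
        errors.append(f"scale must satisfy 0 <= min <= max <= {CE_MAX_SCALE}")
    return errors
```

Returning a list of violations rather than raising on the first one lets the deployment dialog show every problem at once.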

---

## User Stories

<details>
<summary>US-1: Developer - Deploy MCP Server from Catalog</summary>

**As a** Developer
**I want** to deploy an MCP server from the catalog with one click
**So that** I can quickly add new capabilities without manual infrastructure setup

**Acceptance Criteria:**

```gherkin
Given I'm viewing the MCP server catalog at /admin/catalog
And I see a server entry "mcp-server-filesystem" with description and metadata
When I click "Deploy" on the server entry
Then I should see a deployment configuration dialog with:
 - Runtime selection (Docker, IBM Code Engine)
 - Resource limits (CPU, memory)
 - Environment variables
 - Security guardrails selection (filtered by backend capabilities)
When I confirm deployment
Then the runtime should:
 - Pull or build the server image
 - Start the container with configured limits
 - Register the server with the gateway
 - Show deployment status (pending → running → connected)
And the server should appear in my virtual server list
```

**Technical Requirements:**
- Catalog API with server metadata (name, description, image/repo, version)
- Deployment workflow orchestration
- Status tracking and progress reporting
- Automatic gateway registration on successful deployment
- **Backend capability filtering**: Only show options supported by selected backend

</details>

<details>
<summary>US-2: Platform Admin - Configure Runtime Backends</summary>

**As a** Platform Administrator
**I want** to configure available runtime backends (Docker, IBM Code Engine)
**So that** developers can deploy to approved infrastructure

**Acceptance Criteria:**

```gherkin
Given I'm in /admin/settings/runtimes
When I configure a Docker backend:
  docker:
    socket: /var/run/docker.sock
    network: mcp-network
    default_limits:
      cpu: "0.5"
      memory: "512m"
    allowed_registries:
      - docker.io
      - ghcr.io
Then Docker should be available as a deployment target

When I configure an IBM Code Engine backend:
  ibm_code_engine:
    api_key: ${IBM_CLOUD_API_KEY}
    region: us-south
    project_id: mcp-runtimes
    # NOTE: These are applied per-deployment by ContextForge
    # CE does not support self-service default limits at account level
    per_deployment_limits:
      cpu: "0.25"
      memory: "256m"
    observability:
      cloud_logs_instance_id: ${IBM_CLOUD_LOGS_ID}
      cloud_monitoring_instance_id: ${IBM_CLOUD_MONITORING_ID}
Then IBM Code Engine should be available as a deployment target
```

**Technical Requirements:**
- Backend configuration schema (Docker, IBM Code Engine)
- Credential management (API keys, service accounts)
- Registry allowlisting
- **Per-deployment resource limits** (CE: applied at deploy time, not account-level)
- Health checks for backend connectivity
- **IBM Cloud Observability integration** for CE backend
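
Registry allowlisting can be sketched as a check on the registry host of an image reference. The function names below are hypothetical. The host detection follows the usual Docker convention (a first path component counts as a registry only if it contains `.` or `:` or is `localhost`; otherwise the image is assumed to come from `docker.io`), which is an assumption about how ContextForge would parse references.

```python
def registry_of(image: str) -> str:
    """Return the registry host of an image reference.

    The first path component is treated as a registry only if it contains
    '.' or ':' or equals 'localhost'; otherwise docker.io is assumed.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def is_image_allowed(image: str, allowed_registries: list[str]) -> bool:
    """True if the image's registry appears in the allowlist."""
    return registry_of(image) in allowed_registries
```

With `allowed_registries: [docker.io, ghcr.io]`, an image such as `redis:7-alpine` resolves to `docker.io` and passes, while an image from any other host is rejected.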

**IBM Code Engine Notes:**
- Default limits cannot be configured at account/project level via self-service
- ContextForge must set limits on every deployment API call
- Multi-tenant security guardrails are enforced automatically by CE

</details>

<details>
<summary>US-3: Security Engineer - Configure Security Guardrails</summary>

**As a** Security Engineer
**I want** to define security guardrails for runtime deployments
**So that** untrusted MCP servers cannot escape their sandbox or abuse resources

**Acceptance Criteria:**

```gherkin
Given I define a security profile "restricted":
  profiles:
    restricted:
      network:
        egress_allowed: false        # Supported: Docker ✅, CE ✅
        ingress_ports: [8080]        # Supported: Docker ✅, CE ✅
        allowed_hosts: []            # Supported: Docker ✅, CE ❌ (on/off only)
      filesystem:
        read_only_root: true         # Supported: Docker ✅, CE ❌ (always ephemeral)
        allowed_mounts: ["/data"]    # Supported: Docker ✅, CE ❌
      capabilities:
        drop_all: true               # Supported: Docker ✅, CE ✅ (enforced)
        add: [NET_BIND_SERVICE]      # Supported: Docker ✅, CE ❌ (fixed by CE)
      resources:
        max_cpu: "0.5"               # Supported: Docker ✅, CE ✅
        max_memory: "256m"           # Supported: Docker ✅, CE ✅
        max_pids: 100                # Supported: Docker ✅, CE ❌
      seccomp: runtime/default       # Supported: Docker ✅, CE ✅ (enforced)
      apparmor: mcp-server           # Supported: Docker ✅, CE ✅ (enforced)
When a developer deploys with profile "restricted"
Then the container should have guardrails applied per backend capabilities
And unsupported guardrails should be logged/noted (not cause failure)
```

**Technical Requirements:**
- Security profile schema with network, filesystem, capability, resource sections
- **Backend-specific enforcement mapping** (see capability matrix)
- Profile validation with backend compatibility warnings
- Container security context generation
- Audit logging for profile applications
- Support for custom seccomp/AppArmor profiles (Docker only)

**IBM Code Engine Constraints:**
- ❌ **Cannot** run with read-only filesystem (ephemeral FS tied to memory size)
- ❌ **Cannot** grant elevated capabilities (SYS_ADMIN, NET_ADMIN, etc.)
- ❌ **Cannot** filter egress by specific hosts (only enable/disable)
- ✅ **Always enforces** seccomp and AppArmor profiles automatically
- ✅ **Always enforces** capability restrictions per [CE Pod Security Standards](https://github.ibm.com/coligo/readme/blob/main/architecture/projectisolation.md#pod-security-standards)
- ✅ **Supports** resource limits (CPU, memory) - set per deployment

</details>

<details>
<summary>US-4: Developer - Deploy from GitHub Repository</summary>

**As a** Developer
**I want** to deploy an MCP server directly from a GitHub repository
**So that** I can use custom or private MCP servers without pre-built images

**Acceptance Criteria:**

```gherkin
Given I have a GitHub repository "org/my-mcp-server" with a Dockerfile
When I create a catalog entry with source type "github":
  source:
    type: github
    repo: org/my-mcp-server
    branch: main
    dockerfile: Dockerfile
    build_args:
      PYTHON_VERSION: "3.11"
And I deploy the server
Then the runtime should:
 - Clone the repository (using configured credentials)
 - Build the Docker image
 - Tag with commit SHA and version
 - Push to configured registry (optional, recommended for CE)
 - Deploy the built image
 - Report build logs and status
```

**Technical Requirements:**
- GitHub repository cloning (HTTPS, SSH)
- Dockerfile build pipeline
- **Build cache**: Docker ✅, CE ❌ (each build starts from scratch)
- Image tagging strategy (commit SHA, semantic version)
- Build log streaming
- Optional push to registry for caching
- **CE Recommendation**: Push to ICR after build for faster subsequent deployments
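
One possible tagging strategy, shown as a sketch only: every build gets an immutable tag derived from the commit SHA, plus a version tag when one is known. The `build_image_tags` helper, the `sha-` prefix, the 12-character length, and the repository name in the usage note are illustrative assumptions.

```python
from typing import Optional


def build_image_tags(
    repository: str,
    commit_sha: str,
    version: Optional[str] = None,
    short_len: int = 12,
) -> list[str]:
    """Return image references tagged by short commit SHA and, if given, version."""
    if len(commit_sha) < short_len:
        raise ValueError("commit SHA is shorter than the requested tag length")
    tags = [f"{repository}:sha-{commit_sha[:short_len]}"]
    if version:
        # Accept both "v1.2.0" and "1.2.0" as input.
        tags.append(f"{repository}:{version.removeprefix('v')}")
    return tags
```

For example, `build_image_tags("registry.example.com/org/my-mcp-server", sha, "v1.2.0")` yields one SHA-based reference and one `1.2.0` reference for the same image.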

**IBM Code Engine Notes:**
- CE Builds support GitHub cloning and Dockerfile builds
- **No build cache** - each build starts fresh (consider pushing to registry)
- Build timeout: default 600s, max 3600s
- Max 100 build configs and 100 build runs per project

</details>

<details>
<summary>US-5: Operator - Monitor Runtime Health and Metrics</summary>

**As an** Operator
**I want** to monitor the health and resource usage of deployed runtimes
**So that** I can ensure service reliability and right-size resources

**Acceptance Criteria:**

```gherkin
Given I have deployed runtimes running
When I view /admin/runtimes
Then I should see for each runtime:
 - Status (running, stopped, error, pending)
 - Uptime and restart count
 - CPU and memory usage (current/limit)
 - Network I/O (Docker only)
 - Last health check result
 - Linked gateway registration

When I view /admin/runtimes/{id}/logs
Then I should see:
 - Container stdout/stderr logs
 - Structured log parsing (if JSON logs)
 - Log filtering by level and time range

When a runtime exceeds resource limits or crashes
Then I should receive:
 - Alert notification (if alerting configured)
 - Automatic restart (based on restart policy)
 - Event logged in audit trail
```

**Technical Requirements:**
- Runtime status aggregation
- Metrics collection (Prometheus format)
- Log aggregation and streaming
- Health check integration
- Alerting hooks
- Restart policy enforcement
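
The structured log parsing and level filtering from the acceptance criteria could look roughly like the sketch below. The `level` and `message` key names and the `filter_logs` helper are assumptions; real MCP servers may emit different fields.

```python
import json

LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40}


def filter_logs(lines, min_level="INFO"):
    """Yield (level, message) for lines at or above min_level.

    JSON object lines are read for 'level' and 'message' keys (assumed names);
    anything else is passed through as an INFO-level plain message.
    """
    threshold = LEVELS[min_level]
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            record = None
        if isinstance(record, dict):
            level = str(record.get("level", "INFO")).upper()
            message = str(record.get("message", ""))
        else:
            level, message = "INFO", line
        # Unknown level names are treated as INFO.
        if LEVELS.get(level, LEVELS["INFO"]) >= threshold:
            yield level, message
```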

**Backend-Specific Monitoring:**

| Feature | Docker | IBM Code Engine |
|---------|--------|-----------------|
| Metrics source | Docker API | IBM Cloud Monitoring |
| Logs source | Docker logs API | IBM Cloud Logs (via kubeapi) |
| Alerting | Custom webhooks | IBM Cloud Logs/Monitoring alerts |
| Health checks | Custom implementation | Native liveness/readiness probes |

**IBM Code Engine Integration:**
- Use [CE Metrics Collector](https://github.com/IBM/CodeEngine/tree/main/metrics-collector) pattern
- Configure IBM Cloud Logs instance for log aggregation
- Configure IBM Cloud Monitoring instance for metrics
- CE provides: status, uptime, restart count, health check results

</details>

<details>
<summary>US-6: Developer - Deploy with Docker Compose</summary>

> **Scope Change**: Docker Compose deployments are **only supported on the Docker backend**. IBM Code Engine does not support Docker Compose.

**As a** Developer
**I want** to deploy multi-container MCP servers using Docker Compose
**So that** I can run servers with dependencies (databases, caches, sidecars)

**Acceptance Criteria:**

```gherkin
Given I have a docker-compose.yml:
  services:
    mcp-server:
      image: my-mcp-server:latest
      ports:
        - "8080:8080"
      depends_on:
        - redis
    redis:
      image: redis:7-alpine
When I create a catalog entry with source type "compose":
  source:
    type: compose
    compose_file: docker-compose.yml
    main_service: mcp-server
And I select Docker as the runtime backend
And I deploy the server
Then the runtime should:
 - Parse and validate the compose file
 - Create an isolated network for the stack
 - Start services in dependency order
 - Wait for health checks on all services
 - Register the main_service with the gateway
 - Track all containers as a single runtime unit
```

**Technical Requirements (Docker Backend Only):**
- Docker Compose file parsing and validation
- Multi-container lifecycle management
- Dependency ordering and health checks
- Network isolation per deployment
- Unified logging across containers
- Cleanup of all containers on undeploy
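
Dependency ordering can be derived from `depends_on` with a topological sort. The sketch below uses the standard library `graphlib` module (Python 3.9+). The `start_order` name is hypothetical, and only the list form of `depends_on` is handled.

```python
from graphlib import CycleError, TopologicalSorter


def start_order(services: dict) -> list[str]:
    """Return service names so that every service follows its dependencies.

    `services` mirrors the `services:` mapping of a parsed Compose file.
    """
    graph = {name: set(spec.get("depends_on", [])) for name, spec in services.items()}
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"depends_on references undefined services: {sorted(unknown)}")
    try:
        # Each key maps to its predecessors, so dependencies are emitted first.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("circular depends_on chain") from exc
```

For the compose file in the acceptance criteria this starts `redis` before `mcp-server`; undefined or circular dependencies are rejected during validation instead of failing at container start.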

**IBM Code Engine Alternative:**

For multi-service architectures on CE, use separate catalog entries:

```yaml
# Instead of Compose, deploy as separate CE apps:
- name: "RAG Server"
  slug: "rag-server"
  source:
    type: docker
    image: ghcr.io/example/mcp-rag:latest
  environment:
    QDRANT_URL: "http://qdrant-server.${CE_PROJECT_ID}.svc.cluster.local:6333"
  dependencies:
    - qdrant-server  # Documentation only, not enforced

- name: "Qdrant Vector DB"
  slug: "qdrant-server"
  source:
    type: docker
    image: qdrant/qdrant:latest
  visibility: project  # Internal only

CE apps in the same project communicate via internal DNS: `<app-name>.<project-id>.svc.cluster.local`

</details>

<details>
<summary>US-7: Security Admin - Approve Runtime Deployments</summary>

**As a** Security Administrator
**I want** to review and approve runtime deployment requests
**So that** only vetted MCP servers run in production environments

**Acceptance Criteria:**

```gherkin
Given approval workflow is enabled:
  approval:
    enabled: true
    required_for:
      - source_type: github
      - registry_not_in: [docker.io/official]
    approvers:
      - security-team
When a developer requests deployment of an unapproved server
Then the deployment should:
 - Create a pending approval request
 - Notify approvers via configured channels
 - Show pending status in UI
When an approver reviews the request
Then they should see:
 - Server metadata and source
 - Security scan results (if available)
 - Requested resources and guardrails
When approved
Then the deployment should proceed automatically
When rejected
Then the developer should be notified with reason
```

**Technical Requirements:**
- Approval workflow engine
- Approval rules (source type, registry, resource thresholds)
- Approver groups and notification
- Approval audit trail
- Timeout and escalation policies
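
Rule evaluation for the `required_for` list might work as sketched below. The semantics are assumptions, not something this spec defines: a request needs approval when any rule matches, and `registry_not_in` entries are treated as image-reference prefixes. The `requires_approval` name is hypothetical.

```python
from typing import Optional


def requires_approval(source_type: str, image: Optional[str], rules: list[dict]) -> bool:
    """True if any rule in `required_for` matches the deployment request."""
    for rule in rules:
        if "source_type" in rule and rule["source_type"] == source_type:
            return True
        prefixes = rule.get("registry_not_in")
        if prefixes is not None and image is not None:
            # The image is trusted only if it sits under one of the listed prefixes.
            if not any(image.startswith(p.rstrip("/") + "/") for p in prefixes):
                return True
    return False
```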

</details>

<details>
<summary>US-8: Platform Admin - Manage MCP Server Catalog</summary>

**As a** Platform Administrator
**I want** to curate the MCP server catalog with approved servers
**So that** developers can discover and deploy pre-vetted solutions

**Acceptance Criteria:**

```gherkin
Given I'm in /admin/catalog/manage
When I add a catalog entry:
  name: "Filesystem Server"
  description: "MCP server for file operations"
  icon: "folder"
  category: "storage"
  source:
    type: docker
    image: docker.io/mcp/filesystem-server:1.0.0
  supported_backends: [docker, ibm_code_engine]  # NEW: Backend compatibility
  guardrails_profile: restricted
  documentation_url: "https://example.com/docs"
  tags: ["files", "storage", "official"]
Then the entry should appear in the catalog

When I mark an entry as "featured" or "deprecated"
Then it should be highlighted or hidden accordingly

When I import entries from a remote catalog:
  import:
    url: https://mcp-catalog.example.com/v1/servers
    filter:
      categories: [ai, data]
Then matching entries should be added to local catalog
```

**Technical Requirements:**
- Catalog CRUD API
- Catalog entry schema (metadata, source, guardrails, **supported_backends**)
- Categories and tagging
- Featured/deprecated flags
- Remote catalog federation
- Catalog versioning and sync

</details>

---

## Architecture

### System Overview

```mermaid
graph TB
 subgraph "ContextForge Gateway"
 API[Gateway API]
 CAT[Catalog Service]
 RTS[Runtime Service]
 REG[Server Registry]
 CAPS[Backend Capabilities]
 end

 subgraph "Runtime Backends"
 DOCK[Docker Backend]
 ICE[IBM Code Engine Backend]
 FUTURE[Future: Lambda, ACI...]
 end

 subgraph "Deployments"
 D1[MCP Server 1]
 D2[MCP Server 2]
 D3[MCP Server N]
 end

 subgraph "Sources"
 DREG[Docker Registry]
 GH[GitHub Repos]
 COMP[Compose Files]
 end

 subgraph "IBM Cloud Services"
 ICL[IBM Cloud Logs]
 ICM[IBM Cloud Monitoring]
 end

 API --> CAT
 API --> RTS
 CAT --> RTS
 RTS --> CAPS
 CAPS --> DOCK
 CAPS --> ICE
 RTS --> FUTURE

 DOCK --> D1
 DOCK --> D2
 ICE --> D3

 DREG --> DOCK
 DREG --> ICE
 GH --> DOCK
 GH --> ICE
 COMP --> DOCK

 D1 --> REG
 D2 --> REG
 D3 --> REG

 ICE --> ICL
 ICE --> ICM

 REG --> API
```

### Backend Abstraction

```python
# mcpgateway/runtimes/base.py
# DeployRequest, Deployment and DeploymentStatus are defined elsewhere;
# postponed annotation evaluation keeps this module importable without them.
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BackendCapabilities:
    """Declares what a backend supports."""
    supports_compose: bool = False
    supports_readonly_fs: bool = False
    supports_custom_capabilities: bool = False
    supports_capability_add: bool = False
    supports_build_cache: bool = False
    supports_egress_host_filtering: bool = False
    supports_pid_limits: bool = False
    network_isolation_level: str = "container"  # "container" or "project"
    observability_type: str = "native"  # "native" or "ibm_cloud"
    max_cpu: float = 0  # 0 = unlimited
    max_memory_gb: float = 0  # 0 = unlimited
    max_timeout_seconds: int = 0  # 0 = unlimited


class RuntimeBackend(ABC):
    @abstractmethod
    def get_capabilities(self) -> BackendCapabilities:
        """Return what this backend supports."""
        pass

    @abstractmethod
    async def deploy(self, request: DeployRequest) -> Deployment:
        pass

    @abstractmethod
    async def stop(self, deployment_id: str) -> None:
        pass

    @abstractmethod
    async def start(self, deployment_id: str) -> None:
        pass

    @abstractmethod
    async def remove(self, deployment_id: str) -> None:
        pass

    @abstractmethod
    async def logs(self, deployment_id: str, tail: int = 100) -> str:
        pass

    @abstractmethod
    async def status(self, deployment_id: str) -> DeploymentStatus:
        pass


# The lifecycle methods (deploy, stop, start, remove, logs, status) are
# omitted from the two backends below; only the capability declarations are shown.
class DockerBackend(RuntimeBackend):
    def get_capabilities(self) -> BackendCapabilities:
        return BackendCapabilities(
            supports_compose=True,
            supports_readonly_fs=True,
            supports_custom_capabilities=True,
            supports_capability_add=True,
            supports_build_cache=True,
            supports_egress_host_filtering=True,
            supports_pid_limits=True,
            network_isolation_level="container",
            observability_type="native",
        )


class IBMCodeEngineBackend(RuntimeBackend):
    def get_capabilities(self) -> BackendCapabilities:
        return BackendCapabilities(
            supports_compose=False,  # Not supported
            supports_readonly_fs=False,  # Ephemeral FS only
            supports_custom_capabilities=False,  # Fixed by CE
            supports_capability_add=False,  # Cannot elevate
            supports_build_cache=False,  # Each build fresh
            supports_egress_host_filtering=False,  # On/off only
            supports_pid_limits=False,  # Not exposed
            network_isolation_level="project",
            observability_type="ibm_cloud",
            max_cpu=12.0,
            max_memory_gb=48.0,
            max_timeout_seconds=600,
        )
```
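
The capability flags can drive the "apply what is supported, warn about the rest" behaviour required by US-3. The sketch below is one way to do it and makes several assumptions: `Caps` is a trimmed stand-in for `BackendCapabilities`, the profile is flattened into dotted keys, and `REQUIRED_FLAG` and `split_guardrails` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Caps:
    """Trimmed stand-in for BackendCapabilities (only the flags used here)."""
    supports_readonly_fs: bool = False
    supports_capability_add: bool = False
    supports_egress_host_filtering: bool = False
    supports_pid_limits: bool = False


# Guardrail setting -> capability flag required to enforce it.
REQUIRED_FLAG = {
    "filesystem.read_only_root": "supports_readonly_fs",
    "capabilities.add": "supports_capability_add",
    "network.allowed_hosts": "supports_egress_host_filtering",
    "resources.max_pids": "supports_pid_limits",
}


def split_guardrails(profile: dict, caps) -> tuple[dict, list[str]]:
    """Return (enforceable settings, warnings for settings the backend ignores)."""
    applied, warnings = {}, []
    for key, value in profile.items():
        flag = REQUIRED_FLAG.get(key)
        if flag is None or getattr(caps, flag):
            applied[key] = value
        elif value:  # falsy values (False, [], None) request nothing
            warnings.append(f"{key} is not supported by this backend and was skipped")
    return applied, warnings
```

The second element of the returned tuple maps naturally onto the `guardrails_warnings` column of `runtime_deployments`.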

### Deployment Flow

```mermaid
sequenceDiagram
 participant User as Developer
 participant API as Gateway API
 participant Cat as Catalog Service
 participant RTS as Runtime Service
 participant Caps as Capabilities Check
 participant Backend as Runtime Backend
 participant Container as MCP Container
 participant Reg as Server Registry

 User->>API: POST /runtimes/deploy
 API->>Cat: Get catalog entry
 Cat-->>API: Entry metadata, source, guardrails

 API->>RTS: Request deployment
 RTS->>Caps: Check backend capabilities
 Caps-->>RTS: Capabilities + warnings

 alt Unsupported features requested
 RTS-->>API: Warning: some guardrails not supported
 end

 RTS->>RTS: Validate security profile
 RTS->>RTS: Check approval (if required)

 alt GitHub source
 RTS->>Backend: Clone and build image
 Backend-->>RTS: Image ID
 end

 RTS->>Backend: Create container with guardrails
 Backend->>Container: Start container
 Container-->>Backend: Health check passed

 Backend-->>RTS: Container ID, endpoint

 RTS->>Reg: Register MCP server
 Reg-->>RTS: Server registered

 RTS-->>API: Deployment complete
 API-->>User: Runtime ID, status, endpoint
```
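
The flow above implies a small deployment state machine. The sketch below uses the status values from the `runtime_deployments` table; the set of allowed transitions is an assumption for illustration, and `ALLOWED` and `transition` are hypothetical names.

```python
# Allowed status transitions (assumed; adjust to the final design).
ALLOWED = {
    "pending": {"building", "starting", "error"},
    "building": {"starting", "error"},
    "starting": {"running", "error"},
    "running": {"stopped", "error"},
    "stopped": {"starting"},
    "error": {"pending"},
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Image-based deployments skip `building` and move from `pending` straight to `starting`, which is why both edges exist in this sketch.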

### Database Schema

```sql
-- Catalog entries
CREATE TABLE runtime_catalog (
 id UUID PRIMARY KEY,
 name VARCHAR(100) NOT NULL,
 slug VARCHAR(100) UNIQUE NOT NULL,
 description TEXT,
 icon VARCHAR(50),
 category VARCHAR(50),
 tags JSONB DEFAULT '[]',

 -- Source configuration
 source_type VARCHAR(20) NOT NULL, -- docker, github, compose
 source_config JSONB NOT NULL, -- image, repo, compose_file, etc.

 -- Backend compatibility (NEW)
 supported_backends JSONB DEFAULT '["docker", "ibm_code_engine"]',

 -- Security
 guardrails_profile VARCHAR(50),
 requires_approval BOOLEAN DEFAULT FALSE,

 -- Metadata
 documentation_url VARCHAR(500),
 version VARCHAR(50),
 is_featured BOOLEAN DEFAULT FALSE,
 is_deprecated BOOLEAN DEFAULT FALSE,

 created_by UUID REFERENCES users(id),
 created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
 updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- slug is already indexed through its UNIQUE constraint
CREATE INDEX idx_runtime_catalog_category ON runtime_catalog (category);

-- Runtime backends configuration
CREATE TABLE runtime_backends (
 id UUID PRIMARY KEY,
 name VARCHAR(100) UNIQUE NOT NULL,
 backend_type VARCHAR(50) NOT NULL, -- docker, ibm_code_engine
 config JSONB NOT NULL, -- socket, api_key, region, etc.
 default_limits JSONB, -- Applied per-deployment
 capabilities JSONB, -- Cached capabilities
 is_enabled BOOLEAN DEFAULT TRUE,
 health_status VARCHAR(20),
 last_health_check TIMESTAMP,
 created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Security guardrails profiles
CREATE TABLE runtime_guardrails (
 id UUID PRIMARY KEY,
 name VARCHAR(100) UNIQUE NOT NULL,
 description TEXT,

 -- Network policies
 network_egress_allowed BOOLEAN DEFAULT TRUE,
 network_allowed_hosts JSONB DEFAULT '[]', -- Docker only
 network_ingress_ports JSONB DEFAULT '[]',

 -- Filesystem policies
 filesystem_read_only_root BOOLEAN DEFAULT FALSE, -- Docker only
 filesystem_allowed_mounts JSONB DEFAULT '[]', -- Docker only

 -- Capabilities
 capabilities_drop_all BOOLEAN DEFAULT FALSE,
 capabilities_add JSONB DEFAULT '[]', -- Docker only

 -- Resource limits
 resource_max_cpu VARCHAR(20),
 resource_max_memory VARCHAR(20),
 resource_max_pids INTEGER, -- Docker only

 -- Advanced security
 seccomp_profile VARCHAR(100),
 apparmor_profile VARCHAR(100),

 -- Backend compatibility metadata (NEW)
 backend_compatibility JSONB DEFAULT '{}',

 created_by UUID REFERENCES users(id),
 created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Deployed runtimes
CREATE TABLE runtime_deployments (
 id UUID PRIMARY KEY,
 catalog_entry_id UUID REFERENCES runtime_catalog(id),
 backend_id UUID REFERENCES runtime_backends(id),
 guardrails_id UUID REFERENCES runtime_guardrails(id),

 -- Deployment state
 status VARCHAR(30) NOT NULL, -- pending, building, starting, running, stopped, error
 status_message TEXT,

 -- Container info
 container_id VARCHAR(100),
 container_name VARCHAR(100),
 endpoint VARCHAR(500),

 -- Runtime metadata
 image_used VARCHAR(500),
 build_logs TEXT,
 environment JSONB DEFAULT '{}',

 -- Guardrails applied (with backend-specific notes)
 guardrails_applied JSONB DEFAULT '{}',
 guardrails_warnings JSONB DEFAULT '[]', -- NEW: Unsupported features

 -- Metrics
 restart_count INTEGER DEFAULT 0,
 last_health_check TIMESTAMP,
 health_status VARCHAR(20),

 -- Gateway integration
 gateway_server_id UUID REFERENCES servers(id),

 -- Lifecycle
 deployed_by UUID REFERENCES users(id),
 deployed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
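 stopped_at TIMESTAMP
);

CREATE INDEX idx_runtime_deployments_status ON runtime_deployments (status);
CREATE INDEX idx_runtime_deployments_backend ON runtime_deployments (backend_id);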

-- Deployment approvals
CREATE TABLE runtime_approvals (
 id UUID PRIMARY KEY,
 deployment_request JSONB NOT NULL,
 catalog_entry_id UUID REFERENCES runtime_catalog(id),

 status VARCHAR(20) NOT NULL, -- pending, approved, rejected
 requested_by UUID REFERENCES users(id),
 requested_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

 reviewed_by UUID REFERENCES users(id),
 reviewed_at TIMESTAMP,
 review_notes TEXT,

 expires_at TIMESTAMP
);

CREATE INDEX idx_runtime_approvals_status ON runtime_approvals (status);

-- Runtime events/audit log
CREATE TABLE runtime_events (
 id UUID PRIMARY KEY,
 deployment_id UUID REFERENCES runtime_deployments(id),
 event_type VARCHAR(50) NOT NULL, -- created, started, stopped, restarted, error, health_check
 event_data JSONB,
 timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_runtime_events_deployment ON runtime_events (deployment_id);
CREATE INDEX idx_runtime_events_timestamp ON runtime_events (timestamp);
```

---

## Implementation Tasks

### Phase 1: Core Runtime Service

- [ ] **Runtime Service Foundation**
 - [ ] Create `mcpgateway/services/runtime_service.py`
 - [ ] Define `RuntimeService` class with deployment orchestration
 - [ ] Implement backend abstraction interface `RuntimeBackend`
 - [ ] Implement `BackendCapabilities` dataclass
 - [ ] Add deployment state machine (pending → building → starting → running)
 - [ ] Add graceful shutdown and cleanup handlers
 - [ ] **Add capability checking before deployment**

- [ ] **Database Models**
 - [ ] Create Alembic migration for runtime tables
 - [ ] Define SQLAlchemy models: `RuntimeCatalog`, `RuntimeBackend`, `RuntimeGuardrails`, `RuntimeDeployment`
 - [ ] Add `supported_backends` field to catalog entries
 - [ ] Add `guardrails_warnings` field to deployments
 - [ ] Add relationships and cascading deletes
 - [ ] Create repository layer with CRUD operations

- [ ] **Configuration Schema**
 - [ ] Define Pydantic models for backend configuration
 - [ ] Define Pydantic models for guardrails profiles
 - [ ] **Add backend capability validation**
 - [ ] Add environment variable support for secrets
 - [ ] Validate configuration on startup

### Phase 2: Docker Backend

- [ ] **Docker Backend Implementation**
 - [ ] Create `mcpgateway/runtimes/docker_backend.py`
 - [ ] Implement `get_capabilities()` returning full Docker capabilities
 - [ ] Implement container lifecycle: create, start, stop, remove
 - [ ] Add Docker socket connection with health checks
 - [ ] Implement image pull with progress tracking
 - [ ] Add container log streaming
 - [ ] Implement health check polling

- [ ] **Security Context Generation**
 - [ ] Map guardrails profile to Docker security options
 - [ ] Implement network policy (--network) and capability policy (--cap-drop, --cap-add)
 - [ ] Implement filesystem policy (--read-only, --mount)
 - [ ] Implement resource limits (--cpus, --memory, --pids-limit)
 - [ ] Add seccomp and AppArmor profile support

- [ ] **Docker Compose Support**
 - [ ] Parse and validate docker-compose.yml files
 - [ ] Implement multi-container deployment
 - [ ] Handle service dependencies and health checks
 - [ ] Create isolated networks per deployment
 - [ ] Track all containers as single deployment unit
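
For the Security Context Generation task, the mapping from a guardrails profile to `docker run` flags could start from a sketch like this one. It assumes the nested profile shape from US-3, and the `docker_run_args` name is hypothetical. Network policy and mounts are deliberately left out because they depend on how the gateway reaches the container.

```python
def docker_run_args(profile: dict) -> list[str]:
    """Translate a nested guardrails profile into `docker run` flags."""
    args = []
    fs = profile.get("filesystem", {})
    caps = profile.get("capabilities", {})
    res = profile.get("resources", {})
    if fs.get("read_only_root"):
        args.append("--read-only")
    if caps.get("drop_all"):
        args.append("--cap-drop=ALL")
    args += [f"--cap-add={cap}" for cap in caps.get("add", [])]
    if "max_cpu" in res:
        args.append(f"--cpus={res['max_cpu']}")
    if "max_memory" in res:
        args.append(f"--memory={res['max_memory']}")
    if "max_pids" in res:
        args.append(f"--pids-limit={res['max_pids']}")
    seccomp = profile.get("seccomp")
    # Docker applies its default seccomp profile on its own; only pass custom ones.
    if seccomp and seccomp != "runtime/default":
        args.append(f"--security-opt=seccomp={seccomp}")
    if profile.get("apparmor"):
        args.append(f"--security-opt=apparmor={profile['apparmor']}")
    return args
```

Generating a flag list keeps the mapping easy to unit test; the same decisions can later be expressed through the Docker API instead of the CLI.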

### Phase 3: GitHub Source Support

- [ ] **Repository Cloning**
 - [ ] Implement HTTPS clone with token auth
 - [ ] Implement SSH clone with key auth
 - [ ] Support branch, tag, and commit ref
 - [ ] Clone to temporary build context

- [ ] **Image Building**
 - [ ] Implement Dockerfile build with Docker API
 - [ ] Support build arguments and multi-stage builds
 - [ ] Stream build logs to client
 - [ ] Tag images with commit SHA and version
 - [ ] Implement build caching (Docker only)
 - [ ] **Add note: CE builds have no cache**

- [ ] **Build Pipeline**
 - [ ] Clone → Build → Tag → Deploy workflow
 - [ ] Optional push to configured registry
 - [ ] **Recommend registry push for CE deployments**
 - [ ] Cleanup of build context and intermediate images
 - [ ] Build timeout and cancellation

### Phase 4: IBM Code Engine Backend

- [ ] **Code Engine Client**
 - [ ] Create `mcpgateway/runtimes/ibm_code_engine_backend.py`
 - [ ] Implement `get_capabilities()` with CE constraints
 - [ ] Implement IBM Cloud API authentication
 - [ ] Create project/application management
 - [ ] Implement deployment from registry

- [ ] **Application Lifecycle**
 - [ ] Create application with resource limits (per-deployment)
 - [ ] Configure min-scale (0 for scale-to-zero, 1 to avoid cold starts)
 - [ ] Get application status and endpoint
 - [ ] Update and redeploy applications
 - [ ] Delete applications and cleanup

- [ ] **Guardrails Mapping**
 - [ ] Map supported guardrails to CE configuration
 - [ ] **Log warnings for unsupported guardrails** (read-only FS, custom caps, etc.)
 - [ ] Set resource limits via CE API
 - [ ] Configure network visibility (public/private/project)

- [ ] **Health Checks**
 - [ ] Configure liveness probes (TCP or HTTP)
 - [ ] Configure readiness probes
 - [ ] Set appropriate timeouts and intervals

- [ ] **Observability Integration**
 - [ ] Integrate with IBM Cloud Logs for log retrieval
 - [ ] Integrate with IBM Cloud Monitoring for metrics
 - [ ] Implement status polling via CE API

### Phase 5: Catalog API

- [ ] **Catalog CRUD Endpoints**
 - [ ] `GET /catalog` - List catalog entries with filtering
 - [ ] `GET /catalog/{slug}` - Get catalog entry details
 - [ ] `POST /catalog` - Create catalog entry (admin)
 - [ ] `PUT /catalog/{slug}` - Update catalog entry (admin)
 - [ ] `DELETE /catalog/{slug}` - Remove catalog entry (admin)

- [ ] **Catalog Features**
 - [ ] Category and tag filtering
 - [ ] Search by name/description
 - [ ] **Filter by supported_backends**
 - [ ] Featured and deprecated flags
 - [ ] Version management
 - [ ] Remote catalog import/sync
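
The catalog filters listed above combine with AND semantics in the following sketch. It works on plain dictionaries that follow the catalog entry schema; `filter_catalog` is a hypothetical helper, and a real implementation would most likely push these filters into the database query.

```python
def filter_catalog(entries, *, category=None, tags=(), backend=None, query=None):
    """Return entries matching every filter that was provided."""
    q = query.lower() if query else None
    result = []
    for entry in entries:
        if category and entry.get("category") != category:
            continue
        # Every requested tag must be present on the entry.
        if not set(tags) <= set(entry.get("tags", [])):
            continue
        if backend and backend not in entry.get("supported_backends", []):
            continue
        text = (entry.get("name", "") + " " + entry.get("description", "")).lower()
        if q and q not in text:
            continue
        result.append(entry)
    return result
```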

### Phase 6: Runtime Deployment API

- [ ] **Deployment Endpoints**
 - [ ] `POST /runtimes/deploy` - Deploy from catalog
 - [ ] `GET /runtimes` - List deployments
 - [ ] `GET /runtimes/{id}` - Get deployment status
 - [ ] `POST /runtimes/{id}/stop` - Stop deployment
 - [ ] `POST /runtimes/{id}/start` - Start stopped deployment
 - [ ] `DELETE /runtimes/{id}` - Remove deployment
 - [ ] `GET /runtimes/{id}/logs` - Get container logs

- [ ] **Status and Monitoring**
 - [ ] Real-time status updates (SSE)
 - [ ] Metrics endpoint (Prometheus format)
 - [ ] Health check aggregation
 - [ ] Event history
 - [ ] **Include guardrails_warnings in response**
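
A metrics endpoint in Prometheus text exposition format can be produced without extra dependencies, as in this sketch. The metric name `runtime_restarts_total`, its labels, and the `render_metrics` helper are illustrative assumptions, and label values are not escaped here.

```python
def render_metrics(deployments: list[dict]) -> str:
    """Render restart counts in the Prometheus text exposition format."""
    lines = [
        "# HELP runtime_restarts_total Restarts recorded per deployment.",
        "# TYPE runtime_restarts_total counter",
    ]
    for d in deployments:
        labels = f'deployment_id="{d["id"]}",backend="{d["backend"]}"'
        lines.append(f"runtime_restarts_total{{{labels}}} {d['restart_count']}")
    # The exposition format expects a trailing newline.
    return "\n".join(lines) + "\n"
```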

### Phase 7: Security Guardrails Management

- [ ] **Guardrails API**
  - [ ] `GET /guardrails` - List profiles
  - [ ] `GET /guardrails/{name}` - Get profile details
  - [ ] `GET /guardrails/{name}/compatibility` - **Check backend compatibility**
  - [ ] `POST /guardrails` - Create profile (admin)
  - [ ] `PUT /guardrails/{name}` - Update profile (admin)
  - [ ] `DELETE /guardrails/{name}` - Remove profile (admin)

- [ ] **Preset Profiles**
  - [ ] `unrestricted` - Full access (development, Docker only recommended)
  - [ ] `standard` - Balanced security (works on both backends)
  - [ ] `restricted` - Minimal capabilities (works on both backends)
  - [ ] `airgapped` - No network egress (works on both backends)
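
A compatibility check for a profile could compare its settings against the capability flags of the target backend and return one warning per setting the backend would ignore. This is a sketch under assumptions: `BackendCapabilities` and `guardrails_warnings` are hypothetical names, and `supports_host_egress_filter` is an invented flag that does not appear in the configuration example (it stands in for the "on/off only" egress behavior noted for CE).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendCapabilities:
    """Hypothetical capability flags; names follow the `capabilities` config keys."""
    supports_compose: bool
    supports_readonly_fs: bool
    supports_custom_capabilities: bool
    supports_host_egress_filter: bool  # assumption: not a key in the config example


DOCKER = BackendCapabilities(True, True, True, True)
CODE_ENGINE = BackendCapabilities(False, False, False, False)


def guardrails_warnings(profile: dict, caps: BackendCapabilities) -> list[str]:
    """List the profile settings that the target backend would ignore."""
    warnings = []
    if profile.get("filesystem", {}).get("read_only_root") and not caps.supports_readonly_fs:
        warnings.append("filesystem.read_only_root is ignored on this backend")
    if profile.get("capabilities", {}).get("add") and not caps.supports_custom_capabilities:
        warnings.append("capabilities.add is ignored on this backend")
    hosts = profile.get("network", {}).get("allowed_hosts") or []
    if hosts and hosts != ["*"] and not caps.supports_host_egress_filter:
        warnings.append("network.allowed_hosts is ignored (egress is on/off only)")
    return warnings


# Trimmed-down version of the `standard` preset from the configuration example.
standard = {
    "network": {"egress_allowed": True, "allowed_hosts": ["api.openai.com"]},
    "filesystem": {"read_only_root": True},
    "capabilities": {"drop_all": True, "add": ["NET_BIND_SERVICE"]},
}
docker_warnings = guardrails_warnings(standard, DOCKER)
ce_warnings = guardrails_warnings(standard, CODE_ENGINE)
```

The same list could back both the `/guardrails/{name}/compatibility` endpoint and the `guardrails_warnings` field in deployment responses, so the two never disagree.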

### Phase 8: Approval Workflow

- [ ] **Approval Engine**
  - [ ] Define approval rules schema
  - [ ] Evaluate rules on deployment request
  - [ ] Create pending approval records
  - [ ] Notify approvers (email, webhook)

- [ ] **Approval API**
  - [ ] `GET /approvals` - List pending approvals
  - [ ] `GET /approvals/{id}` - Get approval details
  - [ ] `POST /approvals/{id}/approve` - Approve request
  - [ ] `POST /approvals/{id}/reject` - Reject request
  - [ ] Expiration and escalation handling
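
Rule evaluation could be as small as the sketch below, which uses the rule shapes from the `approval.required_for` configuration example. The semantics are assumptions rather than a settled design: any single matching rule triggers approval, and `registry_not_in` is treated as a list of trusted image prefixes. `matching_rule` and the request dictionary layout are hypothetical.

```python
from typing import Optional


def matching_rule(request: dict, required_for: list[dict]) -> Optional[dict]:
    """Return the first rule that triggers approval, or None if none match."""
    image = request.get("image") or ""
    for rule in required_for:
        if "source_type" in rule and request.get("source_type") == rule["source_type"]:
            return rule
        if ("guardrails_profile" in rule
                and request.get("guardrails_profile") == rule["guardrails_profile"]):
            return rule
        if "registry_not_in" in rule and image:
            # Assumption: an image is trusted when it sits under a listed prefix.
            trusted = any(image.startswith(prefix + "/") for prefix in rule["registry_not_in"])
            if not trusted:
                return rule
    return None


RULES = [
    {"source_type": "github"},
    {"registry_not_in": ["docker.io/library", "docker.io/mcp"]},
    {"guardrails_profile": "unrestricted"},
]

trusted_image = {
    "source_type": "docker",
    "image": "docker.io/mcp/filesystem-server:1.2.0",
    "guardrails_profile": "restricted",
}
unknown_registry = {
    "source_type": "docker",
    "image": "ghcr.io/example/mcp-rag:latest",
    "guardrails_profile": "standard",
}
```

Returning the matching rule, rather than a boolean, gives the pending approval record something concrete to show approvers as the reason for the hold.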

### Phase 9: Admin UI

- [ ] **Catalog UI**
  - [ ] Page: `/admin/catalog` - Browse catalog with cards
  - [ ] Deploy button with configuration dialog
  - [ ] **Show backend compatibility badges**
  - [ ] Category filtering and search
  - [ ] Page: `/admin/catalog/manage` - CRUD for admins

- [ ] **Runtimes UI**
  - [ ] Page: `/admin/runtimes` - List deployments with status
  - [ ] Runtime detail view with metrics
  - [ ] **Show guardrails warnings if any**
  - [ ] Log viewer with filtering
  - [ ] Start/stop/delete controls

- [ ] **Settings UI**
  - [ ] Page: `/admin/settings/runtimes` - Backend configuration
  - [ ] Page: `/admin/settings/guardrails` - Profile management
  - [ ] **Show backend compatibility matrix**
  - [ ] Page: `/admin/approvals` - Approval queue

### Phase 10: Gateway Integration

- [ ] **Automatic Registration**
  - [ ] On successful deployment, register with gateway
  - [ ] Create gateway entry with transport (SSE/WebSocket)
  - [ ] Configure authentication if required
  - [ ] Link deployment to gateway server record

- [ ] **Lifecycle Sync**
  - [ ] Update gateway on runtime stop/start
  - [ ] Remove gateway entry on deployment delete
  - [ ] Health check propagation

### Phase 11: Testing

- [ ] **Unit Tests**
  - [ ] Runtime service orchestration
  - [ ] **Backend capability checking**
  - [ ] Docker backend operations
  - [ ] CE backend operations
  - [ ] Guardrails profile validation
  - [ ] Catalog CRUD operations
  - [ ] Approval workflow logic

- [ ] **Integration Tests**
  - [ ] End-to-end deployment from catalog (Docker)
  - [ ] End-to-end deployment from catalog (CE - if available)
  - [ ] GitHub source build and deploy
  - [ ] Docker Compose multi-container (Docker only)
  - [ ] Gateway registration flow
  - [ ] Approval workflow integration

- [ ] **Security Tests**
  - [ ] Guardrails enforcement verification (Docker)
  - [ ] Container escape prevention
  - [ ] Network policy enforcement
  - [ ] Resource limit enforcement

### Phase 12: Documentation

- [ ] **User Documentation**
  - [ ] Catalog usage guide
  - [ ] Deployment configuration reference
  - [ ] **Backend comparison guide**
  - [ ] Guardrails profile guide
  - [ ] Troubleshooting guide

- [ ] **Admin Documentation**
  - [ ] Backend configuration guide
  - [ ] **IBM Code Engine setup guide**
  - [ ] Security best practices
  - [ ] Approval workflow setup
  - [ ] Monitoring and alerting

- [ ] **API Documentation**
  - [ ] OpenAPI spec for all endpoints
  - [ ] Example API calls
  - [ ] SDK examples

---

## Configuration Example

### Environment Variables

```bash
# Runtime feature flag
MCPGATEWAY_RUNTIME_ENABLED=true

# Docker backend
RUNTIME_DOCKER_SOCKET=/var/run/docker.sock
RUNTIME_DOCKER_NETWORK=mcp-network
RUNTIME_DOCKER_REGISTRY_MIRROR=https://mirror.example.com

# IBM Code Engine backend
RUNTIME_IBM_API_KEY=${IBM_CLOUD_API_KEY}
RUNTIME_IBM_REGION=us-south
RUNTIME_IBM_PROJECT_ID=mcp-runtimes
# Observability (required for CE monitoring)
RUNTIME_IBM_CLOUD_LOGS_ID=${IBM_CLOUD_LOGS_INSTANCE_ID}
RUNTIME_IBM_CLOUD_MONITORING_ID=${IBM_CLOUD_MONITORING_INSTANCE_ID}

# GitHub integration
RUNTIME_GITHUB_TOKEN=${GITHUB_TOKEN}
RUNTIME_GITHUB_SSH_KEY_PATH=/secrets/github-ssh-key

# Approval workflow
RUNTIME_APPROVAL_ENABLED=true
RUNTIME_APPROVAL_WEBHOOK_URL=https://slack.example.com/webhook
```
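
For illustration, the variables above could be read into a typed settings object at startup. This is a standard-library sketch only: `RuntimeEnv` and `load_runtime_env` are hypothetical names, the fallback defaults are assumptions taken from the example values, and the gateway may well use a different settings mechanism.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy strings; unset variables fall back to the default."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RuntimeEnv:
    """Hypothetical subset of the runtime settings shown above."""
    enabled: bool
    docker_socket: str
    docker_network: str
    ibm_api_key: Optional[str]
    ibm_region: str
    approval_enabled: bool


def load_runtime_env(env: Mapping[str, str] = os.environ) -> RuntimeEnv:
    """Build settings from an environment mapping (defaults are assumptions)."""
    return RuntimeEnv(
        enabled=_as_bool(env.get("MCPGATEWAY_RUNTIME_ENABLED")),
        docker_socket=env.get("RUNTIME_DOCKER_SOCKET", "/var/run/docker.sock"),
        docker_network=env.get("RUNTIME_DOCKER_NETWORK", "mcp-network"),
        ibm_api_key=env.get("RUNTIME_IBM_API_KEY") or None,
        ibm_region=env.get("RUNTIME_IBM_REGION", "us-south"),
        approval_enabled=_as_bool(env.get("RUNTIME_APPROVAL_ENABLED")),
    )


settings = load_runtime_env({"MCPGATEWAY_RUNTIME_ENABLED": "true"})
```

Passing the mapping explicitly keeps the loader testable without touching the real process environment.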

### Runtime Configuration (config.yaml)

```yaml
runtime:
  enabled: true
  default_backend: docker

  backends:
    docker:
      enabled: true
      socket: /var/run/docker.sock
      network: mcp-network
      allowed_registries:
        - docker.io
        - ghcr.io
        - registry.example.com
      default_limits:
        cpu: "0.5"
        memory: "512m"
        pids: 100
      cleanup_orphans: true
      cleanup_interval_hours: 24
      # Full capabilities
      capabilities:
        supports_compose: true
        supports_readonly_fs: true
        supports_custom_capabilities: true

    ibm_code_engine:
      enabled: true
      api_key: ${RUNTIME_IBM_API_KEY}
      region: us-south
      project_id: mcp-runtimes
      # Applied per-deployment (CE doesn't support account-level defaults)
      per_deployment_limits:
        cpu: "0.25"
        memory: "256m"
      # Scaling configuration
      default_min_scale: 0  # Scale to zero (set to 1 to avoid cold starts)
      default_max_scale: 10
      # IBM Cloud Observability integration
      observability:
        cloud_logs_instance_id: ${RUNTIME_IBM_CLOUD_LOGS_ID}
        cloud_monitoring_instance_id: ${RUNTIME_IBM_CLOUD_MONITORING_ID}
      # Limited capabilities (enforced by CE)
      capabilities:
        supports_compose: false
        supports_readonly_fs: false
        supports_custom_capabilities: false
        max_cpu: 12.0
        max_memory_gb: 48
        max_timeout_seconds: 600

  guardrails:
    default_profile: standard

    profiles:
      unrestricted:
        description: "Full access for development (Docker recommended)"
        recommended_backends: [docker]  # Warning if used with CE
        network:
          egress_allowed: true
          allowed_hosts: ["*"]
        filesystem:
          read_only_root: false
        capabilities:
          drop_all: false
        resources:
          max_cpu: "2"
          max_memory: "2g"

      standard:
        description: "Balanced security for production (works on all backends)"
        recommended_backends: [docker, ibm_code_engine]
        network:
          egress_allowed: true
          # allowed_hosts ignored on CE (on/off only)
          allowed_hosts:
            - "*.googleapis.com"
            - "api.openai.com"
            - "api.anthropic.com"
        filesystem:
          read_only_root: true  # Ignored on CE
          allowed_mounts: ["/data", "/cache"]  # Ignored on CE
        capabilities:
          drop_all: true
          add: [NET_BIND_SERVICE]  # Ignored on CE
        resources:
          max_cpu: "0.5"
          max_memory: "512m"
          max_pids: 100  # Ignored on CE

      restricted:
        description: "Minimal capabilities for untrusted servers"
        recommended_backends: [docker, ibm_code_engine]
        network:
          egress_allowed: false
          ingress_ports: [8080]
        filesystem:
          read_only_root: true  # Ignored on CE
          allowed_mounts: []
        capabilities:
          drop_all: true
          add: []
        resources:
          max_cpu: "0.25"
          max_memory: "256m"
          max_pids: 50  # Ignored on CE
        seccomp: runtime/default
        apparmor: mcp-restricted  # Ignored on CE (uses CE default)

      airgapped:
        description: "No network access"
        recommended_backends: [docker, ibm_code_engine]
        network:
          egress_allowed: false
          ingress_ports: []
          allowed_hosts: []
        filesystem:
          read_only_root: true  # Ignored on CE
        capabilities:
          drop_all: true
        resources:
          max_cpu: "0.25"
          max_memory: "128m"

  approval:
    enabled: true
    required_for:
      - source_type: github
      - registry_not_in: [docker.io/library, docker.io/mcp]
      - guardrails_profile: unrestricted
    approvers:
      - group: security-team
      - user: admin@example.com
    timeout_hours: 48
    notification:
      webhook_url: ${RUNTIME_APPROVAL_WEBHOOK_URL}
      email_enabled: true

  catalog:
    allow_custom_entries: true
    remote_catalogs:
      - url: https://mcp-catalog.example.com/v1/servers
        sync_interval_hours: 24
        filter:
          categories: [official, verified]
```

### Catalog Entry Examples

```yaml
# Docker image source (works on both backends)
- name: "Filesystem Server"
  slug: "mcp-filesystem"
  description: "MCP server for secure file operations"
  category: "storage"
  tags: ["files", "storage", "official"]
  supported_backends: [docker, ibm_code_engine]
  source:
    type: docker
    image: docker.io/mcp/filesystem-server:1.2.0
  guardrails_profile: restricted
  requires_approval: false
  documentation_url: "https://docs.mcp.example.com/filesystem"

# GitHub repository source (works on both, but no cache on CE)
- name: "Custom Analytics Server"
  slug: "custom-analytics"
  description: "Internal analytics MCP server"
  category: "analytics"
  tags: ["internal", "analytics"]
  supported_backends: [docker, ibm_code_engine]
  source:
    type: github
    repo: org/mcp-analytics-server
    branch: main
    dockerfile: Dockerfile
    build_args:
      PYTHON_VERSION: "3.11"
    # Recommended: push to registry for faster CE deployments
    push_to_registry: true
    registry: us.icr.io/mcp-images
  guardrails_profile: standard
  requires_approval: true

# Docker Compose source (Docker only)
- name: "RAG Server with Vector DB"
  slug: "rag-server"
  description: "RAG server with embedded Qdrant vector database"
  category: "ai"
  tags: ["rag", "vectors", "ai"]
  supported_backends: [docker]  # Compose not supported on CE
  source:
    type: compose
    compose_file: |
      services:
        mcp-rag:
          image: ghcr.io/example/mcp-rag:latest
          ports:
            - "8080:8080"
          depends_on:
            - qdrant
          environment:
            - QDRANT_URL=http://qdrant:6333
        qdrant:
          image: qdrant/qdrant:latest
    main_service: mcp-rag
  guardrails_profile: standard
  requires_approval: true
```
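
Entries like these could be validated before they are stored, for example to reject a Compose source that claims Code Engine support. The sketch below assumes the three source types shown above and the required keys visible in the examples; `validate_entry` and `REQUIRED_SOURCE_KEYS` are hypothetical, not part of an existing schema.

```python
# Keys each source type is assumed to require, based on the examples above.
REQUIRED_SOURCE_KEYS = {
    "docker": ["image"],
    "github": ["repo"],
    "compose": ["compose_file", "main_service"],
}


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry looks valid."""
    source = entry.get("source") or {}
    source_type = source.get("type")
    if source_type not in REQUIRED_SOURCE_KEYS:
        return [f"unknown source type: {source_type!r}"]

    problems = []
    for key in REQUIRED_SOURCE_KEYS[source_type]:
        if key not in source:
            problems.append(f"source.{key} is required for type {source_type}")

    backends = entry.get("supported_backends") or []
    if not backends:
        problems.append("supported_backends must list at least one backend")
    # Compose is Docker only, per the capability matrix.
    if source_type == "compose" and "ibm_code_engine" in backends:
        problems.append("compose sources are Docker only; remove ibm_code_engine")
    return problems


good_entry = {
    "slug": "mcp-filesystem",
    "supported_backends": ["docker", "ibm_code_engine"],
    "source": {"type": "docker", "image": "docker.io/mcp/filesystem-server:1.2.0"},
}
bad_entry = {
    "slug": "rag-server",
    "supported_backends": ["docker", "ibm_code_engine"],
    "source": {"type": "compose", "compose_file": "services: {}"},
}
```

Here `bad_entry` would be reported twice: once for the missing `main_service` and once for listing `ibm_code_engine` with a Compose source.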

---

## IBM Code Engine Reference

### Key Limits

| Resource | Limit | Notes |
|----------|-------|-------|
| Projects per region | 20 | Can be increased via support |
| Apps per project | 40 | |
| CPU per app | 12 vCPU max | |
| Memory per app | 48 GB max | |
| App timeout | 600 seconds | |
| Max instances | 250 per app | |
| Build timeout | 3600 seconds | |
| Secrets per project | 100 | |
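
The per-app limits in this table could be enforced before a request ever reaches the CE API. The sketch below checks only the values listed above and nothing else about what Code Engine accepts; `CERequest` and `validate_ce_request` are hypothetical names.

```python
from dataclasses import dataclass

# Limits copied from the Key Limits table above.
CE_MAX_CPU = 12.0
CE_MAX_MEMORY_GB = 48.0
CE_MAX_TIMEOUT_SECONDS = 600
CE_MAX_INSTANCES = 250


@dataclass
class CERequest:
    """Hypothetical resource request for a single CE app."""
    cpu: float
    memory_gb: float
    timeout_seconds: int
    min_scale: int
    max_scale: int


def validate_ce_request(req: CERequest) -> list[str]:
    """Return one error per limit the request exceeds."""
    errors = []
    if not 0 < req.cpu <= CE_MAX_CPU:
        errors.append(f"cpu must be between 0 and {CE_MAX_CPU} vCPU")
    if not 0 < req.memory_gb <= CE_MAX_MEMORY_GB:
        errors.append(f"memory must be between 0 and {CE_MAX_MEMORY_GB} GB")
    if not 0 < req.timeout_seconds <= CE_MAX_TIMEOUT_SECONDS:
        errors.append(f"timeout must be between 1 and {CE_MAX_TIMEOUT_SECONDS} seconds")
    if not 0 <= req.min_scale <= req.max_scale <= CE_MAX_INSTANCES:
        errors.append(f"scale must satisfy 0 <= min <= max <= {CE_MAX_INSTANCES}")
    return errors


ok = validate_ce_request(CERequest(0.25, 0.25, 300, 0, 10))
too_big = validate_ce_request(CERequest(16, 64, 900, 5, 300))
```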

### Scaling Behavior

| Setting | Default | Impact |
|---------|---------|--------|
| `min-scale: 0` | Yes | Scale to zero, ~17s cold start |
| `min-scale: 1` | No | Always warm, ~0.2s response |
| Scale-down window | 60s | Sliding window for scaling decisions |
| Concurrency | 100 | Requests per instance before scaling |

### Pricing (as of 2025)

| Resource | Cost |
|----------|------|
| vCPU | $0.00003431/vCPU-second |
| Memory | $0.00000356/GB-second |
| HTTP requests | $0.538/million |
| Scale to zero | No charge |
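
As a worked example of these rates, the sketch below estimates the cost of a running instance. It assumes the instance is billed for every second it is active and ignores any free tier or other charges; `estimate_cost` is a hypothetical helper, and the rates are the ones in the table above.

```python
# Rates copied from the pricing table above (USD).
VCPU_PER_SECOND = 0.00003431
GB_PER_SECOND = 0.00000356
PER_MILLION_REQUESTS = 0.538


def estimate_cost(vcpu: float, memory_gb: float, active_seconds: float,
                  requests: int = 0) -> float:
    """Estimate cost in USD for the given resources, active time, and requests."""
    compute = active_seconds * (vcpu * VCPU_PER_SECOND + memory_gb * GB_PER_SECOND)
    return compute + requests / 1_000_000 * PER_MILLION_REQUESTS


# An always-warm instance (min-scale: 1) with 0.25 vCPU and 0.25 GB for 30 days.
thirty_days = 30 * 24 * 3600
always_warm = estimate_cost(0.25, 0.25, thirty_days)
```

Under these assumptions `always_warm` comes to roughly $24.54 for the 30 days, before request charges. That is the price of avoiding cold starts with `min-scale: 1`, compared with no charge while scaled to zero.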

---

## Open Questions for CE Architect

The following questions could not be definitively answered from public documentation:

1. **Egress filtering granularity**: Can egress be restricted to specific domains, or only on/off?
2. **Static outbound IPs**: Do CE apps get predictable outbound IPs for allowlisting?
3. **API rate limits**: What are the rate limits for CE management API?
4. **Webhooks for state changes**: Can CE send notifications on app state changes?
5. **Programmatic log access**: Can logs be retrieved via API without IBM Cloud Logs instance?
6. **Image scanning**: Does CE scan images for vulnerabilities?
7. **Recommended isolation model**: One CE project per tenant, or shared?

---

## Success Criteria

- [ ] **Core Functionality**: Deploy MCP servers from catalog with one click
- [ ] **Docker Backend**: Full lifecycle management with Docker
- [ ] **Compose Support**: Multi-container deployments work correctly (Docker only)
- [ ] **GitHub Source**: Build and deploy from GitHub repositories
- [ ] **IBM Code Engine**: Deploy to serverless IBM Code Engine
- [ ] **Security Guardrails**: Profiles correctly enforce container security (with backend-aware warnings)
- [ ] **Gateway Integration**: Deployed servers auto-register with gateway
- [ ] **Catalog Management**: CRUD operations for catalog entries
- [ ] **Approval Workflow**: Approval required for sensitive deployments
- [ ] **Monitoring**: Health, metrics, and logs accessible (backend-appropriate)
- [ ] **Admin UI**: Full management through web interface
- [ ] **Documentation**: Complete user and admin guides

---

## Definition of Done

- [ ] Runtime service with backend abstraction and capability checking
- [ ] Docker backend with full lifecycle support
- [ ] Docker Compose multi-container support (Docker only)
- [ ] GitHub repository source support
- [ ] IBM Code Engine backend implementation
- [ ] Security guardrails with preset profiles and backend compatibility
- [ ] Catalog CRUD API and UI with backend filtering
- [ ] Deployment API with status tracking and guardrails warnings
- [ ] Approval workflow for sensitive deployments
- [ ] Automatic gateway registration
- [ ] Admin UI for catalog, runtimes, settings
- [ ] Prometheus metrics for deployments
- [ ] Log aggregation (native + IBM Cloud integration)
- [ ] Unit tests with 80%+ coverage
- [ ] Integration tests for all deployment paths
- [ ] Security tests for guardrails enforcement
- [ ] User and admin documentation
- [ ] API documentation (OpenAPI)
- [ ] Code passes `make verify` checks

---

## References

- [IBM Code Engine Documentation](https://cloud.ibm.com/docs/codeengine)
- [CE Limits and Quotas](https://cloud.ibm.com/docs/codeengine?topic=codeengine-limits)
- [CE Application Scaling](https://cloud.ibm.com/docs/codeengine?topic=codeengine-app-scale)
- [CE Virtual Private Endpoints](https://cloud.ibm.com/docs/codeengine?topic=codeengine-vpe)
- [CE Health Probes](https://cloud.ibm.com/docs/codeengine?topic=codeengine-app-probes)
- [CE Pricing](https://cloud.ibm.com/docs/codeengine?topic=codeengine-pricing)
- [CE Metrics Collector](https://github.com/IBM/CodeEngine/tree/main/metrics-collector)
- [CE Pod Security Standards](https://github.ibm.com/coligo/readme/blob/main/architecture/projectisolation.md#pod-security-standards)
- [Docker Engine API](https://docs.docker.com/engine/api/)
- [Docker Compose Specification](https://docs.docker.com/compose/compose-file/)
- [Container Security Best Practices](https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)
