[EPIC][SECURITY]: Enterprise Security Controls - Credential Protection, SSRF Prevention, Multi-Tenant Isolation & Granular RBAC #2663
Labels: enhancement, python, security, epic, MUST
Goal
Implement enterprise-grade security controls for production deployments including API credential protection, SSRF prevention for cloud environments, secure multi-tenant isolation via token scoping, and granular RBAC for delegated administration. These capabilities enable ContextForge to meet SOC2, FedRAMP, and enterprise security requirements.
Why Now?
Enterprise customers require robust security controls before production deployment:
- Credential Protection: Enterprises need assurance that API credentials are never exposed in responses, logs, or caches
- Cloud-Native Security: Deployments on AWS/GCP/Azure require SSRF protection against cloud metadata attacks
- Multi-Tenant Isolation: Organizations with multiple teams need cryptographically-enforced resource boundaries
- Delegated Administration: Platform admins want to grant limited admin access without full superuser privileges
- Zero-Trust Architecture: All authentication contexts must flow through WebSocket and RPC layers
These capabilities position ContextForge as enterprise-ready for regulated industries.
📖 User Stories
US-1: Security Engineer - API Credential Protection
As a Security Engineer
I want all API responses to protect sensitive credentials
So that secrets cannot be extracted via API access or response caching
Acceptance Criteria:
Given a gateway is configured with auth credentials:
auth_type: "bearer"
auth_token: "production-secret-token"
When any API returns gateway data (GET, POST, PUT, LIST)
Then the response should contain:
- authToken: "*****" (masked display value)
- authTokenUnmasked: null (never populated)
And cached responses should also be masked
And the pattern applies to all credential fields (token, header, username, password)
Capabilities:
- GatewayRead.masked() method for consistent credential protection
- All service return paths apply masking automatically
- Cache layer returns masked responses
- Applies to create, read, update, list, and cache operations
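The masking behaviour in US-1 can be sketched roughly as follows. This is a minimal illustration using a standard-library dataclass, not the project's actual model: the field names and the trimmed-down GatewayRead shape are assumptions, and only the masked() method name and the "*****" display value come from this issue.

```python
from dataclasses import dataclass, replace
from typing import Optional

MASK = "*****"  # display value from the acceptance criteria


@dataclass(frozen=True)
class GatewayRead:
    """Hypothetical, trimmed-down response model with two credential fields."""
    name: str
    auth_type: Optional[str] = None
    auth_token: Optional[str] = None
    auth_token_unmasked: Optional[str] = None
    auth_password: Optional[str] = None
    auth_password_unmasked: Optional[str] = None

    def masked(self) -> "GatewayRead":
        """Return a copy that is safe to serialize, log, or cache."""
        return replace(
            self,
            # Show a fixed mask only when a credential is actually set.
            auth_token=MASK if self.auth_token else None,
            auth_token_unmasked=None,  # never populated in responses
            auth_password=MASK if self.auth_password else None,
            auth_password_unmasked=None,
        )
```

Calling masked() at every service return path, and before writing to the cache, keeps the secret out of all read paths while leaving the stored object untouched.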
US-2: Cloud Architect - SSRF Prevention for Cloud Deployments
As a Cloud Architect
I want configurable SSRF protection that blocks cloud metadata access
So that the gateway is safe to deploy on AWS, GCP, and Azure
Acceptance Criteria:
Given the gateway is deployed in a cloud environment
When a tool, gateway, or resource URL targets cloud metadata:
- http://169.254.169.254/latest/meta-data/
- http://metadata.google.internal/
- http://169.254.169.123/ (AWS Time Sync Service, blocked as link-local)
Then the request is rejected with a clear validation error
Given development mode (default):
SSRF_ALLOW_LOCALHOST=true
SSRF_ALLOW_PRIVATE_NETWORKS=true
When targeting localhost or RFC1918 addresses
Then the request is allowed for local development
Given production mode:
SSRF_ALLOW_LOCALHOST=false
SSRF_ALLOW_PRIVATE_NETWORKS=false
When targeting any internal address
Then the request is rejected
Capabilities:
- SSRF_PROTECTION_ENABLED master switch (default: true)
- Configurable localhost and private network policies
- Hardcoded blocklist for cloud metadata (cannot be disabled)
- IPv4 and IPv6 support including link-local addresses
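One way to picture the policy in US-2 is the sketch below, built only on the standard library. The function names validate_url and is_allowed and the keyword arguments are illustrative assumptions, not the project's _validate_ssrf() signature. The SSRF_PROTECTION_ENABLED master switch is left out, and hostnames are not resolved, so a real validator would also need to check the addresses a DNS name resolves to.

```python
import ipaddress
from urllib.parse import urlsplit

# Always-blocked destinations, copied from this issue's configuration notes.
BLOCKED_HOSTS = {"metadata.google.internal"}
BLOCKED_NETWORKS = (
    ipaddress.ip_network("169.254.0.0/16"),  # IPv4 link-local, covers 169.254.169.254
    ipaddress.ip_network("fe80::/10"),       # IPv6 link-local
)


def validate_url(url: str, *, allow_localhost: bool = True,
                 allow_private: bool = True) -> None:
    """Raise ValueError when the URL targets a destination the policy forbids."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    if not host:
        raise ValueError("URL has no host")
    if host in BLOCKED_HOSTS:
        raise ValueError(f"cloud metadata host is blocked: {host}")
    if host == "localhost":
        if not allow_localhost:
            raise ValueError("localhost targets are disabled")
        return
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return  # DNS name: this sketch does not resolve it
    # The hardcoded blocklist is checked before any configurable policy.
    if any(ip in net for net in BLOCKED_NETWORKS):
        raise ValueError(f"link-local or metadata address is blocked: {ip}")
    if ip.is_loopback:
        if not allow_localhost:
            raise ValueError("loopback targets are disabled")
        return
    if ip.is_private and not allow_private:
        raise ValueError(f"private network targets are disabled: {ip}")


def is_allowed(url: str, **policy: bool) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_url(url, **policy)
    except ValueError:
        return False
    return True
```

With the development defaults, is_allowed("http://localhost:8000/") is true, while the metadata URLs are rejected under any policy. In a request-validation layer, the ValueError would typically surface as the validation error described above.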
US-3: Platform Admin - Secure Multi-Tenant Resource Isolation
As a Platform Administrator
I want secure-first token scoping with explicit team boundaries
So that users only access resources they're authorized for
Acceptance Criteria:
Given a JWT token with various team claim states:
Scenario: Missing teams claim (secure default)
When teams claim is absent from token
Then user sees only public resources
And private/team resources are hidden
Scenario: Empty teams array (explicit public-only)
When token has teams: []
Then user sees only public resources
Scenario: Null teams without admin (secure default)
When token has teams: null AND is_admin: false
Then user sees only public resources
Scenario: Null teams with admin (explicit admin bypass)
When token has teams: null AND is_admin: true
Then user sees all resources (admin override)
Scenario: Specific teams (team-scoped access)
When token has teams: ["team-a", "team-b"]
Then user sees public + team-a + team-b resources
And other team resources are hidden
Capabilities:
- normalize_token_teams() for consistent token interpretation
- Secure-first defaults (ambiguous = minimum access)
- Team-scoped caching (public-only queries cached, team queries not)
- Dict-format team normalization ([{"id": "t1"}] → ["t1"])
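The five scenarios in US-3 reduce to a small decision function. The sketch below assumes a return convention of None for admin bypass, an empty list for public-only, and a list of IDs for team scope. That convention, the payload shape, and the visible_resources() helper with its visibility and team_id fields are illustrative assumptions, not the signature of the real normalize_token_teams().

```python
from typing import Any, Iterable, List, Mapping, Optional


def normalize_token_teams(payload: Mapping[str, Any]) -> Optional[List[str]]:
    """Return None for admin bypass, [] for public-only, or a list of team IDs."""
    if "teams" not in payload:
        return []  # missing claim: secure default
    teams = payload["teams"]
    if teams is None:
        # Null teams only unlocks everything when is_admin is explicitly true.
        return None if payload.get("is_admin") is True else []
    if not isinstance(teams, (list, tuple)):
        return []  # malformed claim: fall back to minimum access
    ids: List[str] = []
    for team in teams:
        if isinstance(team, Mapping):  # dict format, e.g. {"id": "t1"}
            team = team.get("id")
        if isinstance(team, str) and team:
            ids.append(team)
    return ids


def visible_resources(resources: Iterable[Mapping[str, Any]],
                      teams: Optional[List[str]]) -> List[Mapping[str, Any]]:
    """Filter hypothetical resource records by the normalized team scope."""
    if teams is None:
        return list(resources)  # admin bypass sees everything
    allowed = set(teams)
    return [
        r for r in resources
        if r["visibility"] == "public"
        or (r["visibility"] == "team" and r.get("team_id") in allowed)
    ]
```

Keeping "no access beyond public" and "admin bypass" as two distinct return values means a caller cannot confuse an empty claim with unrestricted access.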
US-4: Platform Admin - Granular RBAC for Delegated Administration
As a Platform Administrator
I want to grant specific admin capabilities without full superuser access
So that I can delegate tasks like "manage servers" without exposing other admin functions
Acceptance Criteria:
Given a user with limited permissions:
permissions: ["servers.read", "servers.create", "servers.update"]
When accessing /admin/servers endpoints
Then access is granted for server operations
When accessing /admin/tools or /admin/gateways
Then access is denied with 403 Forbidden
Given a user with is_admin: true flag
When accessing any admin endpoint
Then explicit permission is still required
Because allow_admin_bypass=False on all routes
Given a user with any admin.* permission
When accessing the admin UI entry point
Then the admin middleware allows UI access
And specific operations require their own permissions
Capabilities:
- @require_permission decorators on all 177 admin routes
- allow_admin_bypass=False prevents superuser override
- has_admin_permission() for UI entry gate
- New fine-grained permissions: admin.overview, admin.dashboard, admin.events, admin.grpc, admin.plugins
- Entity permissions: servers.*, tools.*, gateways.*, resources.*, prompts.*, a2a.*, tags.*
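The interaction between explicit permissions and the is_admin flag in US-4 can be illustrated with a framework-free decorator. This is a sketch under assumptions: the user context is a plain mapping passed as the first argument, PermissionDenied stands in for a 403 response, the default of allow_admin_bypass is a guess, and the handlers and status_for() helper exist only for the demonstration. The real decorators and permission service may differ.

```python
import functools
from typing import Any, Callable, Iterable, Mapping


class PermissionDenied(Exception):
    """Stand-in for an HTTP 403 Forbidden response."""


def require_permission(permission: str, *, allow_admin_bypass: bool = True) -> Callable:
    """Guard a handler whose first argument is the user context."""
    def decorator(handler: Callable) -> Callable:
        @functools.wraps(handler)
        def wrapper(user: Mapping[str, Any], *args: Any, **kwargs: Any) -> Any:
            # With allow_admin_bypass=False the is_admin flag is ignored.
            bypass = allow_admin_bypass and bool(user.get("is_admin"))
            if bypass or permission in user.get("permissions", ()):
                return handler(user, *args, **kwargs)
            raise PermissionDenied(permission)
        return wrapper
    return decorator


def has_admin_permission(permissions: Iterable[str]) -> bool:
    """UI entry gate: true when the user holds any admin.* permission."""
    return any(p.startswith("admin.") for p in permissions)


@require_permission("servers.read", allow_admin_bypass=False)
def list_servers(user: Mapping[str, Any]) -> list:
    return ["server-1"]


@require_permission("tools.read", allow_admin_bypass=False)
def list_tools(user: Mapping[str, Any]) -> list:
    return ["tool-1"]


def status_for(handler: Callable, user: Mapping[str, Any]) -> int:
    """Map the outcome of a guarded call to an HTTP-like status code."""
    try:
        handler(user)
    except PermissionDenied:
        return 403
    return 200
```

A delegate holding only servers.read, servers.create, and servers.update gets 200 from list_servers and 403 from list_tools. A user with is_admin set but no permissions gets 403 from both, which is the behaviour the acceptance criteria ask for.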
US-5: Integration Developer - End-to-End Auth Context Propagation
As an Integration Developer
I want WebSocket connections to propagate authentication to RPC handlers
So that all request paths enforce consistent authorization
Acceptance Criteria:
Given a WebSocket connection with authenticated token:
ws://gateway:4444/ws?token=<jwt>
When the client sends JSON-RPC requests:
{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}
Then the RPC handler receives:
- Authorization: Bearer <validated-token>
- X-Proxy-User: <user-identity>
And team scoping is enforced on RPC responses
And the same user context applies to all transports (HTTP, WS, RPC)
Capabilities:
- WebSocket auth token forwarding to /rpc endpoint
- X-Proxy-User header propagation for user identity
- Consistent auth context across all transport layers
- Token validation before forwarding (no raw passthrough)
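The "validate, then forward" rule in US-5 might look like the sketch below. The build_rpc_headers() function, the injected verify callable, and the choice of the email or sub claim as the user identity are all assumptions made for illustration. fake_verify() is a demo stand-in and performs no real JWT validation.

```python
from typing import Any, Callable, Dict, Mapping, Optional


def build_rpc_headers(token: Optional[str],
                      verify: Callable[[str], Mapping[str, Any]]) -> Dict[str, str]:
    """Validate the WebSocket token, then build headers for the internal /rpc call."""
    if not token:
        raise PermissionError("WebSocket connection carries no token")
    claims = verify(token)  # expected to raise on a bad signature or expiry
    identity = claims.get("email") or claims.get("sub")
    if not identity:
        raise PermissionError("token has no user identity claim")
    # Only a validated token is forwarded; nothing is passed through raw.
    return {
        "Authorization": f"Bearer {token}",
        "X-Proxy-User": str(identity),
    }


def fake_verify(token: str) -> Mapping[str, Any]:
    """Demo verifier: accepts exactly one token. Not real JWT validation."""
    if token != "valid-demo-token":
        raise PermissionError("invalid token")
    return {"sub": "alice@example.com"}
```

Passing the verifier in as a parameter keeps the sketch independent of any particular JWT library and makes the rejection path easy to exercise in tests.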
US-6: DevOps Engineer - Consistent User Context Across Endpoints
As a DevOps Engineer
I want all endpoints to use a consistent user context format
So that logging, auditing, and debugging show uniform user information
Acceptance Criteria:
Given any authenticated request to any endpoint
When the request is processed
Then the user context should contain:
- email: user identifier
- is_admin: boolean flag
- teams: normalized team list
And the format is consistent across:
- REST API endpoints
- Admin API endpoints
- Teams router
- RBAC router
- RPC handlers
Capabilities:
- Standardized current_user_ctx format across all routers
- Consistent team normalization in all contexts
- Uniform logging and audit trail format
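A uniform context shape, as described in US-6, could be expressed as a typed dictionary with one constructor shared by every router. The UserContext type, make_user_ctx(), and the audit_line() format are hypothetical names chosen for this sketch. Only the three keys (email, is_admin, teams) come from the acceptance criteria.

```python
from typing import Iterable, List, TypedDict


class UserContext(TypedDict):
    email: str
    is_admin: bool
    teams: List[str]


def make_user_ctx(email: str, is_admin: bool = False,
                  teams: Iterable[str] = ()) -> UserContext:
    """Build the context in one place so every router produces the same shape."""
    return {"email": email, "is_admin": bool(is_admin), "teams": list(teams)}


def audit_line(ctx: UserContext, action: str) -> str:
    """Render a context into a single, uniform log line."""
    teams = ",".join(ctx["teams"]) or "-"
    return f"user={ctx['email']} is_admin={ctx['is_admin']} teams={teams} action={action}"
```

For example, audit_line(make_user_ctx("a@example.com", teams=["team-a"]), "tools.list") renders the same fields in the same order regardless of which endpoint handled the request.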
🏗 Architecture
Token Scoping Flow
graph TB
subgraph "Token Claims"
T1[teams: missing]
T2[teams: null + is_admin: false]
T3[teams: null + is_admin: true]
T4[teams: empty array]
T5[teams: list of IDs]
end
subgraph "normalize_token_teams"
N[Normalize Function]
end
subgraph "Access Level"
PO[Public Only]
AB[Admin Bypass - All]
TS[Team Scoped]
end
T1 --> N --> PO
T2 --> N --> PO
T3 --> N --> AB
T4 --> N --> PO
T5 --> N --> TS
SSRF Protection Flow
graph TB
subgraph "URL Validation"
URL[Incoming URL]
PARSE[Parse Host/IP]
CHECK{SSRF Check}
end
subgraph "Always Blocked"
META[Cloud Metadata<br/>169.254.169.254]
GCP[GCP Metadata<br/>metadata.google.internal]
LINK[Link-Local<br/>fe80::/10]
end
subgraph "Configurable"
LOCAL[Localhost<br/>SSRF_ALLOW_LOCALHOST]
PRIVATE[Private Networks<br/>SSRF_ALLOW_PRIVATE_NETWORKS]
end
subgraph "Result"
ALLOW[Allow Request]
BLOCK[422 Validation Error]
end
URL --> PARSE --> CHECK
CHECK -->|Cloud Metadata| BLOCK
CHECK -->|Localhost| LOCAL
CHECK -->|Private IP| PRIVATE
LOCAL -->|Allowed| ALLOW
LOCAL -->|Blocked| BLOCK
PRIVATE -->|Allowed| ALLOW
PRIVATE -->|Blocked| BLOCK
CHECK -->|Public IP| ALLOW
📋 Implementation Tasks
Credential Protection ✅
- Implement GatewayRead.masked() to null out unmasked fields
- Apply masking in create_gateway() response
- Apply masking in update_gateway() response
- Apply masking in get_gateway() response
- Apply masking in list_gateways() response
- Apply masking to cached gateway reads
- Add security tests for credential protection
SSRF Prevention ✅
- Add SSRF configuration settings to config.py
- Implement _validate_ssrf() URL validator
- Hardcode cloud metadata blocklist (169.254.x.x, metadata.google.internal)
- Make localhost policy configurable (default: allow)
- Make private network policy configurable (default: allow)
- Document settings in .env.example
- Add Helm chart configuration for Kubernetes
- Add comprehensive documentation
Multi-Tenant Token Scoping ✅
- Implement normalize_token_teams() in auth.py
- Integrate with _get_token_teams_from_request() in main.py
- Update token_scoping.py middleware
- Implement secure caching (only cache public-only queries)
- Apply token scoping to all list endpoints
- Apply token scoping to gateway forwarding
- Add tests for all token claim combinations
Granular Admin RBAC ✅
- Add new permissions to Permissions class in db.py
- Add allow_admin_bypass parameter to RBAC decorators
- Implement has_admin_permission() in permission service
- Update AdminAuthMiddleware to use capability check
- Apply @require_permission to all 177 admin routes
- Set allow_admin_bypass=False on all admin decorators
- Update RBAC documentation
Auth Context Propagation ✅
- Forward Authorization header in WebSocket to RPC
- Forward X-Proxy-User header for identity
- Validate token before forwarding
- Test end-to-end auth flow
Consistent User Context ✅
- Standardize user context format across routers
- Update Teams router endpoints
- Update RBAC router endpoints
- Ensure consistent logging format
⚙️ Configuration
SSRF Protection Settings
# Master switch (default: enabled)
SSRF_PROTECTION_ENABLED=true
# Development-friendly defaults
SSRF_ALLOW_LOCALHOST=true
SSRF_ALLOW_PRIVATE_NETWORKS=true
# Always blocked (hardcoded, cannot be overridden)
# - 169.254.169.254/32 (AWS/Azure metadata)
# - 169.254.169.123/32 (AWS Time Sync Service)
# - 169.254.0.0/16 (link-local)
# - metadata.google.internal (GCP)
# - fe80::/10 (IPv6 link-local)
Production Hardening
# Strict mode for cloud deployments
SSRF_ALLOW_LOCALHOST=false
SSRF_ALLOW_PRIVATE_NETWORKS=false
✅ Success Criteria
- API credentials never exposed in any response path
- Cloud metadata endpoints blocked on all cloud platforms
- SSRF policies configurable for dev vs prod environments
- Tokens with missing/empty teams get public-only access
- Admin bypass requires explicit teams: null + is_admin: true
- All 177 admin routes enforce granular permissions
- is_admin flag alone cannot bypass permission checks
- WebSocket auth propagates to RPC layer
- User context format consistent across all endpoints
- All existing tests pass
- New security tests added for each capability
🏁 Definition of Done
- Credential masking implemented and tested
- SSRF protection with configurable policies
- Secure-first token scoping with normalize_token_teams()
- Granular RBAC on all admin routes
- Token-scoped filtering on list endpoints and gateway forwarding
- WebSocket auth forwarding to RPC
- Consistent user context across all endpoints
- Documentation updated (configuration, RBAC guide)
- Code passes make verify checks
📝 Additional Notes
🔹 Secure-First Design: Ambiguous token states (missing teams, empty teams) default to minimum access. This prevents privilege escalation from malformed tokens.
🔹 Cloud Metadata Protection: The blocklist for cloud metadata IPs is hardcoded and cannot be disabled via configuration. This ensures protection even if operators misconfigure SSRF settings.
🔹 Strict RBAC: The is_admin flag no longer bypasses permission checks on admin routes. Admins must have explicit permissions granted, enabling fine-grained delegation.
🔹 Backward Compatibility: Properly-formed tokens with explicit team claims continue to work unchanged. Only edge cases with missing/null claims are affected.
🔹 Database Sessions: All endpoints use db: Session = Depends(get_db). Never use current_user_ctx["db"], which is None by design.