Cyfr 94358 context forge resync 09 03 2026 old #6

Open
aidbutlr wants to merge 2489 commits into main from
CYFR-94358-ContextForge-Resync-09-03-2026_old
Conversation

Owner

@aidbutlr aidbutlr commented Mar 9, 2026

🔗 Related Issue

Closes #


📝 Summary

What does this PR do and why?


🏷️ Type of Change

  • Bug fix
  • Feature / Enhancement
  • Documentation
  • Refactor
  • Chore (deps, CI, tooling)
  • Other (describe below)

🧪 Verification

| Check | Command | Status |
|-------|---------|--------|
| Lint suite | make lint | |
| Unit tests | make test | |
| Coverage ≥ 80% | make coverage | |

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • Tests added/updated for changes
  • Documentation updated (if applicable)
  • No secrets or credentials committed

📓 Notes (optional)

Screenshots, design decisions, or additional context.

crivetimihai and others added 30 commits February 1, 2026 22:37
* test: improve cache coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: improve coverage for cli and runtime paths

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: fix toolops permission stubs

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand coverage for tool helpers and admin servers

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: extend coverage for low-coverage services

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: extend coverage for services

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand coverage for grpc oauth metrics

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand unit coverage for admin and services

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand observability and oauth coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Fix flaky test

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* 80% threshold

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Docs update for testing

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand coverage for transports, plugins, wrapper

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Fix tests

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Fix tests

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Fix tests

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Test improvements

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Increase coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: expand coverage for observability and services

* test: expand bulk registration coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Increase coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Increase coverage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
* chore: unignore documentation files in .gitignore

* chore: unignore FEATURES.md documentation files

* docs: update oauth design and remove empty blog index

* docs: cleanup placeholders, update statuses, and fix navigation

* typo

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Documentation review & update

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
…BM#2549)

Replace long-lived database sessions in RBAC middleware with
fresh_db_session() context manager to prevent session accumulation
under high concurrent load.

Changes:
- Remove db parameter from get_current_user_with_permissions()
- Use fresh_db_session() context manager for short-lived DB access
- Keep "db": None in user context for backward compatibility
- Add deprecation warnings to get_db() and get_permission_service()
- Update all permission decorators to use fresh_db_session() fallback
- Update PermissionChecker to use fresh_db_session() pattern
- Simplify db.py by reusing get_db() generator for fresh_db_session

Security fixes:
- Use named kwargs (user, _user, current_user, current_user_ctx) for
  user context extraction instead of scanning all dicts for "email"
  to prevent request body injection attacks

Performance fixes:
- PermissionChecker.has_any_permission now uses single session for
  all permission checks instead of opening N sessions

This prevents idle-in-transaction bottlenecks where sessions were
held for entire request duration instead of milliseconds.
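
A minimal sketch of the short-lived session pattern this commit describes. `FakeSession`, the hard-coded permission set, and the body of `fresh_db_session()` shown here are assumptions for illustration; the project's real context manager wraps its own SQLAlchemy session factory.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session (hypothetical, illustration only)."""

    def __init__(self):
        self.closed = False
        self.committed = False
        self.rolled_back = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True


@contextmanager
def fresh_db_session(factory=FakeSession):
    """Yield a session that lives only for the duration of the with-block."""
    session = factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        # Always release the connection, even on error.
        session.close()


def has_permission(user_email, permission):
    """Check one permission using a session that is closed before returning."""
    with fresh_db_session() as db:
        # A real check would query roles for `user_email` through `db`;
        # the granted set below is a placeholder.
        allowed = permission in {"tools.read"}
    return allowed, db
```

The session is opened, used, and closed inside `has_permission()`, so no connection stays checked out while the rest of the request runs.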

Closes IBM#2340

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
* feat: unified PDP plugin for issue IBM#2223

Adds a single plugin entry point that orchestrates access-control
decisions across multiple policy engines (Native RBAC, MAC, OPA, Cedar).

- plugins/unified_pdp/unified_pdp.py  — Plugin class, hooks into
  tool_pre_invoke and resource_pre_fetch
- plugins/unified_pdp/pdp.py          — PolicyDecisionPoint orchestrator
- plugins/unified_pdp/pdp_models.py   — Pydantic models (Subject, Resource,
  Context, AccessDecision, config types)
- plugins/unified_pdp/adapter.py      — Abstract engine adapter base class
- plugins/unified_pdp/cache.py        — TTL-aware decision cache
- plugins/unified_pdp/engines/        — Four engine adapters: native_engine,
  mac_engine, opa_engine, cedar_engine
- plugins/unified_pdp/default_rules.json — Starter RBAC ruleset
- tests/unit/plugins/test_unified_pdp.py — 46 unit tests
- plugins/config.yaml                 — Plugin registration (mode: disabled)
- MANIFEST.in                         — Added recursive-include plugins *.json

Combination modes: all_must_allow | any_allow | first_match
Native RBAC and MAC work out of the box. OPA and Cedar require their
respective sidecars (see README).
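
One plausible reading of the three combination modes, sketched under assumptions: the `Decision` enum, the `ABSTAIN` value, and the deny-by-default fallbacks are hypothetical and are not taken from the plugin's code.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ABSTAIN = "abstain"  # hypothetical: the engine has no matching policy


def combine(decisions, mode):
    """Combine per-engine decisions (in engine order) into a single bool."""
    if mode == "all_must_allow":
        # Every engine must allow; an empty list is treated as a deny.
        return bool(decisions) and all(d is Decision.ALLOW for d in decisions)
    if mode == "any_allow":
        # One allowing engine is enough.
        return any(d is Decision.ALLOW for d in decisions)
    if mode == "first_match":
        # The first engine with an opinion decides.
        for decision in decisions:
            if decision is not Decision.ABSTAIN:
                return decision is Decision.ALLOW
        return False  # assumption: deny when no engine matched
    raise ValueError(f"unknown combination mode: {mode}")
```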

Closes IBM#2223

Signed-off-by: yiannis2804 <yiannis2804@gmail.com>

* test: add plugin class unit tests, coverage 86%

13 tests covering UnifiedPDPPlugin hook methods (tool_pre_invoke,
resource_pre_fetch), subject extraction (dict/string/None user),
action string formatting, resource type mapping, and _build_pdp.

unified_pdp.py now at 100% coverage. Remaining gaps are in OPA and
Cedar engine adapters which require external sidecars to test.

Signed-off-by: yiannis2804 <yiannis2804@gmail.com>

* docs: add detailed README for unified PDP plugin

Signed-off-by: yiannis2804 <yiannis2804@gmail.com>

* fix(unified-pdp): fix bugs and improve tests

- Fix undefined variable eng_type in pdp.py:get_effective_permissions()
- Add shutdown() lifecycle method to UnifiedPDPPlugin to properly close
  HTTP clients for OPA/Cedar engines
- Convert tests from respx to pytest-httpx (project standard)
- Add test for shutdown() method

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* chore(unified-pdp): fix linting issues

- Remove unused import List from mac_engine.py
- Remove unused variable first_deny from pdp.py

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(unified-pdp): address review findings from additional security review

- Cache key now includes user_agent and context.extra to prevent incorrect
  cached decisions when policies depend on these fields (MAC operation
  override, OPA/Cedar context-based rules)
- Plugin now extracts IP and user_agent from HTTP headers and passes to
  PDP context for policy evaluation
- Plugin passes tool args to context.extra and resource metadata to
  resource.annotations for fine-grained policy checks
- Exception handling in _evaluate_parallel/_evaluate_sequential now
  catches all exceptions (not just TimeoutError/PolicyEvaluationError)
  to prevent crashing the whole request on unexpected errors
- Native RBAC docstring corrected: only JSON files are supported (not YAML)
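
The cache-key change in the first item can be illustrated as follows. This is a sketch, not the plugin's `_build_cache_key`: the function name `build_cache_key`, its parameters, and the choice of SHA-256 over canonical JSON are all assumptions.

```python
import hashlib
import json


def build_cache_key(subject, action, resource, ip=None, user_agent=None, extra=None):
    """Derive a stable key from every field a policy might depend on."""
    payload = {
        "subject": subject,
        "action": action,
        "resource": resource,
        "ip": ip,
        "user_agent": user_agent,  # included so UA-dependent rules do not share entries
        "extra": extra or {},      # e.g. tool arguments passed as context.extra
    }
    # sort_keys makes the key independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two requests that differ only in `user_agent` or in `extra` now map to different cache entries, so a decision computed for one is never served for the other.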

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(unified-pdp): extract classification_level for MAC engine

Extract classification_level from tool args and resource metadata
so MAC engine can make proper Bell-LaPadula decisions instead of
always denying due to missing classification.
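
For context, a Bell-LaPadula check can be sketched as below. The level names, their ordering, and the function `mac_allows` are hypothetical; only the two rules (no read up, no write down) and the deny on a missing classification follow from the commit text.

```python
# Hypothetical classification labels, lowest to highest.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "secret": 3}


def mac_allows(clearance, classification, operation):
    """Apply Bell-LaPadula rules to one access request."""
    if classification is None:
        # Without a classification the engine cannot decide, so it denies.
        return False
    subject_level = LEVELS[clearance]
    object_level = LEVELS[classification]
    if operation == "read":
        # Simple security property: no read up.
        return subject_level >= object_level
    if operation == "write":
        # Star property: no write down.
        return subject_level <= object_level
    return False
```

This shows why extracting `classification_level` matters: when it is absent, every request takes the first branch and is denied.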

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs: add docstrings for 100% interrogate coverage

Add missing docstrings to all public functions and methods in the
unified_pdp plugin to satisfy the project's 100% docstring coverage
requirement enforced by interrogate.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs: add comprehensive Google-style docstrings to unified_pdp

Add complete Args, Returns, Raises, and Attributes documentation to all
public functions and methods in the unified_pdp plugin, matching the
project's docstring style with full parameter descriptions.

Files updated:
- adapter.py: PolicyEvaluationError, PolicyEngineAdapter methods
- cache.py: _build_cache_key, _CacheEntry, DecisionCache methods
- pdp.py: PolicyDecisionPoint and all evaluation/combination methods
- engines/cedar_engine.py: CedarEngineAdapter and all methods
- engines/mac_engine.py: MACEngineAdapter and all methods
- engines/native_engine.py: NativeRBACAdapter and all methods
- engines/opa_engine.py: OPAEngineAdapter and all methods
- unified_pdp.py: shutdown lifecycle method

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs: add __init__ docstring to PolicyEvaluationError

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: yiannis2804 <yiannis2804@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
* feat(api): standardize gateway response format

- Set *_unmasked fields to null in GatewayRead.masked()
- Apply masking consistently across all gateway return paths
- Mask credentials on cache reads
- Update admin UI to indicate stored secrets are write-only
- Update tests to verify masking behavior

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* delete artifact sbom

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* feat(gateway): add configurable URL validation for gateway endpoints

Add comprehensive URL validation with configurable network access controls
for gateway and tool URL endpoints. This allows operators to control which
network ranges are accessible based on their deployment environment.

New configuration options:
- SSRF_PROTECTION_ENABLED: Master switch for URL validation (default: true)
- SSRF_ALLOW_LOCALHOST: Allow localhost/loopback (default: true for dev)
- SSRF_ALLOW_PRIVATE_NETWORKS: Allow RFC 1918 ranges (default: true)
- SSRF_DNS_FAIL_CLOSED: Reject unresolvable hostnames (default: false)
- SSRF_BLOCKED_NETWORKS: CIDR ranges to always block
- SSRF_BLOCKED_HOSTS: Hostnames to always block

Features:
- Validates all resolved IP addresses (A and AAAA records)
- Normalizes hostnames (case-insensitive, trailing dot handling)
- Blocks cloud metadata endpoints by default (169.254.169.254, etc.)
- Dev-friendly defaults with strict mode available for production
- Full documentation and Helm chart support
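
A rough sketch of how these switches could interact, using only the standard library. The function names and keyword arguments are assumptions that mirror the setting names above; DNS resolution is left to the caller so the example stays self-contained.

```python
import ipaddress
from urllib.parse import urlsplit

# Cloud metadata address named in the commit message; blocked by default here.
DEFAULT_BLOCKED_NETWORKS = (ipaddress.ip_network("169.254.169.254/32"),)

# RFC 1918 private ranges.
RFC1918_NETWORKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def normalize_host(host):
    """Lower-case the hostname and drop a trailing dot."""
    return host.rstrip(".").lower()


def ip_allowed(ip_text, allow_localhost, allow_private, blocked_networks):
    """Decide whether one resolved address may be contacted."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in blocked_networks):
        return False
    if ip.is_loopback:
        return allow_localhost
    if any(ip in net for net in RFC1918_NETWORKS):
        return allow_private
    return True


def check_url(url, resolved_ips, *, allow_localhost=True, allow_private=True,
              dns_fail_closed=False, blocked_hosts=frozenset(),
              blocked_networks=DEFAULT_BLOCKED_NETWORKS):
    """Return True if the URL passes; resolved_ips holds its A/AAAA results."""
    host = normalize_host(urlsplit(url).hostname or "")
    if not host or host in blocked_hosts:
        return False
    if not resolved_ips:
        # Unresolvable hostname: reject only in fail-closed mode.
        return not dns_fail_closed
    # Every resolved address must pass, not just the first one.
    return all(
        ip_allowed(ip, allow_localhost, allow_private, blocked_networks)
        for ip in resolved_ips
    )
```

Checking all resolved addresses matters because a hostname can return one public and one internal record.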

Also includes minor admin UI formatting improvements.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* feat(auth): add token-scoped filtering for list endpoints and gateway forwarding

- Add token_teams parameter to list_servers and list_gateways endpoints
  for proper scoping based on JWT token team claims
- Update server_service.list_servers() and gateway_service.list_gateways()
  to filter results by token scope (public-only, team-scoped, or unrestricted)
- Skip caching for token-scoped queries to prevent cross-user data leakage
- Update gateway forwarding (_forward_request_to_all) to respect token team scope
- Fix public-only token handling in create endpoints (tools, resources, prompts,
  servers, gateways, A2A agents) to reject team/private visibility
- Preserve None vs [] distinction in SSE/WebSocket for proper admin bypass
- Update get_team_from_token to distinguish missing teams (legacy fallback)
  from explicit empty teams (public-only access)
- Add request.state.token_teams storage in all auth paths for downstream access

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* feat(auth): add normalize_token_teams for consistent token scoping

Introduces a centralized `normalize_token_teams()` function in auth.py
that provides consistent token team normalization across all code paths:

- Missing teams key → empty list (public-only access)
- Explicit null teams + admin flag → None (admin bypass)
- Explicit null teams without admin → empty list (public-only)
- Empty teams array → empty list (public-only)
- Team list → normalized string IDs (team-scoped)
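
The five rules above can be written out as a small function. This is a sketch of the described behavior, not the code in auth.py: the name `normalize_token_teams_sketch`, the `is_admin` claim name (taken from the documentation commit in this PR), and the handling of dict-shaped team entries are assumptions.

```python
_MISSING = object()  # sentinel to tell "key absent" apart from "key is null"


def normalize_token_teams_sketch(payload):
    """Map a decoded JWT payload to None (admin bypass) or a list of team IDs."""
    teams = payload.get("teams", _MISSING)
    if teams is _MISSING:
        return []  # missing key: public-only
    if teams is None:
        # Explicit null grants admin bypass only together with the admin flag.
        return None if payload.get("is_admin") is True else []
    normalized = []
    for team in teams:
        # Assumption: entries are either plain IDs or dicts carrying an "id".
        team_id = team.get("id") if isinstance(team, dict) else team
        if team_id is not None:
            normalized.append(str(team_id))
    return normalized
```

Callers must compare the result with `is None` rather than relying on truthiness, since both `None` and `[]` are falsy but mean opposite things.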

Additional changes:
- Update _get_token_teams_from_request() to use normalized teams
- Fix caching in server/gateway services to only cache public-only queries
- Fix server creation visibility parameter precedence
- Update token_scoping middleware to use normalize_token_teams()
- Add comprehensive unit tests for token normalization behavior

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* feat(websocket): forward auth credentials to /rpc endpoint

The WebSocket /ws endpoint now propagates authentication credentials
when making internal requests to /rpc:

- Forward JWT token as Authorization header when present
- Forward proxy user header when trust_proxy_auth is enabled
- Enables WebSocket transport to work with AUTH_REQUIRED=true

Also adds unit tests to verify auth credential forwarding behavior.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* feat(rbac): add granular permission checks to all admin routes

- Add @require_permission decorators to all 177 admin routes with
  allow_admin_bypass=False to enforce explicit permission checks
- Add allow_admin_bypass parameter to require_permission and
  require_any_permission decorators for configurable admin bypass
- Add has_admin_permission() method to PermissionService for checking
  admin-level access (is_admin, *, or admin.* permissions)
- Update AdminAuthMiddleware to use has_admin_permission() for
  coarse-grained admin UI access control
- Create shared test fixtures in tests/unit/mcpgateway/conftest.py
  for mocking PermissionService across unit tests
- Update test files to use proper user context dict format

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs(rbac): comprehensive update to authentication and RBAC documentation

Update documentation to accurately reflect the two-layer security model
(Token Scoping + RBAC) and correct token scoping behavior.

rbac.md:
- Rewrite overview with two-layer security model explanation
- Fix token scoping matrix (missing teams key = PUBLIC-ONLY, not UNRESTRICTED)
- Add admin bypass requirements warning (requires BOTH teams:null AND is_admin:true)
- Add public-only token limitations (cannot access private resources even if owned)
- Add Permission System section with categories and fallback permissions
- Add Configuration Safety section (AUTH_REQUIRED, TRUST_PROXY_AUTH warnings)
- Update enforcement points matrix with Token Scoping and RBAC columns

multitenancy.md:
- Add Token Scoping Model section with secure-first defaults
- Add Two-Layer Security Model section with request flow diagram
- Add Enforcement Points Matrix
- Add Token Scoping Invariants
- Document multi-team token behavior (first team used for request.state.team_id)

oauth-design.md & oauth-authorization-code-ui-design.md:
- Add scope clarification notes (gateway OAuth delegation vs user auth)
- Add Token Verification section
- Add cross-references to RBAC and multitenancy docs

AGENTS.md:
- Add Authentication & RBAC Overview section with quick reference

llms/mcpgateway.md & llms/api.md:
- Add token scoping quick reference and examples
- Add links to full documentation

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(rbac): add explicit db dependency to RBAC-protected routes

Address load test findings from RCA #1 and #2:

- Add `db: Session = Depends(get_db)` to routes in email_auth.py,
  llm_config_router.py, and teams.py that use @require_permission
- Fix test files to pass mock_db parameter after signature changes
- Add shm_size: 256m to PostgreSQL in docker-compose.yml
- Remove non-serializable content from resource update events
- Disable CircuitBreaker plugin for consistent load testing

These changes fix the NoneType errors (~33,700) observed under 4000
concurrent users where current_user_ctx["db"] was always None.

Remaining critical issue: Transaction leak in streamablehttp_transport.py
causing idle-in-transaction connections (see todo/rca2.md for details).

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(db): resolve transaction leak and connection pool exhaustion

Critical fixes for load test failures at 4000 concurrent users:

Issue #1 - Transaction leak in streamablehttp_transport.py (CRITICAL):
- Add explicit asyncio.CancelledError handling in get_db() context manager
- When MCP handlers are cancelled (client disconnect, timeout), the finally
  block may not execute properly, leaving transactions "idle in transaction"
- Now explicitly rollback and close before re-raising CancelledError
- Add rollback in direct SessionLocal usage at line ~1425
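
The cancellation handling described for Issue #1 might look roughly like this. `RecordingSession` and this `get_db` are stand-ins written for the example; they are not the code in streamablehttp_transport.py.

```python
import asyncio
from contextlib import asynccontextmanager


class RecordingSession:
    """Stand-in session that records what was called (hypothetical)."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False
        self.closed = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True  # safe to call more than once


@asynccontextmanager
async def get_db(factory=RecordingSession):
    """Sketch only: yield a session and clean up on cancellation too."""
    db = factory()
    try:
        yield db
        db.commit()
    except asyncio.CancelledError:
        # Client disconnect or timeout: end the transaction right away,
        # then let the cancellation propagate.
        db.rollback()
        db.close()
        raise
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()
```

`CancelledError` derives from `BaseException` on current Python versions, so an `except Exception` clause alone does not see it; the explicit branch is what guarantees the rollback.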

Issue #2 - Missing db parameter in admin routes (HIGH):
- Add `db: Session = Depends(get_db)` to 73 remaining admin routes
- Routes with @require_permission but no db param caused decorator to
  create fresh session via fresh_db_session() for EVERY permission check
- This doubled connection usage for affected routes under load

Issue #3 - Slow recovery from transaction leaks (MEDIUM):
- Reduce IDLE_TRANSACTION_TIMEOUT from 300s to 30s in docker-compose.yml
- Reduce CLIENT_IDLE_TIMEOUT from 300s to 60s
- Leaked transactions now killed faster, preventing pool exhaustion

Root cause confirmed: list_resources() MCP handler was primary source,
with 155+ connections stuck on `SELECT resources.*` for up to 273 seconds.

See todo/rca2.md for full analysis including live test data showing
connection leak progression and 606+ idle transaction timeout errors.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(teams): use consistent user context format across all endpoints

- Update request_to_join_team and leave_team to use dict-based user context
- Fix teams router to use get_current_user_with_permissions consistently
- Move /discover route before /{team_id} to prevent route shadowing
- Update test fixtures to use mock_user_context dict format
- Add transaction commits in resource_service to prevent connection leaks
- Add missing docstring parameters for flake8 compliance
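
The route-shadowing fix in the list above follows from first-match-wins routing, which is how Starlette-style routers resolve paths. The toy matcher below is an illustration only; `compile_route`, `match`, and the handler names are hypothetical.

```python
import re


def compile_route(path):
    """Turn '/teams/{team_id}' into a regex with a named group."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def match(routes, url):
    """Return the first route whose pattern matches, in registration order."""
    for path, handler in routes:
        found = compile_route(path).match(url)
        if found:
            return handler, found.groupdict()
    return None, {}


# Parameterized route registered first: it swallows '/teams/discover'.
shadowed = [("/teams/{team_id}", "get_team"), ("/teams/discover", "discover_teams")]

# Static route registered first: '/teams/discover' reaches its own handler.
fixed = [("/teams/discover", "discover_teams"), ("/teams/{team_id}", "get_team")]
```

With the `shadowed` ordering, a request for `/teams/discover` is handled by `get_team` with `team_id` set to `"discover"`.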

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(db): add explicit db.commit/close to prevent transaction leaks

Add explicit db.commit(); db.close() calls to 100+ endpoints across
all routers to prevent PostgreSQL connection leaks under high load.

Problem: Under high concurrency, FastAPI's Depends(get_db) cleanup
runs after response serialization, causing transactions to remain
in 'idle in transaction' state for 20-30+ seconds, exhausting the
connection pool.

Solution: Explicitly commit and close database sessions immediately
after database operations complete, before response serialization.

Routers fixed:
- tokens.py: 10 endpoints (create, list, get, update, revoke, usage, admin, team tokens)
- llm_config_router.py: 14 endpoints (provider/model CRUD, health, gateway models)
- sso.py: 5 endpoints (SSO provider CRUD)
- email_auth.py: 3 endpoints (user create/update/delete)
- oauth_router.py: 1 endpoint (delete_registered_client)
- teams.py: 18 endpoints (team CRUD, members, invitations, join requests)
- rbac.py: 12 endpoints (roles, user roles, permissions)
- main.py: 14 CUD + 3 list + 7 RPC handlers

Also fixes:
- admin.py: Rename 21 unused db params to _db (pylint W0613)
- test_teams*.py: Add mock_db fixture to tests calling router functions directly
- Add llms/audit-db-transaction-management.md for future audits

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* ci(coverage): lower doctest coverage threshold to 30%

Reduce the required doctest coverage from 34% to 30% to accommodate
current coverage levels (32.17%).

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(rpc): fix list_gateways tuple unpacking and add token scoping

The RPC list_gateways handler had two bugs:
1. Did not unpack the tuple (gateways, next_cursor) returned by
   gateway_service.list_gateways(), causing 'list' object has no
   attribute 'model_dump' error
2. Was missing token scoping via _get_rpc_filter_context(), which
   was the original R-02 security fix

Also fixed all callers of list_gateways that expected a list but
now receive a tuple:
- mcpgateway/admin.py: get_gateways_section()
- mcpgateway/services/import_service.py: 3 call sites

Updated test mocks to return (list, None) tuples instead of lists.
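
A small illustration of the unpacking fix, using stand-in classes. `FakeGateway`, `FakeGatewayService`, and the response shape are assumptions; only the `(gateways, next_cursor)` return value comes from the commit message.

```python
import asyncio


class FakeGateway:
    """Stand-in for a gateway model exposing model_dump() (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def model_dump(self):
        return {"name": self.name}


class FakeGatewayService:
    async def list_gateways(self):
        # The service returns a (list, cursor) tuple, not a bare list.
        return [FakeGateway("a"), FakeGateway("b")], None


async def rpc_list_gateways(service):
    # Unpack first; iterating the tuple directly would yield the inner list
    # and fail with "'list' object has no attribute 'model_dump'".
    gateways, next_cursor = await service.list_gateways()
    return {
        "gateways": [g.model_dump() for g in gateways],
        "next_cursor": next_cursor,
    }
```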

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(teams): build response before db.close() to avoid lazy-load errors

The teams router was calling db.commit(); db.close() before building
the TeamResponse, but TeamResponse includes team.get_member_count()
which needs an active session. When the session is closed, the fallback
in get_member_count() tries to access self.members (lazy-loaded),
causing "Parent instance is not bound to a Session" errors.

Fixed by building TeamResponse BEFORE calling db.close() in:
- create_team
- get_team
- update_team

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(teams): fix update_team expecting team object but getting bool

The service's update_team() returns bool, but the router was treating
the return value as a team object and trying to access .id, .name, etc.

Fixed by:
1. Checking the boolean return value for success
2. Fetching the team again after successful update to build the response

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix(teams): fix update_member_role return type mismatch

The service's update_member_role() returns bool, but the router
treated it as a member object. Fixed by:
1. Checking the boolean success
2. Added get_member() method to TeamManagementService
3. Fetching the updated member to build the response

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* Fix teams return

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Removed unreleased security changes regarding gateway credentials from CHANGELOG.
* fix: add PERMISSION_AUDIT_ENABLED toggle for RBAC auditing

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* chore: clarify permission audit settings docstring

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* chore: remove unrelated CHANGELOG.md changes

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
…rce (IBM#2718)

* fix: eliminate redundant DB queries in read_resource and invoke_resource

Steps 3-4 of load test RCA: reduce per-request query count from 6 to 2
for resource-by-ID requests.

Step 3: After Q2 (db.get), check enabled in Python and guard Q3/Q4 with
resource_db is None so they only run for URI-only lookups.

Step 4: Add joinedload(DbResource.gateway) to Q2, pass pre-fetched
resource_obj and gateway_obj to invoke_resource() to skip Q5/Q6.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
* Add x-mcp-session-id to default identity headers
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Pass x-mcp-session-id to mcp_session_pool headers and prioritize if found

* wip sa

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* add e2e test

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* flake8 fix
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* remove plan

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* pylint fix
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Implement multi worker
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Implement multi worker for mcp session pool
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* linting fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Minor bug fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix critical bugs
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix sse session_id, add logging and fix test
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* fix url of rpc from nginx
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* add stateful sessions in http
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* WIP fixes to streamable http
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix streamable http
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Updated ADR
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Update ADR
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* black fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix failing doctests
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix more tests
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* flake8 fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* pylint fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* pylint fixes
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix streamable http for single gunicorn
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Revert base_url
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix test
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* revert replica count
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix bandit test
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* remove plan

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix bug for local
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Update ADR and remove print
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix lint issues
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Fix test
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Remove accidental utf-8 headers from incorrect rebase

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* fix: replace debug print statements with logger calls in session affinity code

Convert print() statements to appropriate logger.debug()/logger.info()/logger.warning()
calls for proper log management in the multi-worker session affinity feature.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: harden session affinity and redis event store

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: avoid broad exception in streamable http header parsing

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs: add missing docstrings for interrogate compliance

Add docstrings to _pool_owner_key, _rehydrate_content_items, and
send_with_capture to achieve 100% docstring coverage.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: add missing newline at end of redis_event_store.py

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* docs: complete docstrings with Args and Returns sections

Fix darglint DAR101/DAR201 errors by adding missing parameter
and return documentation to docstrings.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
…M#2638)

* fix: prompts are an Optional[set[str]] - set of prompt names.

Signed-off-by: habeck <habeck@us.ibm.com>

* revert: llmguard plugins.conditions.prompts

Signed-off-by: habeck <habeck@us.ibm.com>

* feat: add external plugin metrics endpoint

Signed-off-by: habeck <habeck@us.ibm.com>

* perf: use rapidfuzz.distance instead of word-wise Levenshtein distance, add metrics for scan duration seconds

Signed-off-by: habeck <habeck@us.ibm.com>

* perf: add metric for policy compile duration seconds

Signed-off-by: habeck <habeck@us.ibm.com>

* perf: policy singleton

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: missed commit to add rapidfuzz dependency

Signed-off-by: habeck <habeck@us.ibm.com>

* perf: add scan caching

Signed-off-by: habeck <habeck@us.ibm.com>

* enh: make _create_new_vault_on_expiry async

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fixes

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fix

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fixes

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: add doc comments

Signed-off-by: habeck <habeck@us.ibm.com>

* fix: pin transformers to 4.55.1 to prevent TFPreTrainedModel error

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fix

Signed-off-by: habeck <habeck@us.ibm.com>

* fix: Since prompt_ids are only known after creation, apply to all so that the plugin works out of the box.

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: test fix

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: remove duplicate import

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fix

Signed-off-by: habeck <habeck@us.ibm.com>

* enh: key improvements

- Code quality: reduced cyclomatic complexity by ~50%
- Performance: vault retrieval moved outside the message loop (eliminates redundant async cache lookups)
- Consistency: all processing methods follow the same pattern as the input methods
- Maintainability: clear separation of concerns, easier to test individual components
- Zero breaking changes: maintains exact functional behavior

Signed-off-by: habeck <habeck@us.ibm.com>

* fix: use lazy evaluation rather than f-strings

Signed-off-by: habeck <habeck@us.ibm.com>
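
The "lazy evaluation rather than f-strings" commit above presumably refers to logging calls; that reading is an assumption, as the commit gives no detail. Under that assumption, the difference looks like this (the `Expensive` class and logger name are made up for the demonstration):

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)  # DEBUG records are discarded


def log_eager(value):
    # The f-string is built before debug() is even called.
    logger.debug(f"value={value}")


def log_lazy(value):
    # Formatting is deferred; it never happens when DEBUG is disabled.
    logger.debug("value=%s", value)
```

With the logger at INFO, `log_eager()` still pays for the string conversion while `log_lazy()` does not.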

* chore: enable sanitizers by default

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: add env var to disable TensorFlow in plugin startup.

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: fix return type on __update_context api.

Signed-off-by: habeck <habeck@us.ibm.com>

* enh: run the cache cleanup in a background thread rather than on every scan.

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fix

Signed-off-by: habeck <habeck@us.ibm.com>

* fix: test case for _handle_vault_caching when no vault exists

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: add unit tests for new code

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: test coverage for llmguard.py to 94% from 80%

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: policy.py coverage to 100%

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: cache.py tests to 100%

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: lint fixes

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: add missing class doc to test_llmguardplugin.py

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: update readme

Signed-off-by: habeck <habeck@us.ibm.com>

* chore: clearer comment for plugin.conditions.prompts

Signed-off-by: habeck <habeck@us.ibm.com>

---------

Signed-off-by: habeck <habeck@us.ibm.com>
* Fix compose-tls for certs with passphrase
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* Update documentation
Signed-off-by: Madhav Kandukuri <madhav165@gmail.com>

* fix: improve security and validation for passphrase-protected keys

- Use env:KEY_FILE_PASSWORD instead of pass: to avoid exposing
  password in process listings
- Add validation to ensure cert.pem exists when key-encrypted.pem
  is provided, preventing silent key overwrite with self-signed cert

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Closes IBM#2563

This commit fixes two issues:

1. Gateway Tags Returned as Empty List (IBM#2563):
   - Fixed type annotation mismatch in validate_tags_field() to correctly
     return List[Dict[str, str]] instead of List[str]
   - Added passthrough logic for already-formatted tag dictionaries in
     TagValidator.validate_list()
   - Updated GatewayCreate.tags and GatewayUpdate.tags to accept both
     legacy string format and new dict format
   - Fixed parenthesis placement in get_gateway_by_url() to correctly
     call masked() on GatewayRead instead of DbGateway

2. Transport Field Reset During Gateway Update:
   - Changed GatewayUpdate.transport from default="SSE" to None to
     prevent overwriting existing values when field is omitted in
     PUT/PATCH requests
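Why the `transport` default matters can be shown with a small sketch. It uses a plain dataclass rather than the project's real schema classes, and `GatewayUpdateSketch` and `apply_update` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class GatewayUpdateSketch:
    """Hypothetical stand-in for an update schema: None means 'field omitted'."""

    name: Optional[str] = None
    transport: Optional[str] = None  # a default of "SSE" here would overwrite stored values


def apply_update(stored: dict, update: GatewayUpdateSketch) -> dict:
    """Copy only the fields the caller actually supplied onto the stored record."""
    merged = dict(stored)
    for f in fields(update):
        value = getattr(update, f.name)
        if value is not None:
            merged[f.name] = value
    return merged
```

With a default of `None`, a request that only renames the gateway leaves the stored transport untouched; with a default of `"SSE"`, the same request would reset it.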

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: oaslananka <169144131+oaslananka@users.noreply.github.com>
Signed-off-by: oaslananka <oaslananka@users.noreply.github.com>
Co-authored-by: oaslananka <oaslananka@users.noreply.github.com>
The conditional expression always returned the same value regardless
of the condition. Simplified to direct assignment.

Closes IBM#2367

Signed-off-by: ChaiAndCode <saaiaravindhraja@gmail.com>
* Resource tags are being displayed

Signed-off-by: NAYANAR <nayana.r7813@gmail.com>

* Fix tag display bug where {{ tag.id }} crashes if tags are plain strings; render tags defensively

Signed-off-by: NAYANAR <nayana.r7813@gmail.com>

* fix: Apply defensive tag pattern consistently across all templates

Apply the `{% if tag is mapping %}{{ tag.id }}{% else %}{{ tag }}{% endif %}`
pattern to resources_partial.html, plugins_partial.html, and
tools_with_pagination.html that were missing the defensive check.
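For readers less familiar with Jinja, the same defensive check can be mirrored in Python. This is only an illustration: `tag_label` is a hypothetical helper, not code from the templates. Jinja's `mapping` test corresponds to an `isinstance` check against `collections.abc.Mapping`.

```python
from collections.abc import Mapping


def tag_label(tag) -> str:
    """Return the text to display for a tag in either supported shape."""
    if isinstance(tag, Mapping):
        # Dict-style tag: show its id, or nothing if the key is absent.
        return str(tag.get("id", ""))
    # Legacy plain-string tag: show it unchanged.
    return str(tag)
```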

Also fix accidental indentation change in admin.html.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: NAYANAR <nayana.r7813@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Add Prompt ID visibility to the admin panel's Prompts page:
- Added "Prompt ID" column header in the prompts table
- Display prompt.id in the table row with monospace styling
- Added "Prompt ID" field as the first item in the view prompt modal

Closes IBM#2656

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
…BM#2713)

* fix: overlapping Authorize and Fetch Tool buttons on MCP servers page

Signed-off-by: Marek Dano <mk.dano@gmail.com>

* fix: center the header text Actions in the MCP Servers Gateways table

Signed-off-by: Marek Dano <mk.dano@gmail.com>

---------

Signed-off-by: Marek Dano <mk.dano@gmail.com>
The Edit User modal's password requirement icons remained unchanged when
typing because both Create User and Edit User forms used identical
element IDs (req-length, req-uppercase, etc.). When JavaScript called
document.getElementById(), it returned the Create form's elements
instead of the Edit form's elements.

Renamed Edit User form element IDs to use 'edit-' prefix to ensure
uniqueness and updated the corresponding JavaScript functions in
admin.js to reference the new IDs.

Closes IBM#2702

Signed-off-by: Gabriel Costa <gabrielcg@proton.me>
Updates the HX-Trigger in admin_update_team to use 'refreshUnifiedTeamsList'
instead of 'refreshTeamsList', ensuring the UI updates correctly after
editing a team.

Signed-off-by: Adnan Vahora <adnan.vahora1@motorolasolutions.com>
Co-authored-by: Adnan Vahora <adnan.vahora1@motorolasolutions.com>
This commit addresses multiple loading spinner issues in the admin UI:

1. Fixed double loading spinners on initial page load/refresh
   - Removed redundant initial placeholder spinners from Gateways, Catalog,
     Tools, and Tool Operations panels
   - Now relies solely on HTMX indicators for loading states
   - Affected files: mcpgateway/templates/admin.html

2. Fixed spurious spinners triggered by background requests
   - Added CSS rules to prevent all .htmx-indicator elements from showing
     on unrelated requests
   - Scoped indicators to specific panels
   - Only show indicators when explicitly targeted via hx-indicator attribute
   - Uses proper CSS specificity to ensure targeted indicators are shown
   - Prevents spinners from appearing during background /trace requests
   - Affected files: mcpgateway/static/admin.css

3. Standardized Resources panel loading indicator
   - Replaced simple spinner div with proper HTMX indicator matching other panels
   - Added animated SVG spinner with "Loading resources..." text
   - Affected files: mcpgateway/templates/admin.html

4. Aligned Prompts panel implementation with other panels
   - Removed dual loading state (inline + external indicator)
   - Standardized to single external HTMX indicator for consistency
   - Changed spinner color to indigo for consistency with other panels
   - Affected files: mcpgateway/templates/admin.html

5. Fixed Tool Operations panel loading indicator
   - Added indicator to admin.html (outside swap target) so it exists on
     initial page load
   - Removed duplicate indicator from toolops_partial.html to avoid ID conflict
   - Affected files: mcpgateway/templates/admin.html,
     mcpgateway/templates/toolops_partial.html

All panels now have consistent loading behavior:
- Single loading indicator per panel
- No spurious spinners on background requests
- Proper HTMX indicator visibility control via CSS

Fixes IBM#2689

Signed-off-by: Gabriel Costa <gabrielcg@proton.me>
…BM#2701)

The Edit User modal was hidden when HTMX tried to swap content into
#user-edit-modal-content, causing htmx:targetError console errors.

Changes:
- Add hx-on::before-request to Edit button to show modal synchronously
- Remove global htmx:afterRequest listener that showed modal after request
- Remove hx-target from form since content swaps into modal while visible

The modal is now visible before HTMX requests, preventing targetError.

Closes IBM#2693

Signed-off-by: Marek Dano <mk.dano@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
…pre-commit to pass on Linux or Mac. (IBM#2740)

* fix 2731 - Change file permissions on test/client/__init__.py

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>

* fix 2732 - Change file permissions on executable scripts and python files that are self executable

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>

* fix 2733 - set pre-commit detect-private-key to ignore the specific test files that contain the disallowed terms

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>

* fix 2734 - correct config of check-yaml to allow multiple files and to fix yaml linting issue with tests/performance/plugins/config.yaml

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>

* fix 2735 - correct config of name-tests-test to exclude further tests that are procedural rather than unit tests, including jmeter, loadtest (locust) and client

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>

---------

Signed-off-by: Brian Hussey <redacted@ie.ibm.com>
Co-authored-by: Brian Hussey <redacted@ie.ibm.com>
…nd (IBM#2654)

- Backend: Extract root cause from BaseExceptionGroup when MCP SDK uses
  TaskGroup, ensuring actual HTTP errors (401, 405, etc.) are shown
  instead of generic "Failed to initialize gateway" messages
- Backend: Change HTTP status from 503 to 502 for GatewayConnectionError
  as 502 Bad Gateway more accurately represents upstream server failures
- Backend: Include sanitized error details in GatewayConnectionError
  messages for better debugging while protecting sensitive URL params
- Backend: Add userinfo (user:pass@host) redaction to sanitize_url_for_logging
  as defense-in-depth against credential leakage in error messages
- Frontend: Add safeParseJsonResponse() helper to validate response
  status and Content-Type before parsing JSON, preventing crashes when
  proxies or auth redirects return HTML error pages
- Frontend: Update extractApiError() to also check error.message field
- Frontend: Detect HTML responses and show user-friendly message instead
  of raw HTML; truncate long text responses to 200 chars
- Apply safeParseJsonResponse to 12 high-risk form handlers (POST/PUT):
  Gateway, Resource, Prompt, Server, A2A Agent, Tool (add & edit each)

Closes IBM#2562

Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
* 2346 - Fixed root buttons of view, edit and export

Signed-off-by: Mihai-Vlad Rusu <vladrusu@MacBookPro.lan>

* Fixed lint, test and format issue

Signed-off-by: Mihai-Vlad Rusu <vladrusu@MacBookPro.lan>

* chore: remove unrelated files from rebase

Signed-off-by: Mihai-Vlad Rusu <vladrusu@MacBookPro.lan>

* fix: correct route ordering for /roots/changes endpoint

- Move /changes endpoint before catch-all /{root_uri:path} to fix routing
- Remove debug print statement from update_root
- Restore correct test expectations for SSE endpoint
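The routing fix relies on first-match-wins ordering. The sketch below imitates that behaviour with plain regular expressions instead of the web framework's router; the handler names and the `compile_route` and `resolve` helpers are hypothetical.

```python
import re


def compile_route(template: str) -> re.Pattern:
    """Translate '/roots/{root_uri:path}' style templates into regexes."""
    # A ':path' parameter may span slashes; a plain parameter may not.
    pattern = re.sub(r"\{(\w+):path\}", r"(?P<\1>.+)", template)
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", pattern)
    return re.compile(f"^{pattern}$")


def resolve(routes: list[tuple[str, str]], path: str):
    """Return the handler of the first route whose template matches the path."""
    for template, handler in routes:
        if compile_route(template).match(path):
            return handler
    return None


# Catch-all registered first: it swallows '/roots/changes'.
wrong_order = [("/roots/{root_uri:path}", "get_root"), ("/roots/changes", "list_changes")]
# Specific route registered first: both paths reach the intended handler.
right_order = [("/roots/changes", "list_changes"), ("/roots/{root_uri:path}", "get_root")]
```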

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: address code review findings

- Normalize URIs in get/update/remove_root to match storage key
- Fix JS null-deref when Content-Disposition header is missing
- Fix misleading 'getting' log message in update_root
- Return Root object directly instead of dict for type consistency

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* lint

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai-Vlad Rusu <vladrusu@MacBookPro.lan>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai-Vlad Rusu <vladrusu@MacBookPro.lan>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: Shoumi <shoumimukherjee@gmail.com>
…ion (IBM#2649)

Add guard check for empty decoded_auth_value before accessing dict keys
in convert_tool_to_read for authheaders auth type. When auth_value
decrypts to an empty dict, set auth to None instead of crashing.
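A minimal sketch of the guard, assuming the decoded value is a dict of header name to header value. `build_header_auth` and the returned field names are hypothetical and do not reproduce the real `convert_tool_to_read` logic.

```python
from typing import Optional


def build_header_auth(decoded_auth_value: Optional[dict]) -> Optional[dict]:
    """Return a header-auth description, or None when nothing was decoded."""
    if not decoded_auth_value:
        # Empty dict (or None): there is no first key to read, so report no auth
        # instead of raising while looking one up.
        return None
    header_key = next(iter(decoded_auth_value))
    return {
        "auth_type": "authheaders",
        "auth_header_key": header_key,
        "auth_header_value": "********",  # masked; the real value is not echoed back
    }
```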

Closes IBM#1430

Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
* fix(auth): add missing fields to EmailUser instantiations

Signed-off-by: Shoumi <shoumimukherjee@gmail.com>

* add regression test for API token /me endpoint serialization

Signed-off-by: Shoumi <shoumimukherjee@gmail.com>

* test fixes

Signed-off-by: Shoumi <shoumimukherjee@gmail.com>

---------

Signed-off-by: Shoumi <shoumimukherjee@gmail.com>
…n updating user details (IBM#2736)

* fix: add password and full_name fields as optional for update user request

Signed-off-by: Marek Dano <mk.dano@gmail.com>

* fix: add is_admin field as optional for the update user request

Signed-off-by: Marek Dano <mk.dano@gmail.com>

* fix: lint issue when running make flake8

Signed-off-by: Marek Dano <mk.dano@gmail.com>

---------

Signed-off-by: Marek Dano <mk.dano@gmail.com>
aidbutlr and others added 30 commits March 4, 2026 10:09