Fixes #2018 - REST /tools list endpoint returns stale visibility data after tool update in multi-worker deployments #2020

Merged
crivetimihai merged 2 commits into main from 2018-cache-invalidation-tools-list
Jan 10, 2026

@crivetimihai
Member

Closes #2018 - follow-up from #1915

Root Cause

The cache invalidation code already published messages to the Redis pub/sub channel mcpgw:cache:invalidate, but nothing subscribed to that channel. In multi-worker deployments (e.g., 3 gunicorn workers behind nginx), when Worker A updated a tool's visibility it invalidated only its own local cache, so Workers B and C kept serving stale cached data.
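
The failure mode can be reproduced without Redis. The sketch below is only an illustration: FakeBroker, Worker and simulate are hypothetical names, not code from this PR, and the broker is an in-process stand-in for Redis pub/sub. Each worker keeps its own in-memory cache, and the invalidation message only helps the other workers if they are actually listening.

```python
from collections import defaultdict

CHANNEL = "mcpgw:cache:invalidate"  # channel name taken from the PR description


class FakeBroker:
    """In-process stand-in for Redis pub/sub (hypothetical, illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver to whoever is listening; with no subscriber the message is lost.
        for callback in self.subscribers[channel]:
            callback(message)
        return len(self.subscribers[channel])


class Worker:
    """One server process with its own local cache (hypothetical)."""

    def __init__(self, broker, db):
        self.broker = broker
        self.db = db          # shared "database": tool name -> visibility
        self.cache = {}       # per-worker in-memory cache

    def list_tools(self):
        if "tools" not in self.cache:
            self.cache["tools"] = dict(self.db)  # snapshot, goes stale on update
        return self.cache["tools"]

    def update_visibility(self, tool, visibility):
        self.db[tool] = visibility
        self.cache.pop("tools", None)                   # local invalidation
        self.broker.publish(CHANNEL, "registry:tools")  # cross-worker signal

    def on_invalidate(self, message):
        if message == "registry:tools":
            self.cache.pop("tools", None)


def simulate(with_subscriber):
    """Return the visibility Worker B serves after Worker A updates the tool."""
    broker = FakeBroker()
    db = {"search": "public"}
    worker_a, worker_b = Worker(broker, db), Worker(broker, db)
    if with_subscriber:
        for worker in (worker_a, worker_b):
            broker.subscribe(CHANNEL, worker.on_invalidate)
    worker_b.list_tools()                           # warm B's cache
    worker_a.update_visibility("search", "private")
    return worker_b.list_tools()["search"]
```

With `simulate(False)` Worker B keeps answering from its old snapshot, which mirrors the reported bug; with `simulate(True)` the published message clears B's cache and the next read sees the update.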

Solution

Added a CacheInvalidationSubscriber class that:

  • Subscribes to Redis pubsub channel mcpgw:cache:invalidate on application startup
  • Listens for invalidation messages in a background asyncio task
  • Clears local in-memory caches when messages are received
  • Handles graceful shutdown and Redis unavailability
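
The four responsibilities above can be sketched roughly as follows. This is not the PR's CacheInvalidationSubscriber: InvalidationListener and its on_message callback are hypothetical, and the pubsub argument is assumed to be any object exposing awaitable subscribe() and get_message() methods in the style of redis-py's asyncio PubSub, so a fake can be injected in tests.

```python
import asyncio
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)
CHANNEL = "mcpgw:cache:invalidate"  # channel name taken from the PR description


class InvalidationListener:
    """Hypothetical minimal listener, for illustration only."""

    def __init__(self, pubsub: Any, on_message: Callable[[str], None]) -> None:
        self._pubsub = pubsub          # assumed redis.asyncio-style PubSub object
        self._on_message = on_message  # callback that clears the local caches
        self._task: Optional[asyncio.Task] = None

    async def start(self) -> None:
        try:
            await self._pubsub.subscribe(CHANNEL)
        except Exception as exc:  # e.g. Redis unreachable
            # Degrade gracefully: keep serving, just without cross-worker sync.
            logger.warning("Cache invalidation listener disabled: %s", exc)
            return
        self._task = asyncio.create_task(self._listen())

    async def _listen(self) -> None:
        while True:
            msg = await self._pubsub.get_message(
                ignore_subscribe_messages=True, timeout=1.0
            )
            if msg is None:
                await asyncio.sleep(0)  # always yield so cancellation can land
                continue
            data = msg["data"]
            if isinstance(data, bytes):
                data = data.decode("utf-8")
            try:
                self._on_message(data)
            except Exception:
                # One bad message should not kill the background task.
                logger.exception("Failed to process invalidation %r", data)

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
        self._task = None
```

Passing the pub/sub object in rather than creating it inside the class keeps the sketch independent of a running Redis server; how the real class obtains its connection is not shown in this description.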

Message Formats Supported

Message                     Action
--------------------------  ------------------------------------------------
registry:tools              Clear all tools from RegistryCache
registry:prompts            Clear all prompts from RegistryCache
registry:resources          Clear all resources from RegistryCache
registry:agents             Clear all agents from RegistryCache
tool_lookup:{name}          Clear specific tool from ToolLookupCache
tool_lookup:gateway:{id}    Clear all tools for gateway from ToolLookupCache
admin:{prefix}              Clear admin stats with prefix from AdminStatsCache
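
One way to route these message formats is a small parser that runs before any cache is touched. The sketch below is an assumption about how such dispatch could look, not the PR's code: parse_invalidation and the Invalidation tuple are hypothetical names. Note that the tool_lookup:gateway: prefix has to be checked before the generic tool_lookup: prefix, otherwise a gateway message would be read as a tool named "gateway:{id}".

```python
from typing import NamedTuple, Optional


class Invalidation(NamedTuple):
    """Parsed form of a message on the invalidation channel (hypothetical)."""

    cache: str          # "registry", "tool_lookup" or "admin"
    scope: str          # what to clear inside that cache
    key: Optional[str]  # tool name, gateway id or stats prefix, if any


REGISTRY_KINDS = {"tools", "prompts", "resources", "agents"}


def parse_invalidation(message: str) -> Optional[Invalidation]:
    """Map a raw message to an Invalidation, or None if it is not recognised."""
    if message.startswith("registry:"):
        kind = message[len("registry:"):]
        return Invalidation("registry", kind, None) if kind in REGISTRY_KINDS else None
    # More specific prefix first: "tool_lookup:gateway:{id}".
    if message.startswith("tool_lookup:gateway:"):
        gateway_id = message[len("tool_lookup:gateway:"):]
        return Invalidation("tool_lookup", "gateway", gateway_id) if gateway_id else None
    if message.startswith("tool_lookup:"):
        name = message[len("tool_lookup:"):]
        return Invalidation("tool_lookup", "tool", name) if name else None
    if message.startswith("admin:"):
        prefix = message[len("admin:"):]
        return Invalidation("admin", "prefix", prefix) if prefix else None
    return None
```

Returning None for unknown messages lets the caller log and ignore them instead of raising inside the background listener.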

Changes

  • mcpgateway/cache/registry_cache.py: Added CacheInvalidationSubscriber class (~220 lines)
  • mcpgateway/main.py: Start/stop subscriber in application lifespan
  • tests/unit/mcpgateway/cache/test_cache_invalidation_subscriber.py: Added 14 regression tests
  • scripts/test_mcp_token_scoping.py: Manual MCP client test for token scoping verification
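
The start/stop wiring in the application lifespan can be pictured as an async context manager. This is a minimal sketch assuming a FastAPI-style lifespan; DemoSubscriber, make_lifespan and run_demo are hypothetical stand-ins, and the actual code in mcpgateway/main.py is not shown in this description.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Callable, List


class DemoSubscriber:
    """Stand-in for the real subscriber; it only records lifecycle events."""

    def __init__(self, events: List[str]) -> None:
        self._events = events

    async def start(self) -> None:
        self._events.append("subscriber started")

    async def stop(self) -> None:
        self._events.append("subscriber stopped")


def make_lifespan(subscriber: DemoSubscriber) -> Callable[[Any], Any]:
    """Build a lifespan context manager bound to one subscriber instance."""

    @asynccontextmanager
    async def lifespan(app: Any) -> AsyncIterator[None]:
        await subscriber.start()      # runs once, before requests are served
        try:
            yield                     # application handles requests here
        finally:
            await subscriber.stop()   # runs on shutdown, even after errors

    return lifespan


def run_demo() -> List[str]:
    """Drive the lifespan by hand and return the recorded event order."""
    events: List[str] = []
    lifespan = make_lifespan(DemoSubscriber(events))

    async def serve() -> None:
        async with lifespan(None):
            events.append("serving requests")

    asyncio.run(serve())
    return events
```

In a FastAPI application a function of this shape would typically be passed as `FastAPI(lifespan=lifespan)`; the try/finally ensures the background task is stopped even if the application exits with an error.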

Testing

Automated Tests (14 tests)

pytest tests/unit/mcpgateway/cache/test_cache_invalidation_subscriber.py -v

Tests cover:

  • Registry cache invalidation (tools, prompts, resources)
  • Tool lookup cache invalidation (by name, by gateway)
  • Admin stats cache invalidation
  • Graceful Redis unavailability handling
  • Concurrent invalidation message processing
  • Main regression test: Tool visibility update clears caches across workers

Manual Verification

python scripts/test_mcp_token_scoping.py

Results (all passing):

Test                    Expected    Actual
----------------------  ----------  ----------
Admin no teams key      5 tools     5 tools
Admin teams:null        5 tools     5 tools
Admin teams:[]          4 tools     4 tools
Admin + correct team    5 tools     5 tools
Admin + wrong team      Rejected    Rejected
Non-admin no teams      4 tools     4 tools
Non-admin + team        5 tools     5 tools
Non-admin teams:[]      4 tools     4 tools

Deployment Notes

  • Requires Redis for cross-worker cache synchronization
  • Subscriber starts automatically on application boot
  • Gracefully handles Redis unavailability (logs warning, continues without cross-worker sync)
  • No configuration changes required

Checklist

  • Linting passes (make flake8 pylint bandit)
  • Type checking passes (make mypy pyright)
  • Unit tests pass (14 new tests)
  • Manual MCP client tests pass (8/8 RPC + 4/4 transport)
  • No breaking changes to existing APIs

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Add unit tests for CacheInvalidationSubscriber to prevent regression of
the stale visibility data bug (#2018).

Tests verify:
- Registry cache invalidation for tools/prompts/resources
- Tool lookup cache invalidation by name and gateway
- Admin stats cache invalidation
- Graceful handling of Redis unavailability
- Concurrent invalidation message processing
- Main regression scenario: tool visibility update clears caches

Also updates .flake8 to ignore DAR docstring warnings for test file.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
@crivetimihai crivetimihai marked this pull request as ready for review January 10, 2026 12:38
@crivetimihai crivetimihai merged commit 40e3d8a into main Jan 10, 2026
52 checks passed
@crivetimihai crivetimihai deleted the 2018-cache-invalidation-tools-list branch January 10, 2026 12:41
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026
… data after tool update in multi-worker deployments (IBM#2020)

* Cache fix for token scoping

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* test: add regression tests for cross-worker cache invalidation

Add unit tests for CacheInvalidationSubscriber to prevent regression of
the stale visibility data bug (IBM#2018).

Tests verify:
- Registry cache invalidation for tools/prompts/resources
- Tool lookup cache invalidation by name and gateway
- Admin stats cache invalidation
- Graceful handling of Redis unavailability
- Concurrent invalidation message processing
- Main regression scenario: tool visibility update clears caches

Also updates .flake8 to ignore DAR docstring warnings for test file.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

Development

Successfully merging this pull request may close these issues.

[BUG]: REST /tools list endpoint returns stale visibility data after tool update