
fix(gateway): lazy-install rebind incomplete in slack/feishu/matrix adapters (follow-up to #25014) #25028

@kshitijk4poor

Summary

PR #25014 wired tools.lazy_deps.ensure() into the check_*_requirements() functions for Slack, Matrix, DingTalk, and Feishu, matching the existing Discord/Telegram pattern. The plumbing is correct, but for three of the four adapters the module-level globals that the adapter actually uses are not rebound after lazy-install succeeds. A user whose deps are missing on first start will see lazy_deps.ensure() install everything fine, check_*_requirements() return True, and then the adapter blow up at runtime with NameError / TypeError: 'NoneType' object is not callable because the names still point at stubs (or are unbound) from the original module-level try: ... except ImportError: block.

DingTalk is the gold-standard reference — it explicitly rebinds every name it uses. Slack/Feishu/Matrix should match.

The bug is masked in Docker because the gateway typically restarts after first-run install, picking up real imports cleanly on the second start. But for any long-lived gateway process (most non-Docker deployments), this surfaces immediately.

1. Slack — aiohttp not rebound (NameError)

gateway/platforms/slack.py lines 21–31 import four names at module top: AsyncApp, AsyncSocketModeHandler, AsyncWebClient, and aiohttp. PR #25014 specifically added aiohttp==3.13.3 to LAZY_DEPS["platform.slack"] so the lazy-install would pull it. But the new check_slack_requirements() (lines 75–102) declares:

global SLACK_AVAILABLE, AsyncApp, AsyncSocketModeHandler, AsyncWebClient

aiohttp is missing from global AND from the rebind block. Because the original try raises ImportError on import slack_bolt BEFORE reaching import aiohttp, the except branch leaves aiohttp unbound at module scope. The first call into _handle_file_upload-style code (gateway/platforms/slack.py:464 uses aiohttp.ClientSession(), line 468 uses aiohttp.ClientTimeout(...)) raises NameError: name 'aiohttp' is not defined.

Minimal repro of the pattern:

try:
    import nonexistent_pkg as fake_slack_bolt   # raises
    import json as aiohttp                       # never reached
except ImportError:
    pass

def check_slack_requirements():
    global AsyncApp                              # forgot aiohttp
    AsyncApp = object
    return True

check_slack_requirements()
aiohttp.ClientSession                            # raises NameError: name 'aiohttp' is not defined

Fix

global SLACK_AVAILABLE, AsyncApp, AsyncSocketModeHandler, AsyncWebClient, aiohttp
...
from slack_sdk.web.async_client import AsyncWebClient as _Client
import aiohttp as _aiohttp
...
aiohttp = _aiohttp

Related: pyproject.toml slack extra divergence

slack = ["slack-bolt==1.27.0", "slack-sdk==3.40.1"] does NOT include aiohttp, but LAZY_DEPS["platform.slack"] now does. pip install hermes-agent[slack] still produces a slack adapter that NameErrors on file uploads — only the lazy-install path includes aiohttp. Pre-existing inconsistency, but easy to fix in the same PR — add aiohttp==3.13.3 to the slack extra.

2. Feishu — lark_oapi symbols never rebound (TypeError on adapter init)

gateway/platforms/feishu.py lines 86–124 import ~25 names from lark_oapi (lark, CreateFileRequest, CreateMessageRequest, GetMessageRequest, EventDispatcherHandler, FeishuWSClient, FEISHU_DOMAIN, LARK_DOMAIN, CallBackCard, P2CardActionTriggerResponse, AccessTokenType, HttpMethod, BaseRequest, …). On ImportError the except branch sets all of them to None.

The new check_feishu_requirements() only rebinds FEISHU_AVAILABLE:

global FEISHU_AVAILABLE
...
import lark_oapi  # noqa: F401
FEISHU_AVAILABLE = True
return True

So lark_oapi ends up imported into sys.modules, but the local module's globals (lark, CreateMessageRequest, etc.) stay bound to None. FeishuAdapter instantiation hits:

  • gateway/platforms/feishu.py:4377: FeishuWSClient(...) → TypeError: 'NoneType' object is not callable
  • gateway/platforms/feishu.py:4380: lark.LogLevel.INFO → AttributeError: 'NoneType' object has no attribute 'LogLevel'
  • gateway/platforms/feishu.py:4409: lark.Client.builder() → same
  • gateway/platforms/feishu.py:4507: if "GetMessageRequest" in globals(): return GetMessageRequest.builder()... The guard evaluates to True (the name IS in globals, bound to None), so the .builder() call hits AttributeError: 'NoneType' object has no attribute 'builder'.
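
The last bullet is easy to misread, so here is a minimal sketch of why a membership check on globals() does not protect against a name bound to None. The helper name is_usable and the dict standing in for the module's globals are illustrative assumptions, not code from feishu.py.

```python
from typing import Any


def is_usable(namespace: dict[str, Any], name: str) -> bool:
    """Return True only if `name` is bound to something other than None."""
    return namespace.get(name) is not None


# Stand-in for the module globals after the failed top-level import:
# the except branch bound the name to None, so the key exists.
module_globals: dict[str, Any] = {"GetMessageRequest": None}

# The membership guard passes even though the value is unusable.
membership_guard = "GetMessageRequest" in module_globals

# A value-based guard rejects both None-bound and missing names.
value_guard = is_usable(module_globals, "GetMessageRequest")
missing_guard = is_usable(module_globals, "NeverDefined")
```

A value-based guard like this would only hide the symptom; the rebind described under Fix is still what makes the adapter work.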

Fix

After lazy-install, rebind every name the adapter uses. Roughly:

global FEISHU_AVAILABLE, lark, CreateFileRequest, CreateFileRequestBody, CreateImageRequest, CreateImageRequestBody, CreateMessageRequest, CreateMessageRequestBody, GetChatRequest, GetMessageRequest, GetMessageResourceRequest, P2ImMessageMessageReadV1, ReplyMessageRequest, ReplyMessageRequestBody, UpdateMessageRequest, UpdateMessageRequestBody, AccessTokenType, HttpMethod, FEISHU_DOMAIN, LARK_DOMAIN, BaseRequest, CallBackCard, P2CardActionTriggerResponse, EventDispatcherHandler, FeishuWSClient, GetApplicationRequest
...
import lark_oapi as _lark
from lark_oapi.api.application.v6 import GetApplicationRequest as _GAR
from lark_oapi.api.im.v1 import (CreateFileRequest as _CFR, ...)
...
lark = _lark
GetApplicationRequest = _GAR
CreateFileRequest = _CFR
# ...etc
FEISHU_AVAILABLE = True

(Or factor the imports into a helper that returns a dict and globals().update(...).)
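
A rough sketch of that helper-plus-update shape, under stated assumptions: the stdlib json module stands in for lark_oapi and two of its classes stand in for the request types, so the example runs without Feishu dependencies. The function names _import_names and rebind are hypothetical.

```python
import importlib
from typing import Any


def _import_names() -> dict[str, Any]:
    """Import everything in one place and return a name -> object mapping.

    Stand-ins (assumptions for the demo): `json` plays lark_oapi, and
    JSONEncoder / JSONDecoder play the request classes.
    """
    mod = importlib.import_module("json")
    return {
        "lark": mod,
        "CreateMessageRequest": mod.JSONEncoder,
        "GetMessageRequest": mod.JSONDecoder,
    }


def rebind(target_globals: dict[str, Any]) -> bool:
    """Rebind every adapter name in target_globals; False if an import fails."""
    try:
        names = _import_names()
    except ImportError:
        target_globals["FEISHU_AVAILABLE"] = False
        return False
    target_globals.update(names)
    target_globals["FEISHU_AVAILABLE"] = True
    return True


# State left behind by the failed module-level import.
ns: dict[str, Any] = {
    "lark": None,
    "CreateMessageRequest": None,
    "GetMessageRequest": None,
    "FEISHU_AVAILABLE": False,
}
ok = rebind(ns)
still_none = [name for name, value in ns.items() if value is None]
```

Inside the adapter module the call would be rebind(globals()), and the same _import_names() could also back the module-level try block so the two lists of names cannot drift apart.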

3. Matrix — mautrix.types enums stay as stubs

gateway/platforms/matrix.py lines 42–94 import a set of types from mautrix.types (EventType, RoomID, EventID, ContentURI, SyncToken, UserID, PaginationDirection, PresenceState, RoomCreatePreset, TrustState). On ImportError they're bound to stub strings/classes. The new check_matrix_requirements() (lines 226–270) verifies with import mautrix # noqa but does not rebind any of those names.

MatrixAdapter._run calls client.add_event_handler(EventType.ROOM_MESSAGE, ...) (line 814) — after lazy-install in the same process, this passes the stub class's "m.room.message" string instead of the real mautrix.types.EventType.ROOM_MESSAGE enum member. Whether this functions depends on whether mautrix accepts the string by value, but at minimum:

  • TrustState.UNVERIFIED (line 697): the stub defines UNVERIFIED = 0, while the real value comes from a different code path, so the two can disagree.
  • RoomCreatePreset.PRIVATE (lines 2271–2274): the stub returns the string "private_chat", while mautrix expects its enum.

This is less catastrophic than slack/feishu because some matrix paths re-import locally inside methods (from mautrix.api import HTTPAPI at line 551), but the type-level checks throughout the long-running adapter event loop will misbehave.
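
To illustrate the kind of misbehavior meant here, the sketch below compares a string-valued stub against an enum. It assumes the real type behaves like a stdlib Enum; RealRoomCreatePreset is a hypothetical stand-in and not mautrix's actual class, which may accept strings more leniently.

```python
from enum import Enum


class StubRoomCreatePreset:
    """Shape of a fallback stub: a plain class with string attributes."""
    PRIVATE = "private_chat"


class RealRoomCreatePreset(Enum):
    """Hypothetical stand-in for the library type (assumption, not mautrix code)."""
    PRIVATE = "private_chat"


def describe(preset: object) -> str:
    # Library-style code that branches on the type, not on the raw value.
    if isinstance(preset, RealRoomCreatePreset):
        return f"enum:{preset.value}"
    return f"other:{type(preset).__name__}"


from_stub = describe(StubRoomCreatePreset.PRIVATE)
from_real = describe(RealRoomCreatePreset.PRIVATE)

# Same underlying string, yet a plain Enum member never equals a bare str.
equal_by_value = StubRoomCreatePreset.PRIVATE == RealRoomCreatePreset.PRIVATE
```

Under that assumption, any isinstance check or equality comparison against the real type treats the stub value as foreign, even though the string matches.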

Fix

Same shape as Feishu — declare and rebind every imported name from mautrix.types after the lazy-install succeeds.

4. (Bonus) Refactor opportunity

The six check_*_requirements() functions have grown into copy-paste variants of the same lazy-install template, and this issue is the direct consequence of "remembered to rebind name X but forgot name Y." A helper in tools/lazy_deps.py like:

from typing import Any, Callable

def ensure_and_bind(key: str, importer: Callable[[], dict[str, Any]],
                   target_globals: dict, *, prompt: bool = False) -> bool:
    """ensure(key); then run importer() → dict of name→value;
    target_globals.update(dict); return True on success."""

would centralize the pattern and make "did you remember every name" a one-line check rather than a per-platform audit.
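
One possible body for that helper, as a sketch only. Several things here are assumptions not confirmed by the source: the extra ensure parameter (added so the example runs without tools.lazy_deps), the call shape ensure(key, prompt=prompt), and the choice to treat ImportError or a None value as failure. The stdlib json module again stands in for the real dependency.

```python
from typing import Any, Callable, Optional


def ensure_and_bind(
    key: str,
    importer: Callable[[], dict[str, Any]],
    target_globals: dict[str, Any],
    *,
    prompt: bool = False,
    ensure: Optional[Callable[..., Any]] = None,  # hypothetical injection point
) -> bool:
    """Install deps for `key`, import names, and bind them into target_globals."""
    try:
        if ensure is not None:
            ensure(key, prompt=prompt)  # assumed signature
        names = importer()
    except ImportError:
        return False
    # Refuse to bind a partial result: a None value means a name was forgotten.
    if any(value is None for value in names.values()):
        return False
    target_globals.update(names)
    return True


calls: list[tuple[str, bool]] = []


def fake_ensure(key: str, prompt: bool = False) -> None:
    calls.append((key, prompt))  # records the call instead of installing


def good_importer() -> dict[str, Any]:
    import json as _aiohttp  # stand-in for "import aiohttp as _aiohttp"
    return {"aiohttp": _aiohttp, "AsyncApp": object}


def broken_importer() -> dict[str, Any]:
    import nonexistent_pkg_for_demo  # raises ModuleNotFoundError
    return {}


ns: dict[str, Any] = {"aiohttp": None, "AsyncApp": None}
bound = ensure_and_bind("platform.slack", good_importer, ns, ensure=fake_ensure)

ns2: dict[str, Any] = {"aiohttp": None}
not_bound = ensure_and_bind("platform.slack", broken_importer, ns2, ensure=fake_ensure)
```

With this shape, each platform's check function reduces to one call with globals() as the target, and the list of names lives only in its importer.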

Suggested test

tests/gateway/test_lazy_install_paths.py (new). For each of slack/feishu/matrix/dingtalk:

  1. Pre-poison the module's globals to None / undefined for the names the adapter uses.
  2. Monkeypatch tools.lazy_deps.ensure to a no-op (or have it actually run in CI).
  3. Call check_*_requirements().
  4. Assert: every name listed in the test is bound to a non-None value in the module's globals.

This would have caught the slack aiohttp and feishu lark gaps before merge, and gives a clean contract for any future platform added to LAZY_DEPS.
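
The four steps could look roughly like the sketch below. It runs against a fake adapter module built from a source string, so nothing here touches the real gateway code: ADAPTER_SOURCE, make_fake_adapter and unbound_names are hypothetical, and json stands in for aiohttp. In the real test the module would be the actual adapter and step 2 would monkeypatch tools.lazy_deps.ensure.

```python
import types

# Fake adapter reproducing the pattern: failed top-level import, then a
# check function that rebinds every name it is supposed to.
ADAPTER_SOURCE = '''
try:
    import nonexistent_pkg_for_demo
    import json as aiohttp
except ImportError:
    AsyncApp = None

def check_requirements():
    global AsyncApp, aiohttp
    import json as _aiohttp   # stand-in for the real dependency
    AsyncApp = object
    aiohttp = _aiohttp
    return True
'''


def make_fake_adapter() -> types.ModuleType:
    mod = types.ModuleType("fake_adapter")
    exec(ADAPTER_SOURCE, mod.__dict__)  # functions get mod.__dict__ as globals
    return mod


def unbound_names(mod: types.ModuleType, names: list[str]) -> list[str]:
    """Names that are missing from the module or still bound to None."""
    return [n for n in names if getattr(mod, n, None) is None]


def test_check_rebinds_every_name() -> None:
    mod = make_fake_adapter()
    required = ["AsyncApp", "aiohttp"]
    # Step 1: pre-poison the names the adapter uses.
    for name in required:
        setattr(mod, name, None)
    # Step 2 is skipped here: the fake adapter has no installer call to patch.
    # Step 3: run the check.
    assert mod.check_requirements() is True
    # Step 4: every listed name must now be bound to a non-None value.
    assert unbound_names(mod, required) == []


test_check_rebinds_every_name()
```

Dropping aiohttp from the global statement in the fake source makes step 4 fail, which is the Slack regression this issue describes.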

Repro priority

  1. Slack — guaranteed NameError on first file upload after lazy-install, no Docker-restart workaround inside the same process.
  2. Feishu — guaranteed TypeError on FeishuAdapter instantiation after lazy-install (the adapter is built immediately after check_feishu_requirements() returns True, so this fires before any user message is processed).
  3. Matrix — partial; depends on which code paths the deployment exercises.

DingTalk is unaffected — its rebind block is correct and serves as the reference.
