[Feature]: Support Feishu/Lark document comment intelligent reply #11465

@liujinkun2025

Problem or Use Case

Hermes currently integrates with Feishu IM via gateway/platforms/feishu.py, but Feishu's document comments — one of the most
common collaboration surfaces inside Lark/Feishu — are not handled at all.

Teams who use Feishu for docs (spec reviews, doc Q&A, inline feedback) have no way to get AI replies on comment threads. The only
workaround is copy/pasting doc context into an IM chat, which loses the inline anchor (quoted paragraph, comment thread history) and
forces a context switch.

Concretely, three scenarios are unsupported today:

  1. User @mentions the bot on a local comment (pinned to a selected paragraph) and asks a question about that paragraph.
  2. User @mentions the bot on a whole-document comment asking about the doc overall.
  3. Multi-turn follow-up within the same doc — e.g. asking "what did you mean by X earlier?" in a different comment thread on the same
    doc.

In our internal usage this is by far the most natural place to invoke an AI assistant inside Feishu — more natural than DMs for
doc-centric work.

Proposed Solution

A new gateway handler for drive.notice.comment_add_v1 events. High-level design:

  1. Event routing — filter by notice_type (only add_comment / add_reply), skip self-authored events, skip events not
    addressed to the bot.
  2. Context assembly — parallel fetch of doc metadata + comment details, build either a local-reply timeline or whole-doc timeline
    (with size caps: 20 local / 12 whole), strip @bot noise, resolve embedded docs_link / wiki links via GET /wiki/v2/spaces/get_node.
  3. Agent — same LLM resolution path as IM messages (_resolve_model_and_runtime); agent has access to feishu_doc_read +
    feishu_drive_* tools for deeper lookup.
  4. Reply delivery — local comments → reply_to_comment (with automatic fallback to a whole-document comment when the API returns
    code 1069302); whole comments → new_comments. Long replies are auto-chunked at 4000 chars. An OK reaction is added while the
    reply is being generated and removed afterwards, as a typing indicator.
  5. Per-doc session memory — key comment-doc:{file_type}:{file_token}, 50-msg cap, 1h TTL, so follow-ups across threads on the
    same doc stay coherent.
  6. Access control — three-tier rule resolution (exact docwiki:{token}* wildcard → top-level → defaults),
    field-by-field fallback for enabled/policy/allow_from, two policy modes:
    • allowlist — only listed open_ids
    • pairing — listed open_ids plus CLI-approved users
    Every user who triggers a reply must be explicitly listed or CLI-approved; there is no implicit allow-all mode. The config
    file is mtime-cached (hot reload, no restart). Ships with a CLI helper:
    python -m gateway.platforms.feishu_comment_rules {status|check|pairing}.
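
To make steps 1 and 4 above concrete, here is a minimal sketch of the routing filter and the reply chunker. It is not the code from the PR: CommentEvent, should_handle and chunk_reply are hypothetical names, the event is assumed to be already normalised into three fields, and "addressed to the bot" is assumed to mean the bot's open_id appears among the mentions. Only the notice types and the 4000-char limit come from the description above.

```python
from dataclasses import dataclass, field

HANDLED_NOTICE_TYPES = {"add_comment", "add_reply"}  # from step 1
MAX_CHUNK_CHARS = 4000                               # from step 4


@dataclass
class CommentEvent:
    """Hypothetical, already-normalised view of a comment event."""
    notice_type: str
    author_open_id: str
    mentioned_open_ids: list[str] = field(default_factory=list)


def should_handle(event: CommentEvent, bot_open_id: str) -> bool:
    """Return True only for new comments/replies from others that mention the bot."""
    if event.notice_type not in HANDLED_NOTICE_TYPES:
        return False
    if event.author_open_id == bot_open_id:  # skip self-authored events
        return False
    # Assumption: "addressed to the bot" means an explicit @mention.
    return bot_open_id in event.mentioned_open_ids


def chunk_reply(text: str, limit: int = MAX_CHUNK_CHARS) -> list[str]:
    """Split text into pieces of at most `limit` chars, preferring newline boundaries."""
    chunks = []
    rest = text
    while len(rest) > limit:
        cut = rest.rfind("\n", 0, limit + 1)
        if cut <= 0:  # no usable newline: hard cut at the limit
            cut = limit
        chunks.append(rest[:cut])
        rest = rest[cut:].lstrip("\n")
    if rest:
        chunks.append(rest)
    return chunks
```

Cutting at newlines where possible keeps paragraphs intact across the posted replies; a hard cut is only used when a single line exceeds the limit.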

New files:

  • tools/feishu_doc_tool.py (1 tool)
  • tools/feishu_drive_tool.py (4 tools)
  • gateway/platforms/feishu_comment.py (handler + prompt + orchestration)
  • gateway/platforms/feishu_comment_rules.py (access control + CLI)
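
As an illustration of the field-by-field fallback that feishu_comment_rules.py is described as performing, the sketch below resolves enabled/policy/allow_from across three tiers and then applies the two policy modes. It is a sketch under assumptions, not the PR's implementation: resolve_rule, is_allowed and the DEFAULTS values are hypothetical, and the lookup that selects the document-specific rule from the config file is deliberately left out.

```python
from typing import Optional

FIELDS = ("enabled", "policy", "allow_from")
# Hypothetical defaults: disabled and empty, so nothing is allowed implicitly.
DEFAULTS = {"enabled": False, "policy": "allowlist", "allow_from": []}


def resolve_rule(doc_rule: Optional[dict], top_level: dict,
                 defaults: dict = DEFAULTS) -> dict:
    """Pick each field from the first tier that defines it.

    Tiers are checked in order: document-specific rule, top-level config, defaults.
    """
    tiers = [doc_rule or {}, top_level, defaults]
    resolved = {}
    for name in FIELDS:
        for tier in tiers:
            if name in tier:
                resolved[name] = tier[name]
                break
    return resolved


def is_allowed(rule: dict, open_id: str,
               paired: frozenset = frozenset()) -> bool:
    """Apply the policy: allowlist = listed only; pairing = listed or CLI-approved."""
    if not rule["enabled"]:
        return False
    if open_id in rule["allow_from"]:
        return True
    return rule["policy"] == "pairing" and open_id in paired
```

For example, a document rule that only sets policy to "pairing" would still inherit enabled and allow_from from the top-level config, which is the point of resolving per field rather than per rule.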

Required Feishu app scopes (tenant-level):
drive:drive.metadata:readonly, docs:document.comment:read, docs:document.comment:create, docs:document.comment:write_only,
docx:document:readonly.

⚠️ A PR implementing this is already open: #11023 — would love a review on the approach.

Alternatives Considered

  1. Route everything through Feishu IM — loses the inline anchor (which paragraph is the user asking about?) and forces the user
    to re-paste context.
  2. Build this as an external skill instead of a bundled gateway handler — rejected because comment handling is driven by platform
    webhook events, not user-invoked skills. Per CONTRIBUTING.md, gateway/platform integrations belong in the core repo; this is
    analogous to existing feishu.py, slack.py, discord.py handlers.
  3. Per-user session memory instead of per-doc — rejected because multiple users commonly comment on the same doc and expect
    shared context within that doc.
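
The per-doc choice in item 3 is easiest to see in code. The sketch below keys history on the document rather than the user, using the key format, 50-message cap and 1h TTL from the proposal. DocSessionStore and session_key are hypothetical names, the store is in-memory only, and the TTL is assumed to be measured from the last write; the PR may differ on all three.

```python
import time
from collections import deque

MAX_MESSAGES = 50    # per-doc cap from the proposal
TTL_SECONDS = 3600   # 1h TTL from the proposal


def session_key(file_type: str, file_token: str) -> str:
    """Build the per-doc key; every user and thread on a doc shares it."""
    return f"comment-doc:{file_type}:{file_token}"


class DocSessionStore:
    """In-memory per-doc history with a message cap and an idle TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so expiry can be tested
        self._sessions = {}    # key -> (last_write_time, deque of messages)

    def _live(self, key):
        """Return the entry for key, dropping it first if it has expired."""
        entry = self._sessions.get(key)
        if entry is not None and self._clock() - entry[0] > TTL_SECONDS:
            del self._sessions[key]
            entry = None
        return entry

    def append(self, key: str, role: str, content: str) -> None:
        entry = self._live(key)
        # deque(maxlen=...) silently discards the oldest message when full.
        messages = entry[1] if entry else deque(maxlen=MAX_MESSAGES)
        messages.append({"role": role, "content": content})
        self._sessions[key] = (self._clock(), messages)

    def history(self, key: str) -> list:
        entry = self._live(key)
        return list(entry[1]) if entry else []
```

Because the key contains no user or comment id, a follow-up in a different thread on the same doc sees the earlier exchange, which is the behaviour scenario 3 in the problem statement asks for.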

Feature Type

Gateway / messaging improvement

Scope

Large (new module or significant refactor)

Contribution

  • I'd like to implement this myself and submit a PR
