[DO NOT MERGE] Draft PR: improving hermes middleware and telemetry with NeMo-Flow#28611

Draft
bbednarski9 wants to merge 4 commits into NousResearch:main from bbednarski9:bbednarski/nmf-124-hermes-agent-nemoflow-refactor

@bbednarski9 bbednarski9 commented May 19, 2026

DRAFT PR - EXPERIMENTAL - DO NOT MERGE

What does this PR do?

This PR adds first-party NeMo-Flow telemetry support to Hermes as a host-owned
bridge on top of the existing Hermes middleware hook bus.

Hermes already exposes useful native middleware hooks, and the Langfuse plugin
proved that those hooks can support observability without patching the agent
loop. The gap is that Hermes has no canonical telemetry service that can
translate those native hooks into a stable, reusable telemetry stream for
ATOF, ATIF, OpenTelemetry, OpenInference, Langfuse, and future observer
plugins.

This change makes NeMo-Flow a required telemetry runtime dependency and adds a
non-invasive bridge that records Hermes lifecycle hooks into NeMo-Flow spans,
scopes, and events. The bridge is not implemented as a Hermes plugin, so plugin
callbacks continue to work through the existing PluginManager.invoke_hook()
path. This preserves the current Langfuse integration while enabling NeMo-Flow
as the canonical telemetry layer that future plugins can build on.
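The ordering described above (the bridge records a hook first, then plugin callbacks run and their return values pass through unchanged) can be sketched as follows. This is a minimal illustration, not the code in this PR: `RecordingBridge`, `HookDispatcher`, and the `pre_tool_call` hook name are hypothetical stand-ins, and swallowing bridge errors is an assumption about the desired failure mode.

```python
from typing import Any, Callable


class RecordingBridge:
    """Hypothetical stand-in for the host-owned telemetry bridge."""

    def __init__(self) -> None:
        self.recorded: list[tuple[str, dict[str, Any]]] = []

    def record(self, hook: str, payload: dict[str, Any]) -> None:
        # Copy the payload so later mutation by plugins does not alter the record.
        self.recorded.append((hook, dict(payload)))


class HookDispatcher:
    """Hypothetical dispatcher; it only mimics the shape of an invoke_hook() path."""

    def __init__(self, bridge: RecordingBridge) -> None:
        self._bridge = bridge
        self._callbacks: dict[str, list[Callable[..., Any]]] = {}

    def register(self, hook: str, callback: Callable[..., Any]) -> None:
        self._callbacks.setdefault(hook, []).append(callback)

    def invoke_hook(self, hook: str, **payload: Any) -> list[Any]:
        # 1. Record the hook in the bridge before any plugin sees it.
        try:
            self._bridge.record(hook, payload)
        except Exception:
            # Assumption: a telemetry failure should never break plugin dispatch.
            pass
        # 2. Run plugin callbacks exactly as before and return their results.
        return [callback(**payload) for callback in self._callbacks.get(hook, [])]


bridge = RecordingBridge()
dispatcher = HookDispatcher(bridge)
dispatcher.register("pre_tool_call", lambda **kw: {"allow": True})
results = dispatcher.invoke_hook("pre_tool_call", tool="search")
```

Because the bridge sits in the dispatcher rather than in the callback list, a plugin such as Langfuse would see the same payloads and return values whether or not the bridge is enabled.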

The PR also ships a default plugins.toml for NeMo-Flow observability. The
default initializes the NeMo-Flow observability component but keeps ATOF, ATIF,
OpenTelemetry, and OpenInference exporters disabled, matching Hermes' previous
no-export-by-default behavior. Users can opt into exporter output with explicit
Hermes config/env settings or by supplying/discovering a NeMo-Flow
plugins.toml.

Related Issue

#6642
#6741

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Added nemo-flow==0.2.0rc3 as a required Hermes dependency in pyproject.toml
    and updated uv.lock.
  • Added hermes_cli/nemo_flow_telemetry.py, a host-owned bridge that translates
    Hermes middleware hooks into NeMo-Flow session scopes, turn scopes, LLM spans,
    tool spans, approval events, subagent scopes, ATOF, and ATIF.
  • Added hermes_cli/telemetry.py, a Hermes-facing observer facade for stable
    NeMo-Flow telemetry observers. It no-ops safely until the NeMo-Flow
    telemetry_v1 API is available in a released wheel.
  • Updated hermes_cli/plugins.py so the NeMo-Flow bridge records hook payloads
    before regular plugin callbacks run, while preserving existing callback return
    values and plugin behavior.
  • Updated hermes_cli/config.py with telemetry.nemo_flow defaults for enabling
    the bridge, payload limits, plugins.toml discovery, and direct ATOF/ATIF
    fallback settings.
  • Added bundled hermes_cli/nemo_flow_plugins.toml and included it in package
    data so installed Hermes distributions have a default NeMo-Flow observability
    config.
  • Added startup support for explicit and discovered NeMo-Flow plugins.toml
    files via:
    • HERMES_NEMO_FLOW_PLUGINS_TOML
    • HERMES_NEMO_FLOW_DISCOVER_PLUGINS_TOML
    • telemetry.nemo_flow.plugins_toml_path
    • telemetry.nemo_flow.discover_plugins_toml
  • Preserved Hermes-native exporter behavior:
    • bundled default plugins.toml does not suppress direct Hermes ATOF/ATIF
      fallback settings
    • explicit or discovered user plugins.toml makes NeMo-Flow own exporter setup
      end to end
  • Added tests in tests/hermes_cli/test_nemo_flow_telemetry.py covering:
    • API and tool hook translation into NeMo-Flow spans
    • blocked tool span closure
    • plugin dispatcher return-value preservation
    • public telemetry facade behavior with and without telemetry_v1
    • explicit plugins.toml initialization and cleanup
    • bundled default plugins.toml behavior with manual ATOF fallback
    • direct ATIF export fallback
    • Langfuse plugin hooks running through PluginManager.invoke_hook() while the
      NeMo-Flow bridge records the same hooks
  • Updated the NeMo-Flow/Langfuse planning doc with the default plugins.toml
    startup model and local trajectory validation results.
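The plugins.toml selection rules listed above can be sketched as a small resolver. Only the environment variable names and config keys come from this PR. The precedence order (env path, then config path, then discovery), the `"1"` truthy value, and the `TomlResolution` and `resolve_plugins_toml` names are assumptions for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class TomlResolution:
    """Hypothetical result type: which TOML was chosen and where it came from."""

    path: Optional[Path]
    source: str  # "env", "config", "discovered", or "bundled"

    @property
    def owns_exporters(self) -> bool:
        # A user-supplied or discovered TOML owns exporter setup end to end;
        # the bundled default leaves direct ATOF/ATIF fallback settings active.
        return self.source != "bundled"


def resolve_plugins_toml(
    env: Mapping[str, str],
    nemo_flow_config: Mapping[str, Any],
    discovered: Optional[Path] = None,
) -> TomlResolution:
    """Pick a plugins.toml. The precedence shown here is assumed, not confirmed."""
    env_path = env.get("HERMES_NEMO_FLOW_PLUGINS_TOML")
    if env_path:
        return TomlResolution(Path(env_path), "env")

    config_path = nemo_flow_config.get("plugins_toml_path")
    if config_path:
        return TomlResolution(Path(config_path), "config")

    discover = (
        env.get("HERMES_NEMO_FLOW_DISCOVER_PLUGINS_TOML") == "1"
        or bool(nemo_flow_config.get("discover_plugins_toml"))
    )
    if discover and discovered is not None:
        return TomlResolution(discovered, "discovered")

    # Nothing explicit or discovered: fall back to the bundled default.
    return TomlResolution(None, "bundled")
```

For example, `resolve_plugins_toml({}, {})` falls through to the bundled default and reports `owns_exporters` as false, so the Hermes-native fallback settings would still apply.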

How to Test

  1. Run focused static checks:

    uv run ruff check \
      hermes_cli/nemo_flow_telemetry.py \
      hermes_cli/telemetry.py \
      tests/hermes_cli/test_nemo_flow_telemetry.py \
      tests/plugins/test_langfuse_plugin.py \
      hermes_cli/plugins.py \
      pyproject.toml
    
    python3 -m py_compile \
      hermes_cli/nemo_flow_telemetry.py \
      hermes_cli/telemetry.py \
      tests/hermes_cli/test_nemo_flow_telemetry.py
    
    git diff --check

  2. Run the focused telemetry and plugin test suite:

    uv run pytest -q \
      tests/hermes_cli/test_nemo_flow_telemetry.py \
      tests/hermes_cli/test_plugins.py \
      tests/plugins/test_observer_middleware_contract.py \
      tests/test_model_tools.py \
      tests/plugins/test_langfuse_plugin.py

    Expected result from local validation: 112 passed.

  3. Verify the bundled default config is available through the installed package:

    python3 -c "from importlib.resources import files; p = files('hermes_cli').joinpath('nemo_flow_plugins.toml'); print(p.is_file(), str(p))"

    Expected result: True followed by the resolved nemo_flow_plugins.toml
    path.

  4. Optional trajectory smoke test with a locally built NeMo-Flow 0.3.0 wheel:

    • Enable HERMES_NEMO_FLOW_TELEMETRY=1.
    • Point HERMES_NEMO_FLOW_PLUGINS_TOML at a TOML config that enables ATOF
      and ATIF.
    • Run representative Hermes hook sequences for two sessions covering LLM
      success, LLM error, tool success, approval events, session finalization,
      and exporter flush.
    • Confirm that ATOF JSONL files and ATIF (ATIF-v1.6) trajectory files are emitted.

    Local smoke test result from this branch:

    • nemo_flow_version: 0.3.0
    • telemetry_available: true
    • stable event count: 22
    • schema versions: ["nemo_flow.telemetry.v1"]
    • ATOF event count: 22
    • ATIF trajectory files: 2
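A check like the event count and schema version summary above could be scripted against the emitted JSONL. The sketch below is an assumption about how such a check might look: the `schema_version` field name and the sample events are hypothetical, and the real ATOF record layout may differ.

```python
import json
from typing import Iterable


def summarize_events(lines: Iterable[str]) -> tuple[int, list[str]]:
    """Count JSONL events and collect the distinct schema versions seen."""
    count = 0
    versions: set[str] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        event = json.loads(line)
        count += 1
        # "schema_version" is a hypothetical field name used for illustration.
        versions.add(event.get("schema_version", "unknown"))
    return count, sorted(versions)


# Hypothetical sample records standing in for an exported ATOF JSONL file.
sample_lines = [
    '{"schema_version": "nemo_flow.telemetry.v1", "kind": "scope_start"}',
    "",
    '{"schema_version": "nemo_flow.telemetry.v1", "kind": "scope_end"}',
]
event_count, schema_versions = summarize_events(sample_lines)
```

To run it against a real export, pass an open file handle, since iterating a text file yields one line at a time.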

Checklist

  • Added a bundled default NeMo-Flow plugins.toml.
  • Included the bundled TOML in package data.
  • Preserved existing Hermes plugin callback behavior.
  • Verified the Langfuse plugin remains functional through the default
    NeMo-Flow startup path.
  • Added focused tests for default TOML, explicit TOML, exporter fallback,
    public telemetry facade, and Langfuse compatibility.
  • Ran focused lint, py_compile, whitespace, packaging, and pytest checks.
  • Documented the startup model and validation results.

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform:

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

  • This skill is broadly useful to most users (if bundled) — see Contributing Guide
  • SKILL.md follows the standard format (frontmatter, trigger conditions, steps, pitfalls)
  • No external dependencies that aren't already available (prefer stdlib, curl, existing Hermes tools)
  • I've tested the skill end-to-end: hermes --toolsets skills -q "Use the X skill to do Y"

Screenshots / Logs

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…, while using the new request/response object shape

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…s_cli/nemo_flow_plugins.toml:1. It

  initializes NeMo-Flow observability with ATOF, ATIF, OpenTelemetry, and OpenInference exporters disabled,
  matching Hermes’ prior no-export-by-default behavior.

  Updated the bridge in hermes-agent/hermes_cli/nemo_flow_telemetry.py:370 so it always loads the bundled
  default, then overlays explicit/discovered plugins.toml files. The important behavior is:

  - bundled default alone does not suppress existing Hermes ATOF/ATIF fallback
  - explicit or discovered user TOML does own exporter setup
  - explicit TOML now suppresses both direct ATOF and direct ATIF fallback paths
  - the TOML is included in package data via hermes-agent/pyproject.toml:223
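The "load the bundled default, then overlay explicit or discovered files" step can be pictured as a recursive dictionary merge over the parsed TOML tables. This is a sketch under the assumption that the overlay is a deep merge where later values win. The `overlay` helper and the table names are hypothetical and are not taken from the bridge code.

```python
from typing import Any, Mapping


def overlay(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`.

    Nested tables are merged key by key. Nested tables that `override` does
    not touch are shared with `base` rather than copied.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical parsed configs: the bundled default and a user file enabling ATOF.
bundled_default = {
    "observability": {
        "enabled": True,
        "exporters": {"atof": {"enabled": False}, "atif": {"enabled": False}},
    }
}
user_toml = {"observability": {"exporters": {"atof": {"enabled": True}}}}

effective = overlay(bundled_default, user_toml)
```

Here the user file only flips the ATOF flag, so the ATIF setting and the top-level `enabled` flag carry over from the bundled default.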

  Added tests in hermes-agent/tests/hermes_cli/test_nemo_flow_telemetry.py:472, including a real
  PluginManager.invoke_hook() path using the actual Langfuse plugin hooks with a fake Langfuse client. That
  verifies Langfuse and the NeMo-Flow bridge can run on the same default config path.

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
@alt-glitch added labels on May 19, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/cli (CLI entry point, hermes_cli/, setup wizard), comp/plugins (Plugin system and bundled plugins)
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
