[DO NOT MERGE] Draft PR: improving hermes middleware and telemetry with NeMo-Flow #28611
bbednarski9 wants to merge 4 commits into
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…, while using the new request/response object shape Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…s_cli/nemo_flow_plugins.toml:1. It initializes NeMo-Flow observability with ATOF, ATIF, OpenTelemetry, and OpenInference exporters disabled, matching Hermes’ prior no-export-by-default behavior.
Updated the bridge in hermes-agent/hermes_cli/nemo_flow_telemetry.py:370 so it always loads the bundled default, then overlays explicit/discovered plugins.toml files. The important behavior is:
 - bundled default alone does not suppress existing Hermes ATOF/ATIF fallback
 - explicit or discovered user TOML does own exporter setup
 - explicit TOML now suppresses both direct ATOF and direct ATIF fallback paths
 - the TOML is included in package data via hermes-agent/pyproject.toml:223
Added tests in hermes-agent/tests/hermes_cli/test_nemo_flow_telemetry.py:472, including a real PluginManager.invoke_hook() path using the actual Langfuse plugin hooks with a fake Langfuse client. That verifies Langfuse and the NeMo-Flow bridge can run on the same default config path.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
This was referenced May 21, 2026
DRAFT PR - EXPERIMENTAL - DO NOT MERGE
What does this PR do?
This PR adds first-party NeMo-Flow telemetry support to Hermes as a host-owned
bridge on top of the existing Hermes middleware hook bus.
Hermes already exposes useful native middleware hooks, and the Langfuse plugin
proved that those hooks can support observability without patching the agent
loop. The gap is that Hermes did not have a canonical telemetry service that
could translate those native hooks into a stable, reusable telemetry stream for
ATOF, ATIF, OpenTelemetry, OpenInference, Langfuse, and future observer
plugins.
This change makes NeMo-Flow the required telemetry runtime dependency and adds a
non-invasive bridge that records Hermes lifecycle hooks into NeMo-Flow spans,
scopes, and events. The bridge is not implemented as a Hermes plugin, so plugin
callbacks continue to work through the existing
PluginManager.invoke_hook() path. This preserves the current Langfuse
integration while enabling NeMo-Flow as the canonical telemetry layer that
future plugins can build on.
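The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the Hermes implementation: `HookBus`, `TelemetryBridge`, the `pre_tool_call` hook name, and the choice to swallow bridge errors are all assumptions made for the example. Only the idea of a host-owned recorder sitting in front of `invoke_hook()` comes from the description.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

HookCallback = Callable[..., Any]


class TelemetryBridge:
    """Hypothetical stand-in for the host-owned bridge; it only records payloads."""

    def __init__(self) -> None:
        self.recorded: List[Tuple[str, Dict[str, Any]]] = []

    def record(self, hook_name: str, payload: Dict[str, Any]) -> None:
        # Copy the payload so later plugin mutations do not alter the record.
        self.recorded.append((hook_name, dict(payload)))


class HookBus:
    """Hypothetical hook bus; NOT the real PluginManager."""

    def __init__(self, bridge: Optional[TelemetryBridge] = None) -> None:
        self._bridge = bridge
        self._callbacks: Dict[str, List[HookCallback]] = {}

    def register(self, hook_name: str, callback: HookCallback) -> None:
        self._callbacks.setdefault(hook_name, []).append(callback)

    def invoke_hook(self, hook_name: str, **payload: Any) -> List[Any]:
        # The bridge sees the payload first because it is owned by the host,
        # not registered as a plugin callback.
        if self._bridge is not None:
            try:
                self._bridge.record(hook_name, payload)
            except Exception:
                # Assumption: a telemetry failure should not break dispatch.
                pass
        # Plugin callbacks run afterwards; their return values pass through.
        return [cb(**payload) for cb in self._callbacks.get(hook_name, [])]


bridge = TelemetryBridge()
bus = HookBus(bridge)
# A fake observer plugin standing in for something like the Langfuse hooks.
bus.register("pre_tool_call", lambda **kw: f"observer saw {kw['tool']}")
results = bus.invoke_hook("pre_tool_call", tool="search")
```

Because the recorder is not a registered callback, removing or reordering plugins in this sketch does not change what gets recorded, and the list returned to the caller contains only plugin return values.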
The PR also ships a default plugins.toml for NeMo-Flow observability. The
default initializes the NeMo-Flow observability component but keeps ATOF, ATIF,
OpenTelemetry, and OpenInference exporters disabled, matching Hermes' previous
no-export-by-default behavior. Users can opt into exporter output with explicit
Hermes config/env settings or by supplying/discovering a NeMo-Flow
plugins.toml.
Related Issue
#6642
#6741
Type of Change
Changes Made
- Added nemo-flow==0.2.0rc3 as a required Hermes dependency in pyproject.toml and updated uv.lock.
- Added hermes_cli/nemo_flow_telemetry.py, a host-owned bridge that translates Hermes middleware hooks into NeMo-Flow session scopes, turn scopes, LLM spans, tool spans, approval events, subagent scopes, ATOF, and ATIF.
- Added hermes_cli/telemetry.py, a Hermes-facing observer facade for stable NeMo-Flow telemetry observers. It no-ops safely until the NeMo-Flow telemetry_v1 API is available in a released wheel.
- Updated hermes_cli/plugins.py so the NeMo-Flow bridge records hook payloads before regular plugin callbacks run, while preserving existing callback return values and plugin behavior.
- Extended hermes_cli/config.py with telemetry.nemo_flow defaults for enabling the bridge, payload limits, plugins.toml discovery, and direct ATOF/ATIF fallback settings.
- Added hermes_cli/nemo_flow_plugins.toml and included it in package data so installed Hermes distributions have a default NeMo-Flow observability config.
- Support for explicit or discovered plugins.toml files via:
  - HERMES_NEMO_FLOW_PLUGINS_TOML
  - HERMES_NEMO_FLOW_DISCOVER_PLUGINS_TOML
  - telemetry.nemo_flow.plugins_toml_path
  - telemetry.nemo_flow.discover_plugins_toml
- The bundled default plugins.toml does not suppress direct Hermes ATOF/ATIF fallback settings.
- An explicit or discovered plugins.toml makes NeMo-Flow own exporter setup end to end.
- Added tests/hermes_cli/test_nemo_flow_telemetry.py covering:
  - telemetry_v1
  - plugins.toml initialization and cleanup
  - plugins.toml behavior with manual ATOF fallback
  - Langfuse plugin hooks running through PluginManager.invoke_hook() while the NeMo-Flow bridge records the same hooks
- plugins.toml startup model and local trajectory validation results.
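The layering rules in the list above (the bundled default always loads; a user file, when present, takes over exporter setup) can be sketched as a small planning function. This is a hedged illustration only. The environment variable names and config keys are taken from the list, but `plan_plugins_toml`, `TomlPlan`, the truthy-string parsing, and the precedence of environment over config and of explicit over discovered are assumptions, not confirmed behavior of the bridge.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Mapping, Optional


@dataclass
class TomlPlan:
    layers: List[Path]          # load order: bundled default first, user file last
    user_owns_exporters: bool   # True means direct ATOF/ATIF fallback is skipped


def _truthy(value: Optional[str]) -> bool:
    # Assumption: common truthy spellings; the real parser may differ.
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def plan_plugins_toml(
    bundled: Path,
    env: Mapping[str, str],
    config: Mapping[str, object],   # the telemetry.nemo_flow section
    discovered: Optional[Path] = None,
) -> TomlPlan:
    layers = [bundled]  # the bundled default is always loaded

    # Assumption: the environment variable wins over the config key.
    explicit = env.get("HERMES_NEMO_FLOW_PLUGINS_TOML") or config.get("plugins_toml_path")
    discover = _truthy(env.get("HERMES_NEMO_FLOW_DISCOVER_PLUGINS_TOML")) or bool(
        config.get("discover_plugins_toml")
    )

    user_file: Optional[Path] = None
    if explicit:
        user_file = Path(str(explicit))
    elif discover and discovered is not None:
        user_file = discovered

    if user_file is not None:
        layers.append(user_file)  # overlay on top of the bundled default

    # Only a user-supplied file hands exporter ownership to NeMo-Flow.
    return TomlPlan(layers=layers, user_owns_exporters=user_file is not None)


bundled = Path("nemo_flow_plugins.toml")
default_plan = plan_plugins_toml(bundled, env={}, config={})
explicit_plan = plan_plugins_toml(
    bundled, env={"HERMES_NEMO_FLOW_PLUGINS_TOML": "/tmp/plugins.toml"}, config={}
)
```

In this sketch `default_plan.user_owns_exporters` stays false, which corresponds to the bundled default leaving the direct Hermes ATOF/ATIF fallback settings in effect.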
How to Test
Run focused static checks:
Run the focused telemetry and plugin test suite:
Expected result from local validation:
112 passed.
Verify the bundled default config is available through the installed package:
python3 -c "from importlib.resources import files; p = files('hermes_cli').joinpath('nemo_flow_plugins.toml'); print(p.is_file(), str(p))"
Expected result:
True followed by the resolved nemo_flow_plugins.toml path.
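The check above depends on importlib.resources finding a data file shipped inside an installed package. A minimal sketch of how a loader might read such a file, written against a throwaway package so it runs without Hermes installed: `demo_pkg`, `read_bundled_text`, and the placeholder file contents are hypothetical and say nothing about the real TOML schema. It assumes Python 3.9 or newer for `importlib.resources.files`.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path
from typing import Optional


def read_bundled_text(package: str, name: str) -> Optional[str]:
    """Return the text of a data file shipped in `package`, or None if absent."""
    resource = files(package).joinpath(name)
    if not resource.is_file():
        return None
    return resource.read_text(encoding="utf-8")


# Build a throwaway package on disk (a hypothetical stand-in for hermes_cli).
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("", encoding="utf-8")
(pkg_dir / "nemo_flow_plugins.toml").write_text(
    "# placeholder, not the real schema\n", encoding="utf-8"
)

# Make the new package importable in this process.
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()

found = read_bundled_text("demo_pkg", "nemo_flow_plugins.toml")
missing = read_bundled_text("demo_pkg", "absent.toml")
```

Returning None for a missing file lets a caller decide whether an absent bundled default is an error or simply means no default observability config.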
Optional trajectory smoke with a locally built NeMo-Flow 0.3.0 wheel:
- Set HERMES_NEMO_FLOW_TELEMETRY=1.
- Point HERMES_NEMO_FLOW_PLUGINS_TOML at a TOML config that enables ATOF and ATIF.
- success, LLM error, tool success, approval events, session finalization, and exporter flush.
- ATIF-v1.6 trajectory files are emitted.
Local smoke result from this branch:
- nemo_flow_version: 0.3.0
- telemetry_available: true
- 22["nemo_flow.telemetry.v1"]222
Checklist
plugins.toml.
NeMo-Flow startup path.
public telemetry facade, and Langfuse compatibility.
Code
fix(scope):, feat(scope):, etc.)
pytest tests/ -q and all tests pass
Documentation & Housekeeping
docs/, docstrings), or N/A
cli-config.yaml.example if I added/changed config keys, or N/A
CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A
For New Skills
hermes --toolsets skills -q "Use the X skill to do Y"
Screenshots / Logs