feat(observability): observer-grade telemetry hooks + NeMo-Relay plugin (salvage #29722) #38190
Closed
kshitijk4poor wants to merge 3 commits into
Adds backend-neutral observer hooks for plugins: session, turn, API request, tool, approval, and subagent lifecycle events with stable correlation IDs (session_id, task_id, turn_id, api_request_id, tool_call_id, parent/child subagent ids). Extends VALID_HOOKS with api_request_error and subagent_start.

Hot path is zero-cost when no plugin subscribes: has_hook()/presence checks gate all payload construction, request payloads are returned by reference when no middleware rewrites them, and the sanitized response payload no longer embeds raw response objects.

Bundles the optional NeMo-Relay observability plugin (plugins/observability/nemo_relay) as an in-repo consumer of the new hooks, peer to the existing langfuse plugin. Fails open when the optional nemo-relay package is not installed.

Authored-by: Bryan Bednarski <bbednarski@nvidia.com>
Salvaged from NousResearch#29722 onto current main.
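For readers who have not seen the diff, the gating idea in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the registry layout, register_hook(), build_payload(), on_api_error(), the "schema_version" key, and the SCHEMA_VERSION value are all hypothetical; only has_hook, invoke_hook, VALID_HOOKS, and the hook names come from the text above.

```python
from typing import Any, Callable, Dict, List

# Hook names taken from the PR text; the real VALID_HOOKS set is larger.
VALID_HOOKS = frozenset({"api_request_error", "subagent_start"})
SCHEMA_VERSION = 1  # hypothetical value, for illustration only

# Hypothetical registry: hook name -> list of subscribed callbacks.
_registry: Dict[str, List[Callable[..., None]]] = {}


def register_hook(name: str, callback: Callable[..., None]) -> None:
    """Subscribe a callback to a known hook (hypothetical helper)."""
    if name not in VALID_HOOKS:
        raise ValueError(f"unknown hook: {name}")
    _registry.setdefault(name, []).append(callback)


def has_hook(name: str) -> bool:
    """Cheap presence check: True only if at least one callback is subscribed."""
    return bool(_registry.get(name))


def invoke_hook(name: str, **payload: Any) -> None:
    """Inject the schema version once, centrally, then fan out to observers."""
    payload.setdefault("schema_version", SCHEMA_VERSION)  # key name is assumed
    for callback in _registry.get(name, []):
        callback(**payload)


def build_payload(request: dict) -> dict:
    """Stand-in for expensive payload construction; counts its own calls."""
    build_payload.calls += 1
    return {"model": request.get("model"), "size": len(str(request))}


build_payload.calls = 0


def on_api_error(request: dict, error: str) -> None:
    """Call site: skip all payload work when nobody is listening."""
    if not has_hook("api_request_error"):
        return
    invoke_hook("api_request_error", request=build_payload(request), error=error)
```

With no observer registered, on_api_error() returns before build_payload() runs, which is the property the commit message calls zero-cost. Putting setdefault inside invoke_hook means a call site that already passes its own version is left untouched.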
The salvaged PR incidentally stripped a trailing blank line from two unrelated test files (test_file_tools_cwd_resolution.py, test_tool_search.py). Restore them to keep the salvage diff scoped to the observability feature.
Salvages #29722 onto current main. Authorship preserved (Bryan Bednarski's commit), with a small follow-up cleanup commit from me.

What this is

Phase 1 of the NeMo telemetry stack, restructured per the #29722 review into an observer-only PR with an immediate in-repo consumer:

- Observer hooks: backend-neutral hooks for session, turn, API request, tool, approval, and subagent lifecycle events, with stable correlation IDs (session_id, task_id, turn_id, api_request_id, tool_call_id, parent/child subagent ids). Extends VALID_HOOKS with api_request_error and subagent_start.
- NeMo-Relay plugin: bundles the optional plugin (plugins/observability/nemo_relay) as a real consumer of the new hooks, peer to the existing langfuse plugin. Fails open when the optional nemo-relay package isn't installed.

Review concerns from #29722: all addressed

- Hot-path cost: payload construction is gated by has_hook() presence checks; request payloads return by reference when no middleware rewrites them; the sanitized response payload no longer embeds raw response objects; redundant deepcopies removed. Contributor benchmark: no-listener path dropped from ~20 ms to ~0.0004 ms at 5 MB context.
- In-repo consumer: ships nemo_relay as the in-repo consumer.
- Schema version: auto-injected in invoke_hook (setdefault), not at ~20 hand-edited call sites.
- post_tool_call dual-emit: hardcoded set replaced with a shared AGENT_RUNTIME_POST_HOOK_TOOL_NAMES frozenset + agent_runtime_owns_post_tool_hook() helper.
- api_request_id format: docs now state it's opaque: "do not parse its string format."

Follow-up commit (mine)

- Restores the trailing blank line in two unrelated test files (test_file_tools_cwd_resolution.py, test_tool_search.py) that the original branch incidentally stripped, to keep the salvage scoped to the feature.

Testing

- tests/run_agent/ green.
- [...] HERMES_HOME): verified that (1) has_hook() returns False with no plugin, so the hot path is skipped; (2) a registered observer fires with turn_id/api_request_id and an auto-injected schema version; (3) nemo_relay register() succeeds and all hooks no-op (don't crash) when the nemo-relay package is absent, in which case _get_runtime() returns None.

Attribution

Cherry-picked from #29722 with Bryan Bednarski's authorship preserved in git log. Requires the AUTHOR_MAP chore PR to merge first so contributor_audit.py passes.

Closes #29722