Skip to content

test: hermeticize inference and CLI env-sensitive tests#4

Merged
DIZ-admin merged 2 commits into
mainfrom
fix/hermetic-inference-tests-pr-clean
May 16, 2026
Merged

test: hermeticize inference and CLI env-sensitive tests#4
DIZ-admin merged 2 commits into
mainfrom
fix/hermetic-inference-tests-pr-clean

Conversation

@DIZ-admin

@DIZ-admin DIZ-admin commented May 16, 2026

Copy link
Copy Markdown
Owner

Summary

  • make provider discovery count assertion resilient to new bundled profiles
  • isolate runtime-provider tests from live CUSTOM_BASE_URL/OPENROUTER_BASE_URL env leakage
  • harden gateway and web-server CLI tests against host-specific systemd/PATH state

Validation

  • pytest -q tests/providers tests/hermes_cli/test_runtime_provider_resolution.py tests/hermes_cli/test_ollama_cloud_auth.py
  • pytest -x -q tests/hermes_cli

Notes

This is a fork PR for test isolation only. Changes are minimal and scoped to tests.

Summary by CodeRabbit

  • Tests
    • Improved test isolation by clearing provider-related environment variables to prevent interference from local developer configurations.
    • Enhanced test determinism for systemd unit generation and provider discovery validation.
    • Updated provider profile discovery tests to enforce minimum count validation for more robust verification.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 16, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@DIZ-admin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 42 minutes and 25 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ad7ab07-b717-4918-9614-aca9e59a6106

📥 Commits

Reviewing files that changed from the base of the PR and between 30b5af62847ee8d4d519dae829cec9e2fcf99a84 and 3f310cb.

📒 Files selected for processing (2)
  • tests/conftest.py
  • tests/hermes_cli/test_web_server.py
📝 Walkthrough

Walkthrough

This PR adds environment variable cleanup, deterministic monkeypatching, and assertion relaxation across the test suite to reduce flakiness from local developer overrides, implicit environment resolution, and brittle exact-count assertions.

Changes

Test Suite Hermeticization and Determinism Improvements

Layer / File(s) Summary
Global test fixture environment cleanup
tests/conftest.py
The autouse _hermetic_environment fixture now unsets additional provider base-URL override variables (CUSTOM_BASE_URL, OPENROUTER_BASE_URL, OPENCODE_ZEN_BASE_URL, OPENCODE_GO_BASE_URL) to isolate tests from local developer configuration.
Provider auth environment cleanup in CLI tests
tests/hermes_cli/test_ollama_cloud_auth.py, tests/hermes_cli/test_runtime_provider_resolution.py
Specific CLI tests for Ollama cloud credentials and custom runtime provider resolution now explicitly clean up CUSTOM_BASE_URL and OPENROUTER_BASE_URL before running validation, preventing prior environment state from interfering with credential handling assertions.
Systemd unit tests with deterministic identity and home mocking
tests/hermes_cli/test_gateway_service.py
Two systemd unit generation tests now monkeypatch _system_service_identity and _hermes_home_for_target_user to establish deterministic target-user identity and HERMES_HOME path resolution, removing dependency on implicit home directory environment resolution.
Test isolation through monkeypatching and assertion relaxation
tests/hermes_cli/test_web_server.py, tests/providers/test_plugin_discovery.py
Wrapper-alias creation test stubs check_alias_collision to no-op to bypass collision detection; provider discovery test relaxes exact profile count assertion from 33 to at least 34 to accommodate registration variations.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • DIZ-admin/hermes-agent#3: Overlapping test hermeticization changes including environment variable cleanup in conftest.py and specific CLI tests, systemd identity/home mocking, and provider discovery count relaxation.

Poem

🐰 Hermetic tests now run serene,
No env vars lurking unseen,
Mocked identities, assertions loose,
Flaky failures now on the moose! 🫖

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test: hermeticize inference and CLI env-sensitive tests' directly and specifically describes the main objective of the PR: making test cases hermetic by isolating them from environment variables and host-specific dependencies.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/hermetic-inference-tests-pr-clean

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/conftest.py (1)

356-359: ⚡ Quick win

Remove redundant OPENROUTER_BASE_URL deletion.

OPENROUTER_BASE_URL is already listed in _CREDENTIAL_NAMES (line 145) and will be removed by the credential filter loop (lines 302-304). The explicit deletion here is redundant and contradicts the comment stating these are env vars "that don't match the generic credential-shaped env-var filter above."

The other three variables (CUSTOM_BASE_URL, OPENCODE_ZEN_BASE_URL, OPENCODE_GO_BASE_URL) are correctly placed here since they're not in _CREDENTIAL_NAMES.

♻️ Remove redundant line
 monkeypatch.delenv("CUSTOM_BASE_URL", raising=False)
-monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False)
 monkeypatch.delenv("OPENCODE_ZEN_BASE_URL", raising=False)
 monkeypatch.delenv("OPENCODE_GO_BASE_URL", raising=False)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/conftest.py` around lines 356 - 359, The explicit call to
monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False) is redundant because
OPENROUTER_BASE_URL is included in the _CREDENTIAL_NAMES set and already removed
by the credential filter loop (see the loop that iterates over
_CREDENTIAL_NAMES); remove only the redundant
monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False) line and leave the
other deletions for CUSTOM_BASE_URL, OPENCODE_ZEN_BASE_URL, and
OPENCODE_GO_BASE_URL in place so the comment about non-credential-shaped env
vars remains accurate.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/providers/test_plugin_discovery.py`:
- Around line 49-59: The test_all_profiles_register currently asserts a
hardcoded count (expected_min_profiles) which flakes; instead after calling
_clear_provider_caches() and list_providers() compute discovered names (as you
already do into names) and assert that the set of bundled provider names is a
subset of discovered names—e.g. obtain the canonical bundled list from Providers
(add or call a helper such as providers.get_bundled_provider_names() or
providers.list_bundled_provider_manifests() or read
providers.BUNDLED_PLUGIN_DIR) and replace the len(names) >=
expected_min_profiles assertion with assert
set(bundled_names).issubset(set(names)), keeping references to
test_all_profiles_register, _clear_provider_caches, list_providers and names.

---

Nitpick comments:
In `@tests/conftest.py`:
- Around line 356-359: The explicit call to
monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False) is redundant because
OPENROUTER_BASE_URL is included in the _CREDENTIAL_NAMES set and already removed
by the credential filter loop (see the loop that iterates over
_CREDENTIAL_NAMES); remove only the redundant
monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False) line and leave the
other deletions for CUSTOM_BASE_URL, OPENCODE_ZEN_BASE_URL, and
OPENCODE_GO_BASE_URL in place so the comment about non-credential-shaped env
vars remains accurate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 38e0ef5b-745a-4131-9b80-f41829cef10b

📥 Commits

Reviewing files that changed from the base of the PR and between c128257 and 30b5af62847ee8d4d519dae829cec9e2fcf99a84.

📒 Files selected for processing (6)
  • tests/conftest.py
  • tests/hermes_cli/test_gateway_service.py
  • tests/hermes_cli/test_ollama_cloud_auth.py
  • tests/hermes_cli/test_runtime_provider_resolution.py
  • tests/hermes_cli/test_web_server.py
  • tests/providers/test_plugin_discovery.py

Comment on lines +49 to +59
def test_all_profiles_register():
"""After discovery, the registry must contain at least the expected bundled profiles."""
_clear_provider_caches()
from providers import list_providers

profiles = list_providers()
names = sorted(p.name for p in profiles)
assert len(names) == 33, f"Expected 33 profiles, got {len(names)}: {names}"
expected_min_profiles = 34
assert len(names) >= expected_min_profiles, (
f"Expected at least {expected_min_profiles} profiles, got {len(names)}: {names}"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Replace count-threshold assertion with an invariant over discovered plugin names.

This still behaves like a change-detector test and will flap as provider inventory evolves. Assert that discovered provider names are a superset of bundled plugin directories (or manifest names) instead of enforcing >= 34.

Proposed refactor
 def test_all_profiles_register():
-    """After discovery, the registry must contain at least the expected bundled profiles."""
+    """After discovery, all bundled provider plugins should be registered."""
     _clear_provider_caches()
     from providers import list_providers

     profiles = list_providers()
     names = sorted(p.name for p in profiles)
-    expected_min_profiles = 34
-    assert len(names) >= expected_min_profiles, (
-        f"Expected at least {expected_min_profiles} profiles, got {len(names)}: {names}"
-    )
+    plugins_dir = REPO_ROOT / "plugins" / "model-providers"
+    bundled = sorted(
+        p.name for p in plugins_dir.iterdir()
+        if p.is_dir() and (p / "plugin.yaml").exists() and (p / "__init__.py").exists()
+    )
+    missing = sorted(set(bundled) - set(names))
+    assert not missing, f"Bundled providers missing from registry: {missing}"

     # Spot-check representative providers from different categories
     for required in (

As per coding guidelines, “DO NOT write change-detector tests (fail whenever expected-to-change data updates: model catalogs, config version, enumeration counts, hardcoded lists); … write invariant tests instead.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/providers/test_plugin_discovery.py` around lines 49 - 59, The
test_all_profiles_register currently asserts a hardcoded count
(expected_min_profiles) which flakes; instead after calling
_clear_provider_caches() and list_providers() compute discovered names (as you
already do into names) and assert that the set of bundled provider names is a
subset of discovered names—e.g. obtain the canonical bundled list from Providers
(add or call a helper such as providers.get_bundled_provider_names() or
providers.list_bundled_provider_manifests() or read
providers.BUNDLED_PLUGIN_DIR) and replace the len(names) >=
expected_min_profiles assertion with assert
set(bundled_names).issubset(set(names)), keeping references to
test_all_profiles_register, _clear_provider_caches, list_providers and names.

@DIZ-admin DIZ-admin force-pushed the fix/hermetic-inference-tests-pr-clean branch from 4d7758d to 3f310cb Compare May 16, 2026 19:30
@DIZ-admin DIZ-admin merged commit 2237893 into main May 16, 2026
4 checks passed
@DIZ-admin DIZ-admin deleted the fix/hermetic-inference-tests-pr-clean branch May 16, 2026 19:34
DIZ-admin added a commit that referenced this pull request May 16, 2026
* test: hermeticize web server wrapper alias check

* test: remove redundant openrouter env cleanup

---------

Co-authored-by: Hermes Agent <hermes-agent@users.noreply.github.com>
DIZ-admin pushed a commit that referenced this pull request May 26, 2026
…y prefixed

Companion to the NousResearchGH-25255 incoming-strip fix from @hayka-pacha. Without
this, build_anthropic_kwargs unconditionally added 'mcp_' to every tool
name in step 3, so a native MCP server tool registered as
'mcp_composio_X' was sent as 'mcp_mcp_composio_X' on the wire. The
incoming strip only removes ONE prefix, which still worked on first
call, but on subsequent calls the model pattern-matched the
single-prefixed form from message history and produced names that
stripped to 'composio_X' — registry miss, dispatch fail.

The history-rewrite block (#4) already has this guard. Apply the same
guard to the schema-rewrite block (#3) so round-trip is symmetric.

Added 4 outgoing-side tests. Existing 7 incoming-side tests still pass.

Author map: hayka-pacha added for PR NousResearch#25270 salvage attribution.

Refs NousResearchGH-25255.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants