[Bug]: custom_providers.models.context_length not propagated to auxiliary compression feasibility check #12977

## Bug Description

When `context_length` is set in `custom_providers.models`, it correctly applies to the **main model** context window, but the **auxiliary compression feasibility check** (`_check_compression_model_feasibility`) does NOT resolve it for the compression model when the compression model falls back to the main model.

This produces an **incorrect warning** about context mismatch, and **auto-lowers the compression threshold** unnecessarily:

```
⚠ Compression model (glm-5.1) context is 128,000 tokens, but the main model's compression threshold was 130,000 tokens. Auto-lowered this session's threshold to 128,000 tokens so compression can run.
```

This happens even though the user has configured `context_length: 200000` for the model.

## Steps to Reproduce

1. Configure `custom_providers` with a model that has a `context_length` override different from the built-in default:

```yaml
model:
  default: glm-5.1
  provider: custom
  base_url: http://localhost:8317/v1
  api_key: sk-xxx

custom_providers:
  - name: Local (localhost:8317)
    base_url: http://localhost:8317/v1
    api_key: sk-xxx
    model: glm-5.1
    models:
      glm-5.1:
        context_length: 200000
```

2. Do NOT set `auxiliary.compression.model` or `auxiliary.compression.context_length` (so compression falls back to the main model).
3. Set `compression.threshold: 0.65` (default).
4. Start Hermes.
5. Observe the warning on startup: the configured 200K context should be recognized, but 128K is reported instead.

## Expected Behavior

When the compression model matches a model defined in `custom_providers.models`, the feasibility check should resolve the `context_length` from `custom_providers` the same way the main model does (lines 1499-1536). No warning should appear since `0.65 × 200,000 = 130,000 < 200,000`.
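The arithmetic behind the warning can be sketched as follows. This is a minimal illustration, not the Hermes implementation: the function name `check_feasibility` and the rule "lower the threshold when the compression model's context is smaller than the threshold" are assumptions inferred from the warning text above.

```python
from typing import Tuple


def check_feasibility(main_context: int, aux_context: int, ratio: float) -> Tuple[int, bool]:
    """Return (effective_threshold_tokens, warned).

    Hypothetical stand-in for the feasibility rule: the threshold is
    ratio * main_context, and it is lowered to aux_context (with a warning)
    when the compression model cannot hold that many tokens.
    """
    threshold = round(main_context * ratio)
    if aux_context < threshold:
        return aux_context, True
    return threshold, False


# Current behavior: aux context falls back to the built-in 128K default.
print(check_feasibility(200_000, 128_000, 0.65))

# Expected behavior: aux context honors the configured 200K override.
print(check_feasibility(200_000, 200_000, 0.65))
```

With a 128K auxiliary context, the 130,000-token threshold does not fit and gets lowered. With the configured 200K, it fits and no warning is needed.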

## Actual Behavior

The auxiliary feasibility check calls `get_model_context_length()` with `config_context_length=None`, falling back to the built-in default (128K for glm-5.1), ignoring the user's `custom_providers.models.glm-5.1.context_length: 200000`. This triggers a false warning and auto-lowers the compression threshold.

## Affected Component

- [x] Agent Core (conversation loop, context compression, memory)
- [x] Configuration (config.yaml, .env, hermes setup)

## Messaging Platform (if gateway-related)

N/A (CLI only)

## Debug Report

Running on self-hosted VPS with CLIProxyAPI as local LLM gateway. Issue reproduced on CLI and Discord gateway.

## Operating System

Ubuntu 24.04 VPS

## Python Version

3.11

## Hermes Version

v0.1.0 (editable install from NousResearch/hermes-agent main branch)

## Root Cause Analysis

In `run_agent.py`, lines ~2080-2085, the feasibility check calls:

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
)
```

This only passes `_aux_compression_context_length_config` (from `auxiliary.compression.context_length` in config), but does **NOT** resolve `custom_providers.models` context_length for the auxiliary model.

Meanwhile, the **main model** (lines 1499-1536) correctly resolves `custom_providers.models` context_length and stores it in `_config_context_length`. The auxiliary path skips this resolution entirely.

When the compression model is the same as the main model (default behavior), `get_model_context_length` receives `config_context_length=None` and falls back to built-in defaults (128K for glm-5.1).
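The divergence between the two paths can be illustrated with a small stand-in. `lookup_context_length` and `BUILTIN_CONTEXT_DEFAULTS` are hypothetical names used only for this sketch. They are not the real `get_model_context_length`, and the lookup order (config override first, then built-in default) is an assumption based on the behavior described above.

```python
from typing import Optional

# Hypothetical built-in table; the real defaults live inside Hermes.
BUILTIN_CONTEXT_DEFAULTS = {"glm-5.1": 128_000}


def lookup_context_length(model: str, config_context_length: Optional[int] = None) -> int:
    """Stand-in lookup: a config override wins, otherwise use the built-in default."""
    if config_context_length:
        return config_context_length
    return BUILTIN_CONTEXT_DEFAULTS[model]


# Main-model path: the custom_providers override was resolved and is passed in.
main_context = lookup_context_length("glm-5.1", config_context_length=200_000)

# Auxiliary path: nothing resolved the override, so None is passed.
aux_context = lookup_context_length("glm-5.1", config_context_length=None)

print(main_context, aux_context)
```

The same model name yields two different context lengths depending only on whether the caller resolved the override first.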

## Proposed Fix

In `_check_compression_model_feasibility`, before calling `get_model_context_length`, resolve the `custom_providers.models` context_length for `aux_model` (mirroring the logic at lines 1499-1536) and pass it as `config_context_length`.

Alternatively, extract the `custom_providers.models` context resolution into a reusable function that both the main model and auxiliary paths call.
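A rough sketch of what such a shared helper might look like is below. The function name `resolve_custom_provider_context_length`, its parameters, and the `base_url` matching rule are assumptions for illustration. The only thing taken from this report is the `custom_providers` config shape shown in the reproduction steps.

```python
from typing import Any, Dict, List, Optional


def resolve_custom_provider_context_length(
    custom_providers: Optional[List[Dict[str, Any]]],
    model: str,
    base_url: Optional[str] = None,
) -> Optional[int]:
    """Return the context_length override for `model`, or None if not configured.

    Hypothetical helper: when `base_url` is given, only providers with the
    same base URL (ignoring a trailing slash) are considered.
    """
    for entry in custom_providers or []:
        if not isinstance(entry, dict):
            continue
        entry_url = (entry.get("base_url") or "").rstrip("/")
        if base_url and entry_url != base_url.rstrip("/"):
            continue
        model_cfg = (entry.get("models") or {}).get(model)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("context_length")
        # Accept only positive integers (bool is an int subclass, so exclude it).
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Config shape copied from the reproduction steps.
providers = [
    {
        "name": "Local (localhost:8317)",
        "base_url": "http://localhost:8317/v1",
        "model": "glm-5.1",
        "models": {"glm-5.1": {"context_length": 200000}},
    }
]

# An explicit auxiliary.compression.context_length (None here) would take precedence.
explicit_aux_override = None
config_context_length = explicit_aux_override or resolve_custom_provider_context_length(
    providers, "glm-5.1", base_url="http://localhost:8317/v1"
)
print(config_context_length)
```

The resolved value would then be passed as `config_context_length` to `get_model_context_length` in the feasibility check, so an explicit `auxiliary.compression.context_length` still wins when set.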

## Are you willing to submit a PR for this?

- [ ] I'd like to fix this myself and submit a PR
