The Inference.complete host function hardcodes max_tokens: 1024 for the Anthropic provider (OpenAI and Moonshot use provider defaults). Users have no way to override this from Vera code or via environment variable.
Scope
- Add
VERA_INFERENCE_MAX_TOKENS env var (integer, applied to all providers)
- Add
VERA_INFERENCE_TEMPERATURE env var (float, applied to all providers)
- Longer term: a richer
InferenceOptions struct or additional effect operations
Current workaround
None — responses are capped at 1024 tokens for Anthropic calls regardless of prompt length or use case.
Related
Introduced in v0.0.101 as part of #61.
The
Inference.completehost function hardcodesmax_tokens: 1024for the Anthropic provider (OpenAI and Moonshot use provider defaults). Users have no way to override this from Vera code or via environment variable.Scope
VERA_INFERENCE_MAX_TOKENSenv var (integer, applied to all providers)VERA_INFERENCE_TEMPERATUREenv var (float, applied to all providers)InferenceOptionsstruct or additional effect operationsCurrent workaround
None — responses are capped at 1024 tokens for Anthropic calls regardless of prompt length or use case.
Related
Introduced in v0.0.101 as part of #61.