Inference effect: configurable max_tokens / temperature #370

The `Inference.complete` host function hardcodes `max_tokens: 1024` for the Anthropic provider (OpenAI and Moonshot use provider defaults). There is currently no way to override this, either from Vera code or through an environment variable.

## Scope

- Add `VERA_INFERENCE_MAX_TOKENS` env var (integer, applied to all providers)
- Add `VERA_INFERENCE_TEMPERATURE` env var (float, applied to all providers)
- Longer term: a richer `InferenceOptions` struct or additional effect operations
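
As a rough illustration of the two environment variables proposed above, here is a minimal Python sketch of how a host function might read them and merge them into a request payload. The helper names `read_inference_options` and `build_payload`, the payload shape, and the validation rules are all assumptions made for this example and are not taken from the Vera codebase. Only the variable names and the existing Anthropic default of 1024 come from this issue.

```python
import math
import os
from typing import Any, Dict, Mapping


def read_inference_options(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Hypothetical helper: parse the proposed env vars.

    An unset variable means "no override", so the key is simply omitted.
    """
    options: Dict[str, Any] = {}

    raw_max = env.get("VERA_INFERENCE_MAX_TOKENS")
    if raw_max is not None:
        max_tokens = int(raw_max)  # raises ValueError for non-integer input
        if max_tokens <= 0:
            raise ValueError("VERA_INFERENCE_MAX_TOKENS must be a positive integer")
        options["max_tokens"] = max_tokens

    raw_temp = env.get("VERA_INFERENCE_TEMPERATURE")
    if raw_temp is not None:
        temperature = float(raw_temp)  # raises ValueError for non-numeric input
        # Assumed rule: reject NaN, infinity, and negative values.
        if not math.isfinite(temperature) or temperature < 0:
            raise ValueError("VERA_INFERENCE_TEMPERATURE must be a finite, non-negative number")
        options["temperature"] = temperature

    return options


def build_payload(
    provider: str,
    base: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Hypothetical helper: apply defaults first, then env overrides."""
    payload = dict(base)
    if provider == "anthropic":
        # Keep the current hardcoded value as the fallback for Anthropic only.
        payload.setdefault("max_tokens", 1024)
    # Overrides apply to every provider, as the scope list proposes.
    payload.update(read_inference_options(env))
    return payload
```

With neither variable set, this sketch leaves existing behaviour unchanged: Anthropic requests still carry `max_tokens: 1024`, and the other providers fall back to their own defaults. Failing loudly on malformed values is one possible design choice; silently ignoring them would be another.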

## Current workaround

None. Responses to Anthropic calls are capped at 1024 tokens regardless of prompt length or use case.

## Related

Introduced in v0.0.101 as part of [#61](https://github.com/aallan/vera/issues/61).
