
Add <Inference> algebraic effect — LLM calls as typed, verified effects (v0.0.101)#374

Merged
aallan merged 4 commits into main from feat/inference-effect
Mar 27, 2026


@aallan aallan commented Mar 27, 2026

Owner

Summary

  • Implements Inference.complete(String → Result<String, String>) as a built-in host-backed algebraic effect, following the Http pattern exactly
  • Multi-provider support: Anthropic (claude-haiku-4-5-20251001), OpenAI (gpt-4o-mini), Moonshot (moonshot-v1-8k) — auto-detected from whichever API key env var is set; override with VERA_INFERENCE_PROVIDER / VERA_INFERENCE_MODEL
  • Browser runtime returns a rich Err explaining why API keys cannot be safely embedded in client-side JavaScript (recommends Http proxy pattern)
  • Zero new dependencies — uses urllib.request + json from Python stdlib
  • Version bump 0.0.100 → 0.0.101; conformance suite 63 → 64; examples 29 → 30
  • Removes broken CodeRabbit badge
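
The provider selection described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in vera/codegen/api.py: the function name `select_provider`, the detection order, and the handling of an unknown provider are assumptions; only the environment variable names and default models come from the summary above.

```python
from typing import Mapping, Optional, Tuple

# Default models as listed in the PR summary.
DEFAULT_MODELS = {
    "anthropic": "claude-haiku-4-5-20251001",
    "openai": "gpt-4o-mini",
    "moonshot": "moonshot-v1-8k",
}

# Assumed detection order; the PR only says "whichever API key env var is set".
KEY_VARS = [
    ("anthropic", "VERA_ANTHROPIC_API_KEY"),
    ("openai", "VERA_OPENAI_API_KEY"),
    ("moonshot", "VERA_MOONSHOT_API_KEY"),
]


def select_provider(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (provider, model), or None when no provider can be chosen."""
    # An explicit override wins over auto-detection.
    provider = env.get("VERA_INFERENCE_PROVIDER", "").strip().lower()
    if not provider:
        for name, key_var in KEY_VARS:
            if env.get(key_var):
                provider = name
                break
    if provider not in DEFAULT_MODELS:
        return None  # hypothetical: the real runtime returns an Err to the guest
    model = env.get("VERA_INFERENCE_MODEL") or DEFAULT_MODELS[provider]
    return provider, model
```

Taking the environment as a plain mapping rather than reading os.environ directly keeps the selection logic easy to exercise in tests.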

New files

  • examples/inference.vera — sentiment classification example
  • tests/conformance/ch09_inference.vera — conformance test (level: check)

Compiler changes

  • vera/environment.py — registers Inference effect with complete op
  • vera/codegen/ — compilability, _inference_ops_used tracking, WASM import emission
  • vera/wasm/ — qualified call translation, type inference
  • vera/codegen/api.py: host_inference_complete + _call_inference_provider() helper
  • vera/browser/runtime.mjs — browser Inference stub

Limitations filed

  • #370: max_tokens hardcoded to 1024 for Anthropic; no temperature override
  • #371: embed operation (vector embeddings) deferred; blocked on #373
  • #372 — no user-defined handle[Inference] blocks
  • #373 — float array host-alloc infrastructure missing

Test plan

  • All 3087 tests pass (pytest tests/ -v)
  • mypy clean (mypy vera/)
  • All 64 conformance programs pass (python scripts/check_conformance.py)
  • All 30 examples pass (python scripts/check_examples.py)
  • All validation scripts pass (pre-commit hooks verified locally)
  • Live smoke test: VERA_ANTHROPIC_API_KEY=sk-ant-... vera run examples/inference.vera

Closes #61

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added built-in Inference effect with complete operation for LLM calls.
    • Support for Anthropic, OpenAI, and Moonshot providers via environment variables.
    • Type-safe Result<String, String> returns for error handling.
    • Browser runtime returns errors (server-side proxy recommended).
  • Documentation

    • Updated specification and guides with Inference effect usage and limitations.
    • Added example inference programs.
  • Tests

    • New conformance test and expanded type-checker/code-generation coverage.
  • Chores

    • Version bumped to 0.0.101.

…ts (v0.0.101)

Implements Inference.complete(String → Result<String, String>) as a built-in
host-backed effect following the Http pattern. Provider auto-detected from
VERA_ANTHROPIC_API_KEY, VERA_OPENAI_API_KEY, or VERA_MOONSHOT_API_KEY; override
with VERA_INFERENCE_PROVIDER and VERA_INFERENCE_MODEL. Browser runtime returns
a rich Err explaining why API keys cannot be safely embedded in client-side JS.

Type system: registers Inference effect in environment.py with complete op.
Codegen: compilability scan, _inference_ops_used tracking through core/functions/
  modules/assembly, WASM import emission, qualified call translation, type inference.
Host runtime: _call_inference_provider() dispatches to Anthropic/OpenAI/Moonshot
  via urllib.request (zero new dependencies). host_inference_complete registered
  with wasmtime linker when inference_ops_used is non-empty.
Tests: TestInferenceChecker (6), TestInferenceCollection (6), conformance test
  ch09_inference.vera (level: check), examples/inference.vera (sentiment classify).
Docs: spec §9.5.5 overhauled, SKILL.md Inference section, AGENTS.md, README.md
  feature list + roadmap, vera/README.md, docs/index.html LLM Integration section
  + features grid entry + browser note, CHANGELOG, ROADMAP #61 struck through.
Version: 0.0.100 → 0.0.101. Conformance: 63 → 64. Examples: 29 → 30.
Limitations filed: #370 (max_tokens), #371 (embed), #372 (user handlers),
  #373 (float array host-alloc infrastructure). Removes CodeRabbit badge.

Closes #61

Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai Bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

This PR implements the <Inference> algebraic effect for LLM inference as a built-in host-backed effect. The compiler now recognises Inference.complete(String) → Result<String, String>, tracks inference operations through the codegen pipeline, emits WASM host imports, and dispatches at runtime to Anthropic/OpenAI/Moonshot APIs via environment variables. Browser runtime returns Err client-side with guidance to use server-side proxying.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Version & Exports | pyproject.toml, vera/__init__.py | Bumped version from 0.0.100 to 0.0.101. |
| Specification & Guidance | spec/00-introduction.md, spec/09-standard-library.md, AGENTS.md, SKILL.md, ROADMAP.md | Marked Inference feature as implemented (v0.0.101) in introduction table. Replaced "Future" Inference placeholder with detailed specification of Inference.complete(String → Result<String, String>) operation, provider selection via environment variables (VERA_ANTHROPIC_API_KEY, VERA_OPENAI_API_KEY, VERA_MOONSHOT_API_KEY), browser restrictions, and limitations (only complete, no embed/streaming/system prompt). Updated AGENTS.md rules to require explicit effects(<Inference>) for LLM calls. Extended SKILL.md with Inference documentation and examples. |
| Project Documentation | README.md, CLAUDE.md, TESTING.md, vera/README.md | Updated feature list to highlight Inference as typed, contract-verifiable, mockable effect. Replaced example research_topic function to compose Http.get with Inference.complete returning Result<String, String>. Incremented test/conformance/example counts: tests 3065→3087, conformance 63→64, examples 29→30. Expanded Inference limitations documentation in vera/README.md (no embed, no token/temperature controls, no user handlers, no float array host-alloc). Updated CHANGELOG.md with v0.0.101 release notes. |
| Test Infrastructure & Allowlists | scripts/check_readme_examples.py, scripts/check_skill_examples.py, scripts/check_spec_examples.py, tests/conformance/manifest.json, tests/test_html.py | Removed allowlist entry for README line 736 (Inference future block now valid). Shifted line numbers in SKILL.md and spec/09 allowlists due to added documentation content; added new SKILL.md allowlist entry for Inference effect declarations fragment. Added conformance test manifest entry ch09_inference (level: check, spec_ref: 9.5.5, features: inference_complete/effect_declaration/result). Updated HTML block count assertion 2→4. |
| Type Checking & Effect Registration | vera/environment.py, tests/test_checker.py | Registered built-in Inference effect with complete(STRING) → Result<STRING, STRING> operation in type environment. Added TestInferenceChecker test class: validates correct type/arity/effect annotation for Inference.complete, tests composition with IO and Http effects, asserts errors on wrong arity/type/missing effect annotation. |
| WASM Codegen & Host Imports | vera/codegen/api.py, vera/codegen/assembly.py, vera/codegen/compilability.py, vera/codegen/core.py, vera/codegen/functions.py | Added CompileResult.inference_ops_used: set[str] tracking field propagated through compiler pipeline. Implemented _call_inference_provider(provider, model, prompt) → str helper dispatching to Anthropic/OpenAI/Moonshot via HTTP POST. In execute(), conditionally register host function vera.inference_complete(ptr, len) → i32 that: selects provider from VERA_INFERENCE_PROVIDER env var or auto-detects from set API key (VERA_ANTHROPIC_API_KEY/VERA_OPENAI_API_KEY/VERA_MOONSHOT_API_KEY), reads VERA_INFERENCE_MODEL for optional model override, calls provider, wraps Ok/Err via _alloc_result_ok_string/_alloc_result_err_string. Updated compilability to flag _needs_memory = True for Inference effect and track inference_{op_name} in _inference_ops_used. Modified assembly to conditionally import inference_complete and export $alloc when inference ops present. |
| WASM Instruction Translation | vera/wasm/calls.py, vera/wasm/context.py, vera/wasm/inference.py | Added tracking set _inference_ops_used: set[str] to WasmContext. Updated _translate_qualified_call to handle Inference qualifier, emitting WASM calls $vera.inference_{op_name} and setting _needs_alloc = True. Extended _infer_qualified_call_wasm_type to recognise Inference qualifier, returning "i32" (heap pointer to Result<String, String>). |
| Browser Runtime | vera/browser/runtime.mjs | Added conditional WASM import binding for inference_complete that returns heap-allocated Result.Err(...) with message explaining client-side execution is prohibited; recommends server-side proxy via Http effect instead. |
| Code Generation Tests | tests/test_codegen.py, tests/test_verifier.py | Added TestInferenceCollection class: verifies inference_complete host import emission, validates success/failure branch execution via mocked provider, tests error handling when no API key configured. Updated test_overall_tier_counts: tier1 verified 167→170, tier3 runtime 20→21, total 187→191. |
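
The op-tracking and conditional import emission summarised in the table can be illustrated with a small sketch. It is not the project's codegen: `collect_inference_ops` and `emit_imports` are hypothetical names, the `"vera"` import module name is an assumption, and the `(param i32 i32) (result i32)` signature is inferred from the `vera.inference_complete(ptr, len) → i32` description above.

```python
from typing import Iterable, List, Set, Tuple


def collect_inference_ops(calls: Iterable[Tuple[str, str]]) -> Set[str]:
    """Collect host-import names for qualified calls of the form Inference.<op>."""
    return {
        f"inference_{name}"
        for qualifier, name in calls
        if qualifier == "Inference"
    }


def emit_imports(ops: Set[str]) -> List[str]:
    """Emit one WAT import per tracked op; nothing is emitted when the set is empty."""
    return [
        # Assumed shape: (ptr, len) in, i32 pointer to a heap-allocated Result out.
        f'(import "vera" "{op}" (func $vera.{op} (param i32 i32) (result i32)))'
        for op in sorted(ops)
    ]


# Qualified calls seen while translating a program, as (qualifier, op) pairs.
seen = [("IO", "print"), ("Inference", "complete"), ("Inference", "complete")]
ops = collect_inference_ops(seen)
imports = emit_imports(ops)
```

Tracking ops in a set means a program that never calls Inference.complete produces a module with no inference import at all, so the host has nothing extra to link.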

Sequence Diagram(s)

sequenceDiagram
    participant Compiler as Compiler<br/>(vera.codegen)
    participant WasmModule as WASM Module<br/>(compiled)
    participant HostRuntime as Host Runtime<br/>(vera.codegen.api)
    participant Provider as API Provider<br/>(Anthropic/OpenAI/<br/>Moonshot)
    participant Heap as WASM Heap

    Compiler->>Compiler: Register Inference effect<br/>in environment
    Compiler->>WasmModule: Emit import<br/>inference_complete(ptr, len)->i32
    
    Note over WasmModule: Runtime execution
    
    WasmModule->>Heap: Allocate prompt string
    WasmModule->>HostRuntime: call $vera.inference_complete<br/>(prompt_ptr, prompt_len)
    
    HostRuntime->>HostRuntime: Read prompt from<br/>WASM memory
    HostRuntime->>HostRuntime: Detect provider from<br/>env vars (API_KEY)
    HostRuntime->>Provider: POST /complete<br/>with prompt
    
    alt Provider Success
        Provider-->>HostRuntime: completion text
        HostRuntime->>Heap: Allocate Ok(completion)
        HostRuntime-->>WasmModule: i32 Result pointer
    else Provider Error
        Provider-->>HostRuntime: exception
        HostRuntime->>Heap: Allocate Err(message)
        HostRuntime-->>WasmModule: i32 Result pointer
    else No API Key
        HostRuntime->>Heap: Allocate Err(no provider)
        HostRuntime-->>WasmModule: i32 Result pointer
    end
    
    WasmModule->>WasmModule: match Result<String, String><br/>Ok/Err branches
Loading
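
The three outcomes in the diagram (provider success, provider error, no API key) all end in a Result value handed back to the guest. A minimal sketch of that wrapping, with the Result modelled as a tagged tuple instead of a WASM heap pointer; `host_complete` and the error message wording are illustrative assumptions:

```python
from typing import Callable, Optional, Tuple

# Stand-in for Result<String, String>: ("Ok", text) or ("Err", message).
Result = Tuple[str, str]


def host_complete(
    prompt: str,
    provider: Optional[str],
    call_provider: Callable[[str, str], str],
) -> Result:
    """Run one completion and always return a Result, never raise to the guest."""
    if provider is None:
        # No API key configured: report it as an ordinary Err value.
        return ("Err", "no inference provider configured")
    try:
        return ("Ok", call_provider(provider, prompt))
    except Exception as exc:  # any provider failure becomes Err(message)
        return ("Err", f"{provider} request failed: {exc}")


# Fake providers standing in for real HTTP calls.
def fake_success(provider: str, prompt: str) -> str:
    return "positive"


def fake_failure(provider: str, prompt: str) -> str:
    raise RuntimeError("HTTP 500")
```

Because every path returns a value, the guest program can handle all three cases with a single match on Ok/Err.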

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

The implementation spans multiple coordinated compiler/codegen modules (environment registration, type checking, WASM code generation, import declaration, host-import plumbing). Dense logic in vera/codegen/api.py (provider dispatch, API calls, result wrapping) and careful coordination across vera/wasm/* modules requires verification of: effect tracking propagation through the pipeline, WASM import/export correctness, Result type handling, environment variable precedence, and browser runtime error semantics.

Possibly related issues

Possibly related PRs

  • Add Http effect with get and post operations (#57) #357: Added the Http algebraic effect with parallel host-import plumbing infrastructure (environment registration, codegen tracking fields, WASM import/export coordination, provider dispatch), establishing the architectural pattern this PR replicates for Inference.
  • Add Map<K, V> collection type (#62, PR 1/3) #332: Added Map as a built-in effect and made identical structural changes to vera/codegen/api.py (CompileResult tracking fields), vera/codegen/assembly.py (conditional alloc export), and vera/environment.py (builtin registration), so this PR follows the same pattern.
  • Add Http effect with get and post operations (#57) #357: Both PRs modify vera/codegen/assembly.py to gate memory/alloc exports based on effect usage flags, and coordinate the same compiler/runtime infrastructure for built-in effects.

Suggested labels

compiler, feature, spec, tests, effects

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title accurately summarises the main change: implementing Inference as a built-in algebraic effect for LLM calls, with version bump to v0.0.101. |
| Linked Issues check | ✅ Passed | PR fulfils issue #61 requirements: Inference.complete(String → Result<String, String>) implemented as built-in effect with multi-provider support, contracts preserved, testability enabled, and placed in standard library. |
| Out of Scope Changes check | ✅ Passed | All changes are tightly scoped to Inference effect implementation: effect registration, codegen tracking/emission, WASM lowering, documentation, tests, and version/test-count updates. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.24% which is sufficient. The required threshold is 80.00%. |



codecov Bot commented Mar 27, 2026


Codecov Report

❌ Patch coverage is 86.40777% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.30%. Comparing base (cfb8457) to head (5023312).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vera/browser/runtime.mjs | 33.33% | 12 Missing ⚠️ |
| vera/wasm/inference.py | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #374      +/-   ##
==========================================
- Coverage   90.32%   90.30%   -0.03%     
==========================================
  Files          49       49              
  Lines       18999    19100     +101     
  Branches      219      220       +1     
==========================================
+ Hits        17161    17248      +87     
- Misses       1834     1848      +14     
  Partials        4        4              
| Flag | Coverage Δ |
|---|---|
| javascript | 50.58% <33.33%> (-0.15%) ⬇️ |
| python | 95.31% <97.64%> (+0.01%) ⬆️ |


…age tests

- Fix check_spec_examples.py allowlist: nan (982→992) and url_decode (1388→1398)
  shifted 10 lines when the Inference §9.5.5 section was added
- Add TestInferenceProviderDispatch: unit tests for all three provider branches
  (anthropic/openai/moonshot) in _call_inference_provider, plus custom model
  passthrough and unknown provider error — covers lines 153–210 of api.py
- Add openai/moonshot auto-detection and explicit VERA_INFERENCE_PROVIDER
  override tests to TestInferenceCollection — covers lines 2025/2027 of api.py
- Update TESTING.md counts: 3087→3095 tests, test_codegen.py 830→838 tests

Co-Authored-By: Claude <noreply@anthropic.invalid>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
vera/wasm/calls.py (1)

365-374: ⚠️ Potential issue | 🟠 Major

Use self.needs_alloc, not _needs_alloc, in qualified host-call paths.

Line 373 sets self._needs_alloc = True, which is inconsistent with the canonical allocator flag and can prevent allocator wiring from being marked when Inference.* is used.

Proposed fix
         if call.qualifier == "Http":
             wasm_name = f"http_{call.name}"
             self._http_ops_used.add(wasm_name)
-            self._needs_alloc = True
+            self.needs_alloc = True
             instructions.append(f"call $vera.{wasm_name}")
         elif call.qualifier == "Inference":
             wasm_name = f"inference_{call.name}"
             self._inference_ops_used.add(wasm_name)
-            self._needs_alloc = True
+            self.needs_alloc = True
             instructions.append(f"call $vera.{wasm_name}")

Based on learnings: In the aallan/vera Python implementation, the canonical “allocator-needed” flag is self.needs_alloc (no underscore), including for host-import call sites in vera/wasm.
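
The failure mode here is easy to miss because Python accepts assignment to a misspelled attribute without complaint. A small sketch of the behaviour, using hypothetical `LooseContext` and `StrictContext` classes rather than the real WasmContext; the `__slots__` guard is shown only as one possible way to surface such typos and is not something this PR adds:

```python
class LooseContext:
    """Ordinary class: any attribute name can be assigned at runtime."""

    def __init__(self) -> None:
        self.needs_alloc = False


class StrictContext:
    """Same field, but __slots__ rejects attribute names that were not declared."""

    __slots__ = ("needs_alloc",)

    def __init__(self) -> None:
        self.needs_alloc = False


loose = LooseContext()
loose._needs_alloc = True  # silently creates a new, unrelated attribute
# The flag that downstream code reads is still False at this point.

strict = StrictContext()
try:
    strict._needs_alloc = True  # not declared in __slots__
    caught = False
except AttributeError:
    caught = True
```

With the loose class the typo produces no error at all, which is why the allocator flag could stay unset without any test failing on the assignment itself.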

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vera/wasm/calls.py` around lines 365 - 374, In the host-call handling
branches for call.qualifier == "Http" and "Inference" (the code that builds
wasm_name like f"http_{call.name}" and f"inference_{call.name}" and appends
instructions with call $vera.{wasm_name}), replace assignments to
self._needs_alloc = True with self.needs_alloc = True so the canonical allocator
flag is set; make this change in both the Http and Inference branches and keep
the rest of the logic (adding to self._http_ops_used / self._inference_ops_used
and appending the call instruction) unchanged.
vera/codegen/compilability.py (1)

43-52: ⚠️ Potential issue | 🟡 Minor

Update the unsupported-effect rationale.

Lines 50-51 still say only pure, IO, Http, State<T>, Exn<E>, and Async are compilable. After this change that E603 rationale is now stale for users; please include Inference there as well.

✏️ Proposed fix
-                            rationale="Only pure, IO, Http, State<T>, "
-                            "Exn<E>, and Async effects are compilable.",
+                            rationale="Only pure, IO, Http, Inference, "
+                            "State<T>, Exn<E>, and Async effects are compilable.",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vera/codegen/compilability.py` around lines 43 - 52, The rationale message
passed to self._warning for error_code "E603" is stale: it lists compilable
effects but omits "Inference" which is now supported; update the rationale
string in the same call (the branch handling effects where eff.name !=
"Inference" and not handled) to include "Inference" alongside "pure, IO, Http,
State<T>, Exn<E>, and Async" so the message accurately reflects current
compilable effects (refer to the self._warning call that uses decl, the
f"Function '{decl.name}' uses unsupported effect '{eff.name}' — skipped."
message, and error_code="E603").
vera/codegen/functions.py (1)

167-187: ⚠️ Potential issue | 🔴 Critical

Set self.needs_alloc (not self._needs_alloc) for Inference and Http qcalls.

Lines 368 and 373 in vera/wasm/calls.py incorrectly set self._needs_alloc = True for Http and Inference qualifiers. The canonical allocator flag, as defined at vera/wasm/context.py:129 and propagated by functions.py:168, is self.needs_alloc (no underscore). All other call paths (arrays, strings, regex, decimals, map, set, json, html) correctly use self.needs_alloc = True.

Change:

  • vera/wasm/calls.py:368: self._needs_alloc = True → self.needs_alloc = True
  • vera/wasm/calls.py:373: self._needs_alloc = True → self.needs_alloc = True

The export is currently generated as a side effect of ops_used propagation, but this violates the authoritative pattern and is fragile.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vera/codegen/functions.py` around lines 167 - 187, Two qcall handlers (the
Http and Inference qualifier cases) incorrectly set the private flag
self._needs_alloc instead of the canonical allocator flag self.needs_alloc;
update the assignments in the Http and Inference qcall handling code to set
self.needs_alloc = True (replacing self._needs_alloc = True) so they match other
paths (arrays, strings, regex, decimals, map, set, json, html) and align with
the WasmContext propagation used by functions.py and context.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 18: Update the CHANGELOG bullet that currently reads "No token limits or
temperature controls — uses provider defaults" to accurately reflect that
Anthropic's max_tokens is hardcoded in the codebase; specifically replace or
append that line to mention "Anthropic: max_tokens is hardcoded" (or similar
wording) so users are not misled that all providers use defaults, referencing
the existing bullet text and the Anthropic max_tokens configuration.

In `@README.md`:
- Line 738: Remove the roadmap-status sentence starting with "The features on
the roadmap — \<Http\> ... All four pieces are now complete." from README.md and
instead add or update that status entry in ROADMAP.md (preserving the content
and any strike-throughs/issue links); then replace the removed sentence in
README.md with a short link/reference to ROADMAP.md (e.g., "See ROADMAP.md for
current feature status") so README only links to the canonical roadmap and
ROADMAP.md holds the authoritative status text.
- Line 66: The README line claiming "mockable" for LLM inference is too broad;
update the phrase referencing Inference.complete to qualify that mocking is only
supported on the host/test side (e.g., "host-side test mocking") and explicitly
note that handle[Inference] is still unsupported until issue `#372`, or remove the
"mockable" claim entirely until `#372` lands; specifically edit the sentence
mentioning Inference.complete to either append a parenthetical like "(host-side
test mocking only; handle[Inference] unsupported until `#372`)" or replace
"mockable" with a qualified phrase as described.

In `@ROADMAP.md`:
- Line 23: Update the v0.0.100 summary to remove or correct features that
actually belong to v0.0.101: specifically, detach the "<Inference>" effect and
the "64-program conformance suite" from the v0.0.100 sentence and either move
them to a v0.0.101 entry or rephrase the v0.0.100 line to exclude those items;
edit the text referencing "v0.0.100" so it only lists features that truly land
in that release and add a separate v0.0.101 bullet containing "<Inference>" and
the 64-program conformance count if needed.

In `@SKILL.md`:
- Around line 1089-1099: safe_classify calls classify(`@String.0`) without
ensuring classify's precondition string_length(`@String.0`) > 0; fix by either
strengthening safe_classify's requires clause to require
string_length(`@String.0`) > 0 or by guarding the call (check
string_length(`@String.0`) > 0 and only call classify when true, returning
"unknown" otherwise); reference the functions safe_classify and classify and the
predicate string_length(`@String.0`) > 0 when making the change.

In `@spec/09-standard-library.md`:
- Around line 525-565: Examples in the spec still treat Inference.complete as
returning a plain String but its signature is Result<String, String>; update
every example (notably the classify snippet and any lines that use "let `@String`
= Inference.complete(...)" or similar) to handle the Result type by binding the
result (e.g., let `@Result` = Inference.complete(...)) and then pattern-matching
or unwrapping (Ok(completion) / Err(message)) before using the completion;
ensure functions like classify either propagate the Result (return
Result<String,String>) or explicitly match and return the inner String on Ok or
propagate/convert Err, and apply the same change to all downstream examples that
call Inference.complete so they are valid Vera code against the new API
contract.

In `@tests/test_codegen.py`:
- Around line 9615-9623: The test currently calls os.environ.clear() then
os.environ.update(orig) in the finally block (around the execute(result) call
and the orig snapshot), which wipes the entire process environment and can break
parallel tests; instead, restore only the keys you changed: compute the set of
keys added during the test (set(os.environ) - set(orig)) and delete those, and
for keys present in orig restore their original values via
os.environ.update(orig) or use a test helper/monkeypatch to set only
VERA_ANTHROPIC_API_KEY; replace the os.environ.clear()/update(orig) sequence with
targeted removal/restoration of modified keys (reference the orig variable, the
os.environ.clear()/update(orig) calls, and the execute(result) test block to
locate where to apply the change).
- Around line 9566-9660: Add unit tests in TestInferenceCollection to exercise
provider/model selection and overrides: create tests that (1) simulate OpenAI
and Moonshot auto-detection by patching environment and the internal
_call_inference_provider to verify the chosen provider path (e.g.,
test_inference_auto_detect_openai, test_inference_auto_detect_moonshot), (2)
assert that setting VERA_INFERENCE_PROVIDER forces the provider used, and (3)
assert that VERA_INFERENCE_MODEL overrides the model selection; use the existing
patterns in test_inference_complete_mocked_success/failure for patching
"vera.codegen.api._call_inference_provider", manipulating os.environ, compiling
_CLASSIFY_SOURCE via _compile_ok and executing via execute, and restore
os.environ in finally blocks so tests clean up after themselves.

In `@vera/codegen/api.py`:
- Around line 170-172: The HTTP calls use _urlreq.urlopen without a timeout and
should be bounded: add a finite default timeout constant (e.g.,
DEFAULT_PROVIDER_TIMEOUT) and pass it into _urlreq.urlopen(req,
timeout=DEFAULT_PROVIDER_TIMEOUT) for the POST call that builds req via
_urlreq.Request; wrap the call in the same error handling used elsewhere so
timeout/socket/URLError exceptions are caught and returned as Err with an
expiry/timeout marker (use the existing Err type/symbol), and apply the
identical change to the other two provider branches that also call
_urlreq.urlopen (the occurrences around the other Request/urlopen blocks).
- Around line 2016-2041: host_inference_complete currently reads _os.environ
directly (VERA_INFERENCE_PROVIDER, VERA_ANTHROPIC_API_KEY, VERA_OPENAI_API_KEY,
VERA_MOONSHOT_API_KEY, VERA_INFERENCE_MODEL) which ignores the
execute(env_vars=...) sandbox; change it to resolve env via the same mechanism
as host_get_env or the env dict passed through execute, e.g., replace
_os.environ.get(...) calls with the host_get_env(key, default) helper or lookup
in the execute-provided env map and then use those values when selecting
provider and calling _call_inference_provider so tests and guest IO see the same
environment.

In `@vera/environment.py`:
- Around line 418-429: The Inference effect registration only defines the
"complete" operation but is missing the "embed" operation, so update the
effects["Inference"] EffectInfo to include an OpInfo entry for "embed" alongside
"complete"; add an "embed" key to the operations mapping (similar to how
"complete" is created with OpInfo) and supply the correct parameter and return
types per Issue `#61` (match the expected type signature used by the standard
library), ensuring the operation name "embed" and the EffectInfo/OpInfo usage
remain consistent with the existing patterns in effects["Inference"],
EffectInfo, and OpInfo.
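
The timeout finding for vera/codegen/api.py can be sketched as below. This is an illustration under stated assumptions, not the project's code: `post_bounded`, the injectable `opener` parameter, the error message wording, and the example URL are all hypothetical; `urllib.request.urlopen` does accept a `timeout` argument, and the 60-second value matches the one named in the follow-up commit.

```python
import io
import socket
import urllib.error
import urllib.request

INFERENCE_TIMEOUT = 60.0  # seconds


def post_bounded(req, opener=urllib.request.urlopen, timeout=INFERENCE_TIMEOUT):
    """Send `req` and return ("Ok", body) or ("Err", message) within a bounded time."""
    try:
        with opener(req, timeout=timeout) as resp:
            return ("Ok", resp.read().decode("utf-8"))
    except socket.timeout:
        # Read timeouts surface directly as socket.timeout.
        return ("Err", f"timeout after {timeout}s")
    except urllib.error.URLError as exc:
        # Connect timeouts arrive wrapped in URLError.reason.
        if isinstance(exc.reason, socket.timeout):
            return ("Err", f"timeout after {timeout}s")
        return ("Err", f"request failed: {exc.reason}")


# Simulated openers so the sketch runs without network access.
def fake_ok(req, timeout):
    return io.BytesIO(b'{"text": "positive"}')


def fake_slow(req, timeout):
    raise socket.timeout("timed out")


def fake_refused(req, timeout):
    raise urllib.error.URLError("connection refused")


req = urllib.request.Request("https://example.invalid/complete", data=b"{}", method="POST")
ok = post_bounded(req, opener=fake_ok)
slow = post_bounded(req, opener=fake_slow, timeout=0.5)
refused = post_bounded(req, opener=fake_refused)
```

Returning an Err tuple for timeouts keeps the behaviour consistent with the other failure paths, where provider errors are reported to the guest as values.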


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 875eac77-af6c-4b77-ab32-0b85092058c8

📥 Commits

Reviewing files that changed from the base of the PR and between cfb8457 and 7803068.

⛔ Files ignored due to path filters (6)
  • docs/index.html is excluded by !docs/**
  • docs/index.md is excluded by !docs/**
  • docs/llms-full.txt is excluded by !docs/**
  • docs/llms.txt is excluded by !docs/**
  • examples/inference.vera is excluded by !**/*.vera
  • tests/conformance/ch09_inference.vera is excluded by !**/*.vera
📒 Files selected for processing (30)
  • AGENTS.md
  • CHANGELOG.md
  • CLAUDE.md
  • README.md
  • ROADMAP.md
  • SKILL.md
  • TESTING.md
  • pyproject.toml
  • scripts/check_readme_examples.py
  • scripts/check_skill_examples.py
  • scripts/check_spec_examples.py
  • spec/00-introduction.md
  • spec/09-standard-library.md
  • tests/conformance/manifest.json
  • tests/test_checker.py
  • tests/test_codegen.py
  • tests/test_html.py
  • tests/test_verifier.py
  • vera/README.md
  • vera/__init__.py
  • vera/browser/runtime.mjs
  • vera/codegen/api.py
  • vera/codegen/assembly.py
  • vera/codegen/compilability.py
  • vera/codegen/core.py
  • vera/codegen/functions.py
  • vera/environment.py
  • vera/wasm/calls.py
  • vera/wasm/context.py
  • vera/wasm/inference.py
💤 Files with no reviewable changes (1)
  • scripts/check_readme_examples.py

aallan and others added 2 commits March 27, 2026 16:37
…imeout, docs

Bugs fixed:
- vera/wasm/calls.py: self._needs_alloc -> self.needs_alloc for Http and
  Inference qualified calls (was writing to an orphaned attribute, not the
  canonical WasmContext field read by codegen/functions.py propagation)
- vera/codegen/api.py host_inference_complete: read provider/API keys from
  the execute() env_vars dict when provided, falling back to os.environ

Improvements:
- vera/codegen/api.py: add _INFERENCE_TIMEOUT = 60s; pass to all urlopen calls
- tests/test_codegen.py: pass env_vars= directly to execute(), no os.environ mutation
- vera/codegen/compilability.py: add Inference to E603 rationale string
- CHANGELOG.md: clarify max_tokens hardcoded for Anthropic specifically
- README.md: qualify mockable as host-side mockable; rewrite roadmap paragraph
- ROADMAP.md: correct v0.0.100 blurb (removed v0.0.101 features that belong there)
- spec/09-standard-library.md: fix #61 -> #371 in limitations list
- SKILL.md safe_classify: add requires(string_length(@String.0) > 0)
- scripts/check_spec_examples.py: remove stale FUTURE allowlist entries for Inference

Co-Authored-By: Claude <noreply@anthropic.invalid>
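
The two environment-related changes in this commit (reading from the execute() env_vars dict, and no longer mutating os.environ in tests) can be sketched together. `lookup_env` is a hypothetical helper, and treating a provided dict as the sole source is an assumed reading of "falling back to os.environ"; `unittest.mock.patch.dict` is the standard-library way to make a scoped, self-restoring change to os.environ when one is unavoidable.

```python
import os
from typing import Mapping, Optional
from unittest.mock import patch


def lookup_env(key: str, env_vars: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Read `key` from the sandbox dict when one is given, else from os.environ."""
    # Assumption: a provided dict replaces the process environment entirely.
    source = env_vars if env_vars is not None else os.environ
    return source.get(key)


# Preferred in tests: pass the sandbox dict, leaving the process environment alone.
from_sandbox = lookup_env("VERA_ANTHROPIC_API_KEY", {"VERA_ANTHROPIC_API_KEY": "test-key"})

# When os.environ must change, patch.dict restores the original state on exit.
snapshot = dict(os.environ)
with patch.dict(os.environ, {"VERA_SKETCH_ONLY_KEY": "k"}):
    inside = lookup_env("VERA_SKETCH_ONLY_KEY")
    hidden = lookup_env("VERA_SKETCH_ONLY_KEY", {})
restored = dict(os.environ) == snapshot
```

Passing the dict explicitly avoids the clear-and-restore pattern flagged in the review, which can break other tests that share the same process.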
Section 9.7.3 still showed `let @string = Inference.complete(...)` and
said it "returns String" — stale from before the Result<String, String>
API was finalised. Updated to bind @Result<String, String> and match on
Ok/Err before calling md_parse.

Co-Authored-By: Claude <noreply@anthropic.invalid>


Development

Successfully merging this pull request may close these issues.

LLM inference effect (<Inference>)

1 participant