Summary
Brokered Azure Linux lease allocation can time out after the CLI/coordinator wait window with the message "no lease was returned", even though a lease from the same Azure path can later appear as active/ready and be usable via crabbox run --id.
This makes OpenClaw's Azure-backed Crabbox default unreliable for proof runs: the user-facing command fails as unavailable, while the underlying Azure VM may still be provisioning or may become usable too late for the caller.
Environment
- Date observed: 2026-06-05
- Client OS/shell: Windows, PowerShell
- Crabbox CLI: 0.26.0
- Broker: https://crabbox.openclaw.ai
- Auth: GitHub broker auth for org openclaw
- Repository/worktree: C:\oc-work\oc-87735
- Repo config: OpenClaw .crabbox.yaml
- Default OpenClaw provider in that repo: azure
- Azure config in repo: location: eastus2
- Tested provider/type: azure, Standard_D4ads_v6, market=on-demand, target=linux
- Local rsync: installed and runnable (C:\Users\marti\.local\bin\rsync.cmd, rsync 3.4.2)
crabbox doctor --provider azure reached the broker/provider and reported provider=azure coordinator_secrets=ready. The only local doctor failure was the existing Windows config permission warning:
failed config C:\Users\marti\AppData\Roaming\crabbox\config.yaml: permissions 0666 want 0600
ok broker auth=github owner=martin_cleary@yahoo.co.uk org=openclaw default_type=Standard_D32ads_v6
ok provider provider=azure coordinator_secrets=ready
Reproduction
From C:\oc-work\oc-87735:
pnpm crabbox:run -- --type Standard_D4ads_v6 --market on-demand --idle-timeout 10m --ttl 20m --timing-json --no-sync --no-hydrate --stop-after always --shell -- "echo CRABBOX_AZURE_SMOKE_OK; uname -srm; whoami; pwd"
This is intentionally a tiny no-sync/no-hydrate command so the result isolates lease allocation/SSH readiness rather than repo sync or test setup.
Observed Behavior
The command waited for a coordinator lease for the full 10-minute acquire window, then failed:
[crabbox] bin=..\..\Users\marti\.local\bin\crabbox.exe version=0.26.0 provider=azure providers=...
recording run run_87a63bdd35c2
coordinator lease class=standard preferred_type=Standard_D4ads_v6 keep=false slug=amber-barnacle idle_timeout=10m0s ttl=20m0s
waiting for coordinator lease provider=azure slug=amber-barnacle elapsed=30s timeout=10m0s
...
waiting for coordinator lease provider=azure slug=amber-barnacle elapsed=9m30s timeout=10m0s
timed out waiting for coordinator lease after 10m0s provider=azure target=linux type=Standard_D4ads_v6 slug=amber-barnacle lease=cbx_ddea6cab6b52; no lease was returned; next_action=check coordinator/cloud logs and retry, then run `crabbox stop --provider azure --target linux --id cbx_ddea6cab6b52` if a late lease appears
Immediately after the timeout, the hinted late lease id was not visible to the user:
crabbox status --provider azure --id cbx_ddea6cab6b52
coordinator GET /v1/leases/cbx_ddea6cab6b52: http 404: {"error":"not_found"}
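Because the hinted id returned 404 right after the timeout but late leases did surface later, a single status check is not conclusive. The following is a minimal sketch of polling for the hinted id instead. It assumes the lease=cbx_... token format seen in the timeout line, and it assumes that a not-yet-visible lease shows up as not_found in the crabbox status output. The helper names extract_lease_id, status_visible, and wait_for_late_lease are hypothetical and are not part of Crabbox.

```python
import re
import subprocess
import time
from typing import Callable, Optional

# Assumption: lease ids look like the ones in the logs above (cbx_ + hex digits).
LEASE_RE = re.compile(r"\blease=(cbx_[0-9a-f]+)")


def extract_lease_id(log_text: str) -> Optional[str]:
    """Return the first lease id mentioned in CLI output, or None."""
    match = LEASE_RE.search(log_text)
    return match.group(1) if match else None


def status_visible(lease_id: str) -> bool:
    """Run the status command from this report once."""
    result = subprocess.run(
        ["crabbox", "status", "--provider", "azure", "--id", lease_id],
        capture_output=True,
        text=True,
    )
    # Assumption: the CLI's exit code on 404 is not confirmed here, so the
    # output is also checked for the not_found error body.
    return result.returncode == 0 and "not_found" not in result.stdout + result.stderr


def wait_for_late_lease(
    lease_id: str,
    check: Callable[[str], bool] = status_visible,
    attempts: int = 20,
    delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the lease becomes visible or the attempts run out."""
    for attempt in range(attempts):
        if check(lease_id):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Feeding the timeout line from the failed run to extract_lease_id yields cbx_ddea6cab6b52, which can then be passed to wait_for_late_lease. The check and sleep parameters are injectable so the loop can be exercised without a broker.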
In the same troubleshooting session, a separate Azure attempt from another chat showed the more worrying late-lease behavior directly: the command timed out after 10 minutes, but the reported late lease later appeared in the user-visible lease list as active/ready:
crabbox-harbor-crab-78988ccd active Standard_D4ads_v6 20.101.44.161 lease=cbx_2513f241d618 slug=harbor-crab keep=false target=linux
Status/inspect showed it was ready:
cbx_2513f241d618 slug=harbor-crab provider=azure target=linux state=active type=Standard_D4ads_v6 host=20.101.44.161 ready=true has_host=true idle_timeout=1h30m0s
A no-sync attach command against that late/ready Azure lease succeeded:
pnpm crabbox:run -- --provider azure --id cbx_2513f241d618 --no-sync --no-hydrate --timing-json --stop-after never --shell -- "echo CRABBOX_AZURE_REUSE_OK; uname -srm; whoami; pwd"
Output:
CRABBOX_AZURE_REUSE_OK
Linux 7.0.0-1004-azure x86_64
crabbox
/work/crabbox/cbx_2513f241d618/oc-87735
Timing summary:
{"provider":"azure","leaseId":"cbx_2513f241d618","slug":"harbor-crab","syncMs":0,"syncSkipped":true,"commandMs":1845,"totalMs":2353,"exitCode":0,"runId":"run_c85ec4db1c50","machineType":"Standard_D4ads_v6"}
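The timing summary supports the claim that the delay sits in allocation rather than in attach or command execution. A small sketch that reads the JSON line above and computes the attach overhead; attach_overhead_ms is a hypothetical helper name, and the field names are taken only from that output:

```python
import json


def attach_overhead_ms(timing_json: str) -> int:
    """Return total wall time minus command time, in milliseconds."""
    timing = json.loads(timing_json)
    return timing["totalMs"] - timing["commandMs"]


timing_line = (
    '{"provider":"azure","leaseId":"cbx_2513f241d618","slug":"harbor-crab",'
    '"syncMs":0,"syncSkipped":true,"commandMs":1845,"totalMs":2353,'
    '"exitCode":0,"runId":"run_c85ec4db1c50","machineType":"Standard_D4ads_v6"}'
)
print(attach_overhead_ms(timing_line))  # 508
```

Roughly half a second of overhead on an already-ready lease stands against a 10-minute acquire window that expired without a lease.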
Additional Cleanup Evidence
After filing this issue, the portal showed both late Azure leases as active:
cbx_2513f241d618 / harbor-crab
cbx_ddea6cab6b52 / amber-barnacle
harbor-crab released successfully:
crabbox stop --provider azure --target linux --id cbx_2513f241d618
released lease=cbx_2513f241d618 server=crabbox-harbor-crab-78988ccd
amber-barnacle is more concerning. It is visible as active/ready even though keep=false, idle_timeout=10m0s, and expiresAt is already in the past:
cbx_ddea6cab6b52 slug=amber-barnacle provider=azure target=linux state=active type=Standard_D4ads_v6 host=52.157.75.123 ready=true has_host=true idle_for=28m2s idle_timeout=10m0s expires=2026-06-05T18:37:13.214Z
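The overrun in that status line can be checked mechanically. This is a minimal sketch, assuming the key=value layout and the h/m/s duration format shown above; parse_duration, parse_status_line, and lease_overdue are hypothetical helper names, not Crabbox APIs:

```python
import re
from datetime import datetime, timedelta, timezone

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> timedelta:
    """Parse durations such as '28m2s' or '1h30m0s' (h/m/s units only)."""
    if not re.fullmatch(r"(?:\d+(?:\.\d+)?[hms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    return timedelta(seconds=sum(float(n) * _UNIT_SECONDS[u] for n, u in parts))


def parse_status_line(line: str) -> dict:
    """Split a status line into key=value fields; the bare lease id goes under 'id'."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
        else:
            fields.setdefault("id", token)
    return fields


def lease_overdue(line: str, now: datetime) -> dict:
    """Report how far a lease is past its idle timeout and whether it has expired."""
    fields = parse_status_line(line)
    overrun = parse_duration(fields["idle_for"]) - parse_duration(fields["idle_timeout"])
    # Replace the trailing Z so fromisoformat also works on older Python versions.
    expires = datetime.fromisoformat(fields["expires"].replace("Z", "+00:00"))
    return {"id": fields["id"], "idle_overrun": overrun, "past_expiry": now > expires}
```

For the amber-barnacle line, idle_overrun comes out at 18m2s (28m2s idle against a 10m0s timeout). The now argument must be a timezone-aware datetime, for example datetime.now(timezone.utc).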
A manual release attempt with a long local timeout failed at the broker release endpoint:
crabbox stop --provider azure --target linux --id cbx_ddea6cab6b52
Post "https://crabbox.openclaw.ai/v1/leases/cbx_ddea6cab6b52/release": context deadline exceeded
A follow-up list/status/inspect still showed it active/ready. So this issue covers both late lease visibility and a cleanup/release timeout for at least one late Azure lease.
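As an operator-side stopgap while the release endpoint is unreliable, the stop command can be retried with exponential backoff. This is only a sketch: it assumes a zero exit code means the release succeeded, it does not address the broker-side timeout itself, and release_lease and release_with_backoff are hypothetical helper names.

```python
import subprocess
import time
from typing import Callable


def release_lease(lease_id: str) -> bool:
    """Run the stop command from this report once."""
    result = subprocess.run(
        ["crabbox", "stop", "--provider", "azure", "--target", "linux", "--id", lease_id],
        capture_output=True,
        text=True,
    )
    # Assumption: a zero exit code means the broker accepted the release.
    return result.returncode == 0


def release_with_backoff(
    lease_id: str,
    release: Callable[[str], bool] = release_lease,
    attempts: int = 5,
    base_delay_s: float = 5.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry the release; return the attempt number that succeeded, or 0 if none did."""
    for attempt in range(1, attempts + 1):
        if release(lease_id):
            return attempt
        if attempt < attempts:
            # Delay doubles each round: 5s, 10s, 20s, ... capped at max_delay_s.
            sleep(min(max_delay_s, base_delay_s * 2 ** (attempt - 1)))
    return 0
```

A return value of 0 means every attempt failed and the lease still needs manual follow-up.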
Expected Behavior
One of these should happen:
- Azure allocation returns the lease once the VM becomes SSH-ready, within the configured wait window for normal OpenClaw proof runs.
- If Azure provisioning is legitimately slow, the CLI/coordinator reports a precise capacity/provisioning-delay state instead of a generic acquire timeout.
- If a lease is still provisioning after the caller times out, late lease cleanup/status is reliable: the hinted lease id should be visible, inspectable, and stoppable once it exists.
- The CLI should not leave the operator in a state where the proof run fails but a paid Azure lease later becomes active outside the failed run's control path.
Why This Looks Reportable
This does not appear to be a local user-auth problem:
- Broker auth is configured and works for the openclaw org.
- crabbox doctor --provider azure reaches the broker/provider and reports Azure coordinator secrets ready.
- An already-ready Azure lease can be attached to and used successfully.
- AWS brokered leases are usable from the same machine/session.
This also does not appear to be repo sync/test setup, because the failing repro uses --no-sync --no-hydrate and only tries to run echo, uname, whoami, and pwd.
Related PR history suggests small Azure Linux brokered warmups have previously completed well inside 10 minutes:
- Standard_D2ads_v6 warmup around 2m25s.
- Standard_D2ads_v6 warmup around 1m55s.
The current symptom is therefore either a real Azure capacity/provisioning latency issue that needs better surfacing, or a coordinator/CLI late-lease lifecycle bug.
A screenshot from the UI showed that the boxes were available afterwards.
Acceptance Criteria
- A fresh brokered Azure no-sync smoke either succeeds or returns a clear, actionable capacity/provisioning status.
- Late Azure leases created by timed-out attempts are consistently visible to status/inspect/list once they exist.
- Timed-out attempts do not leave untracked active Azure leases, or the CLI provides a reliable cleanup command that works after late provisioning completes.
- If the right fix is a longer/default Azure acquire timeout for Standard_D4ads_v6/managed OS disk paths, document that expectation in the provider docs and OpenClaw .crabbox.yaml guidance.