The NextExternalIp query should be idempotent to be usable in the context of sagas - if an IP address is allocated with the same UUID, the same underlying object should be returned.
This mostly works already - we even have a test to check for idempotency, by re-invoking an IP provision with the same UUID:
|
async fn test_insert_external_ip_is_idempotent() { |
However, there's an edge case I noticed where idempotency breaks down: If the IP pool has no free addresses, idempotent IP provisioning fails.
The gist of the CTE used for insertion is:
WITH
next_external_ip as (...) -- Create a suitable record to be inserted, by finding an IP address from a free pool
external_ip as (...) -- Perform the INSERTION, with an ON CONFLICT clause that always returns something
updated_pool as (...) -- Update the parent object
SELECT * FROM external_ip
I believe the underlying cause is this part of the "next IP address provision" CTE; in particular, part of calculating the next_external_ip:
The only IPs we would consider allocating are from cases where:
- We consider an IP from the pool
- For that candidate IP, no matching row exists in instance_external_ip.
This is our way of "finding a free address within a pool" - look for "all possible values", LEFT OUTER JOIN that with "everything that's currently allocated", and find the choices where the ip IS NULL -- in other words, where no row from the instance_external_ip table matches.
This means that, in the past, the ON CONFLICT clause in the INSERT clause actually avoided inserting a row with the same UUID, but instead found a different (free) IP address.
If we instead update the WHERE clause on line 124-125 to:
WHERE
(ip IS NULL) OR (id = <ip_id>)
Then we would look for addresses that are either free or have previously been allocated. This would correctly implement idempotency, even for this edge case.
The NextExternalIp query should be idempotent to be usable in the context of sagas - if an IP address is allocated with the same UUID, the same underlying object should be returned.
This mostly works already - we even have a test to check for idempotency, by re-invoking an IP provision with the same UUID:
omicron/nexus/src/db/queries/external_ip.rs
Line 975 in 1c55cc9
However, there's an edge case I noticed where idempotency breaks down: If the IP pool has no free addresses, idempotent IP provisioning fails.
The gist of the CTE used for insertion is:
I believe the underlying cause is this part of the "next IP address provision" CTE; in particular, part of calculating the
next_external_ip:omicron/nexus/src/db/queries/external_ip.rs
Lines 124 to 125 in 1c55cc9
The only IPs we would consider allocating are from cases where:
This is our way of "finding a free address within a pool" - look for "all possible values",
LEFT OUTER JOINthat with "everything that's currently allocated", and find the choices where theip IS NULL-- in other words, where no row from theinstance_external_iptable matches.This means that, in the past, the
ON CONFLICTclause in theINSERTclause actually avoided inserting a row with the same UUID, but instead found a different (free) IP address.If we instead update the
WHEREclause on line 124-125 to:Then we would look for addresses that are either free or have previously been allocated. This would correctly implement idempotency, even for this edge case.