During the release 18 install on rack 3 we found that index creation for the re-ordered console session table failed (https://github.com/oxidecomputer/colo/issues/147). This left the system in a state where console_session rows could not be read without a full table scan.
Ultimately this was a result of CRDB exhausting its memory limits while attempting to create the indexes. These indexes were being created over a table with about 3.3 million rows. The way indexes are created in CRDB makes this more of an issue. We are using CREATE INDEX IF NOT EXISTS, and upon issuing that statement the index is created immediately, but without any data in it. An asynchronous job will then fill the index. Problematically, anyone else who issues that statement will observe that the index has been created, but will have no idea whether the data for it has been backfilled. In this case we ended up with three indexes over this table being created, which led to memory exhaustion. There is plenty more depth to document here in this issue.
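To make the visibility problem concrete, here is a toy in-memory model of the behavior described above. It is only a sketch of that description, not CRDB's implementation: ToyCatalog, ToyIndex, and the index name are hypothetical, and no database is involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Hypothetical stand-in for an index descriptor."""
    name: str
    backfilled: bool = False  # flips to True once the async job has filled it


@dataclass
class ToyCatalog:
    """Toy model: index creation is visible before its data is backfilled."""
    indexes: dict = field(default_factory=dict)
    pending_backfills: list = field(default_factory=list)

    def create_index_if_not_exists(self, name: str) -> bool:
        """Return True if this call created the index, False if it already existed.

        Either way the statement "succeeds", which is why a second caller
        cannot tell an empty index from a fully backfilled one.
        """
        if name in self.indexes:
            return False
        self.indexes[name] = ToyIndex(name)
        self.pending_backfills.append(name)  # the backfill happens later
        return True

    def run_backfill_jobs(self) -> None:
        """Simulate the asynchronous job that fills each pending index."""
        for name in self.pending_backfills:
            self.indexes[name].backfilled = True
        self.pending_backfills.clear()

    def is_usable(self, name: str) -> bool:
        """An index only helps reads once it exists and has been backfilled."""
        index = self.indexes.get(name)
        return index is not None and index.backfilled


catalog = ToyCatalog()
created_by_first = catalog.create_index_if_not_exists("hypothetical_session_idx")
created_by_second = catalog.create_index_if_not_exists("hypothetical_session_idx")
usable_before_backfill = catalog.is_usable("hypothetical_session_idx")
catalog.run_backfill_jobs()
usable_after_backfill = catalog.is_usable("hypothetical_session_idx")
```

In this model the second caller sees the statement succeed as a no-op while the index is still empty, so any check it makes has to look at backfill state and not at whether the index exists.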
More work and discussion is needed to figure out how this can be resolved holistically.
Thank you to @david-crespo, @sunshowers, @smklein, and @sudomateo for debugging this and tracking down a root cause.
Recording: ...
Nexus logs: catacomb:/staff/rack3/2026-02-16