[reconfigurator] Call clickhouse-admin API from SMF services#6533
Conversation
| File::create(self.config_dir.join("replica-server-config.xml"))?; | ||
| f.write_all(config.to_xml().as_bytes())?; | ||
| f.flush()?; | ||
| rename(f.path(), self.config_dir.join("replica-server-config.xml"))?; |
There was a problem hiding this comment.
Sadly, this rename produces an error when running via the clickhouse-admin API inside an SMF service:
{
"request_id": "a529336d-e113-429f-b90e-5bb80f7912fc",
"error_code": "Internal",
"message": "clickward XML generation failure: Cross-device link (os error 18)"
}There was a problem hiding this comment.
We still want to create a temporary file and rename, otherwise we can end up with partially written configurations that do not get corrected without support and/or a config change. There are ways to work around this, but atomic rename is typically the simplest. And it will help once we have generation numbers and need to write those to disk as well.
My guess is that this is failing because NamedUtf8TempFile is putting the temp file on a /tmpfs device and you can't rename it to the zfs filesystem on an actual U.2 device. You should be able to fix this by not using NamedUtf8TempFile and instead just creating a file named replica-server-config.xml.tmp in self.config_dir() and then doing the atomic rename.
There was a problem hiding this comment.
To emphasize what can go wrong. If we partially write the file, yes we will get an error, and yes we can retry. But the config will automatically get loaded by clickhouse and may crash it or cause other issues. It's better to have a stable old config and only load valid configs.
There was a problem hiding this comment.
There's also the atomicwrites crate which is pretty good; e.g.,
omicron/dev-tools/openapi-manager/src/spec.rs
Line 482 in 3d3f6d7
There was a problem hiding this comment.
There's also the atomicwrites crate which is pretty good; e.g.,
omicron/dev-tools/openapi-manager/src/spec.rs
Line 482 in 3d3f6d7
Oh nice. TIL
There was a problem hiding this comment.
Ohhh that's a very useful crate! I've updated the code
clickhouse-admin API from SMF servicesclickhouse-admin API from SMF services
Overview
This commit replaces the old replicated ClickHouse server and keeper configuration templates with calls to the
clickhouse-adminAPI that generate said configuration files.Purpose
While the end goal is to have Nexus make the API calls to generate the configuration files, we'd like to have a working implementation of the
clickhouse-adminAPI via the SMF services. Usingcurlis not what the finished work will look like, but rather it is the simplest way to have a working implementation in the mean time.Testing
Deployed this branch on a Helios machine with the following results
Replica 1
Replica 2
Keeper 1
Keeper 2
Keeper 3
Related: #5999
Closes: #3824