Skip to content

[oximeter] Write to both single node and cluster #7565

Merged
karencfv merged 21 commits into
oxidecomputer:mainfrom
karencfv:oximeter-writes
Mar 6, 2025
Merged

[oximeter] Write to both single node and cluster #7565
karencfv merged 21 commits into
oxidecomputer:mainfrom
karencfv:oximeter-writes

Conversation

@karencfv

@karencfv karencfv commented Feb 20, 2025

Copy link
Copy Markdown
Contributor

As part of phase one of rolling out the replicated ClickHouse cluster, we'll be writing to both the single node and replicated cluster when it exists. In our dogfood rack we'll enable the replicated cluster to perform long running tests.

Manual tests on a local machine

With a fresh omicron deployment that only contains a single node clickhouse server we have a happy oximeter that has no trouble executing without a replicated cluster:

root@oxz_oximeter_e64069da:~# tail -F $(svcs -L oximeter) | looker
08:16:28.794Z DEBG oximeter (oximeter-agent): inserting 1001 samples into database
    collector_id = adf7542a-c3fb-46cf-82d5-894e94774275
    collector_ip = fd00:1122:3344:101::d
08:16:28.794Z DEBG oximeter (oximeter-agent): unrolling 1001 total samples
    collector_id = adf7542a-c3fb-46cf-82d5-894e94774275
    collector_ip = fd00:1122:3344:101::d
    id = 74af0133-57b1-444f-a3a0-0a73dea6b162
08:16:28.829Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = adf7542a-c3fb-46cf-82d5-894e94774275
    collector_ip = fd00:1122:3344:101::d
    id = 74af0133-57b1-444f-a3a0-0a73dea6b162
    n_rows = 12
    table_name = fields_ipaddr
08:16:28.835Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = adf7542a-c3fb-46cf-82d5-894e94774275
    collector_ip = fd00:1122:3344:101::d
    id = 74af0133-57b1-444f-a3a0-0a73dea6b162
    n_rows = 1236
    table_name = fields_string
08:16:28.837Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = adf7542a-c3fb-46cf-82d5-894e94774275
    collector_ip = fd00:1122:3344:101::d
    id = 74af0133-57b1-444f-a3a0-0a73dea6b162
    n_rows = 29
    table_name = fields_u16
^C
root@oxz_oximeter_e64069da:~# cat $(svcs -L oximeter) | grep error | looker
root@oxz_oximeter_e64069da:~# cat $(svcs -L oximeter) | grep failed | looker
root@oxz_oximeter_e64069da:~# 

And on the ClickHouse server itself:

oxz_clickhouse_661b3b06-8978-4f69-8a65-a06c8473eeb8.local :) select * from oximeter.fields_string limit 5

SELECT *
FROM oximeter.fields_string
LIMIT 5

Query id: f35ad598-9352-4952-85be-e951b4b95959

┌─timeseries_name─────────────────────────┬───────timeseries_key─┬─field_name─┬─field_value─┐
│ ddm_router:originated_tunnel_endpoints  │ 10941694599792570795 │ hostname   │ centzon     │
│ ddm_router:originated_tunnel_endpoints  │  3950329218159376760 │ hostname   │ oxz_switch  │
│ ddm_router:originated_underlay_prefixes │  6849359716726285930 │ hostname   │ centzon     │
│ ddm_router:originated_underlay_prefixes │   860939520243044258 │ hostname   │ oxz_switch  │
│ ddm_session:advertisements_received     │ 12326296782677083959 │ hostname   │ centzon     │
└─────────────────────────────────────────┴──────────────────────┴────────────┴─────────────┘

5 rows in set. Elapsed: 0.005 sec. Processed 1.92 thousand rows, 165.91 KB (414.22 thousand rows/s., 35.72 MB/s.)
Peak memory usage: 82.35 KiB.

We now set and enable the ClickHouse policy to have both a single node and cluster running side by side.

Select the last five rows of the same table on each ClickHouse installation to verify they are the same:

On a single node

oxz_clickhouse_e3f5c588-1dd1-4591-96e8-65b49f955c46.local :) select * from oximeter.measurements_u64 order by timestamp desc limit 5

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 5

Query id: 6cc27e41-b0ae-4fdc-ac36-cf985b9525a6

┌─timeseries_name────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ mg_lower:routes_blocked_by_link_state  │ 17155820722314279312 │ 2025-02-20 08:32:40.742970761 │     0 │
│ switch_rib:active_routes               │  1431861529849152348 │ 2025-02-20 08:32:40.742929888 │     1 │
│ ddm_session:imported_tunnel_endpoints  │ 15253212790285366369 │ 2025-02-20 08:32:40.741352140 │     0 │
│ ddm_session:imported_underlay_prefixes │  4313083866723829930 │ 2025-02-20 08:32:40.741321058 │     0 │
│ ddm_session:imported_tunnel_endpoints  │ 15253212790285366369 │ 2025-02-20 08:32:40.740982224 │     0 │
└────────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────┘

5 rows in set. Elapsed: 0.033 sec. Processed 7.08 thousand rows, 499.52 KB (216.73 thousand rows/s., 15.28 MB/s.)
Peak memory usage: 416.54 KiB.

On the replicated cluster

oximeter_cluster_1 :) select * from oximeter.measurements_u64 order by timestamp desc limit 5

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 5

Query id: 52593d3f-fe25-4064-98b4-3fe1c5f9425f

┌─timeseries_name────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ mg_lower:routes_blocked_by_link_state  │ 17155820722314279312 │ 2025-02-20 08:32:05.745118967 │     0 │
│ switch_rib:active_routes               │  1431861529849152348 │ 2025-02-20 08:32:05.745073424 │     1 │
│ ddm_session:imported_tunnel_endpoints  │ 15253212790285366369 │ 2025-02-20 08:32:05.743545099 │     0 │
│ ddm_session:imported_underlay_prefixes │  4313083866723829930 │ 2025-02-20 08:32:05.743506216 │     0 │
│ ddm_session:imported_tunnel_endpoints  │ 15253212790285366369 │ 2025-02-20 08:32:05.743068996 │     0 │
└────────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────┘

5 rows in set. Elapsed: 0.028 sec. Processed 6.70 thousand rows, 471.80 KB (243.17 thousand rows/s., 17.12 MB/s.)
Peak memory usage: 375.72 KiB.

Closes: #7419

@karencfv karencfv left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnaecker @andrewjstone I'd really appreciate a thorough review as I'm not that familiar with this part of the codebase. Want to make sure I'm not missing anything!

@karencfv

karencfv commented Feb 21, 2025

Copy link
Copy Markdown
Contributor Author

Ha! Looks like this PR is completely unnecessary and oximeter is already writing to both single node and cluster.

Here is a timeline of my discovery:

  • I realised there was an error in my code. For ClickhouseNative to be only available in the single node implemetation then it should not exist for a cluster:
    self.service_backend_zone(
    ServiceName::ClickhouseNative,
    &zone,
    CLICKHOUSE_TCP_PORT,
    )?;
    if http_service == ServiceName::ClickhouseServer {
    self.service_backend_zone(
    ServiceName::ClickhouseClusterNative,
    &zone,
    CLICKHOUSE_TCP_PORT,
    )?;
    self.service_backend_zone(
    ServiceName::ClickhouseAdminServer,
    &zone,
    CLICKHOUSE_ADMIN_PORT,
    )?;
    }
        if http_service == ServiceName::Clickhouse {
            self.service_backend_zone(
                 ServiceName::ClickhouseNative,
                 &zone,
                 CLICKHOUSE_TCP_PORT,
             )?;
        }
    
         if http_service == ServiceName::ClickhouseServer {
             self.service_backend_zone(
                 ServiceName::ClickhouseClusterNative,
                 &zone,
                 CLICKHOUSE_TCP_PORT,
             )?;
             self.service_backend_zone(
                 ServiceName::ClickhouseAdminServer,
                 &zone,
                 CLICKHOUSE_ADMIN_PORT,
             )?;
         }
  • I then proceeded to "fix" my code and try it out on my local machine. And surprise! Oximeter was no longer writing to the ClickHouse cluster at all. This made no sense at all. Unless...
  • Was oximeter already writing to both????
  • I deployed omicron from the main branch, and yep! oximeter was happily already writing to both databases 🙃
oxz_clickhouse_68ab5fed-c9a5-423c-9a92-51aca53fcafc.local :) select * from oximeter.measurements_u64 order by timestamp desc limit 5

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 5

Query id: 990e700c-3653-4734-8c29-f1e923077ac0

┌─timeseries_name─────────────────────────┬──────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:imported_tunnel_endpoints   │ 1773657864559831261 │ 2025-02-21 00:23:17.145342035 │     0 │
│ ddm_session:imported_underlay_prefixes  │ 2630938741805735703 │ 2025-02-21 00:23:17.145313073 │     5 │
│ ddm_router:originated_tunnel_endpoints  │ 7317262939213848358 │ 2025-02-21 00:23:17.144859612 │     1 │
│ ddm_router:originated_underlay_prefixes │ 4365815392617391907 │ 2025-02-21 00:23:17.144730893 │     1 │
│ ddm_session:imported_tunnel_endpoints   │ 9708227563491034517 │ 2025-02-21 00:23:17.144437503 │     0 │
└─────────────────────────────────────────┴─────────────────────┴───────────────────────────────┴───────┘

5 rows in set. Elapsed: 0.003 sec. Processed 5.08 thousand rows, 348.96 KB (1.47 million rows/s., 100.61 MB/s.)
Peak memory usage: 300.64 KiB.
oximeter_cluster_1 :) select * from oximeter.measurements_u64 order by timestamp desc limit 5

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 5

Query id: d0ec4b2b-2703-41b2-b84c-04291bf1eb4f

┌─timeseries_name─────────────────────────┬──────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:imported_tunnel_endpoints   │ 1773657864559831261 │ 2025-02-21 00:23:07.146672589 │     0 │
│ ddm_session:imported_underlay_prefixes  │ 2630938741805735703 │ 2025-02-21 00:23:07.146642557 │     5 │
│ ddm_router:originated_tunnel_endpoints  │ 7317262939213848358 │ 2025-02-21 00:23:07.146274031 │     1 │
│ ddm_router:originated_underlay_prefixes │ 4365815392617391907 │ 2025-02-21 00:23:07.146162243 │     1 │
│ ddm_session:imported_tunnel_endpoints   │ 9708227563491034517 │ 2025-02-21 00:23:07.144880295 │     0 │
└─────────────────────────────────────────┴─────────────────────┴───────────────────────────────┴───────┘

5 rows in set. Elapsed: 0.003 sec. Processed 1.88 thousand rows, 129.71 KB (543.99 thousand rows/s., 37.57 MB/s.)
Peak memory usage: 117.94 KiB.

So yeah, I'll be closing this PR. Before I do so, I have a few questions:

  • Does this mean oximeter is alternating between each ClickHouse installation and sometimes writing to the single-node and sometimes to the cluster? It appears not, since the last 5 items of the table are exactly the same. (I also tried the same query with larger limits and the results remained consistent)
  • Is Oximeter sometimes reading from the cluster as well already?

@bnaecker WDYT about this behaviour?

@bnaecker

Copy link
Copy Markdown
Collaborator

I'm pretty sure something is off and that this deserves more investigation.

oximeter right now doesn't really know anything about the replicated database. It's given a client, which has a single TCP connection to...something...and it calls Client::insert_samples() to write data. If the server on the other end of that TCP connection has a replicated, Distributed table called oximeter.measurements_u64, then that's where it writes. If it's a single-node table, then that's where it writes too. We intentionally designed the names of the tables so that oximeter would not need to change the SQL it uses to insert data into either kind of table. Unless I'd wildly misremembering the code I wrote 4 years ago (which is definitely possible!) there is no way for it to write to both databases. It has one TCP connection, and always emits the same SQL for the same set of samples.

It's not really clear to me from your SQL shell what databases you're selecting from. How do you know those tables are really different? Could you DESCRIBE each of them from the same shell? Also, the records you've shown are not exactly the same, they have different timestamps. Is that important?

@bnaecker

Copy link
Copy Markdown
Collaborator

One other thing to look at is DNS. I said that oximeter has one TCP connection which is only sort of true. We also use qorb to do connection pooling, so we actually have N connections, one to each backend. These are considered equivalent, so we could use multiple actual TCP connections to write different pieces of data.

However, if we somehow have AAAA records in DNS for the single-node and replicated servers, both returned when we lookup the same SRV name, then that could conceivably cause those different TCP connections to actually be to different servers. We should definitely confirm that the DNS records here are all pointing to the same "kind" of ClickHouse server, all single-node or all replicated.

@karencfv

karencfv commented Feb 21, 2025

Copy link
Copy Markdown
Contributor Author

It's not really clear to me from your SQL shell what databases you're selecting from. How do you know those tables are really different? Could you DESCRIBE each of them from the same shell? Also, the records you've shown are not exactly the same, they have different timestamps. Is that important?

I've already destroyed that deployment, but I think I didn't explain myself very well. By "different databases" I meant different ClickHouse installation types (single-node or cluster). The prompt for the single node starts with oxz_clickhouse_68ab5fed-c9a5-423c-9a92-51aca53fcafc.local :) and the one for the cluster is oximeter_cluster_1 :).
I don't think the timestamps are that important in this case. For the case of having both installations running side by side, what we'll be testing is performance, availability etc. None of our users will have both of them running side by side, we'll only have this setup on dogfood.

However, if we somehow have AAAA records in DNS for the single-node and replicated servers, both returned when we lookup the same SRV name, then that could conceivably cause those different TCP connections to actually be to different servers.

I considered this, but wouldn't that mean all timeseries would be inserted in one of single-node or cluster, but not both? I would expect that the timeseries stored in each one would be different no?

These are the samples that I took from that deployment with larger limits:

single-node

oxz_clickhouse_68ab5fed-c9a5-423c-9a92-51aca53fcafc.local :) select * from oximeter.measurements_u64 order by timestamp desc limit 20

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 20

Query id: bbbeefc2-21ed-4f6b-b24a-ae739e9fa656

┌─timeseries_name─────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:imported_tunnel_endpoints   │  1773657864559831261 │ 2025-02-21 00:39:32.097297883 │     0 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:39:32.097231948 │     0 │
│ ddm_session:imported_underlay_prefixes  │  2630938741805735703 │ 2025-02-21 00:39:32.097219908 │     5 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:39:32.097150063 │     0 │
│ ddm_router:originated_tunnel_endpoints  │  7317262939213848358 │ 2025-02-21 00:39:32.096205998 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:39:32.096196077 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:39:32.096112151 │     1 │
│ ddm_router:originated_underlay_prefixes │  4365815392617391907 │ 2025-02-21 00:39:32.095992783 │     1 │
│ ddm_router:originated_tunnel_endpoints  │  9039498691245586535 │ 2025-02-21 00:39:32.095073710 │     0 │
│ ddm_router:originated_underlay_prefixes │ 17399952546143151644 │ 2025-02-21 00:39:32.094859585 │     5 │
│ mg_lower:routes_blocked_by_link_state   │ 12636176154868661118 │ 2025-02-21 00:39:32.092703546 │     0 │
│ switch_rib:active_routes                │  6539613388865110263 │ 2025-02-21 00:39:32.092656973 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  1773657864559831261 │ 2025-02-21 00:39:31.097915436 │     0 │
│ ddm_session:imported_underlay_prefixes  │  2630938741805735703 │ 2025-02-21 00:39:31.097838811 │     5 │
│ ddm_router:originated_tunnel_endpoints  │  7317262939213848358 │ 2025-02-21 00:39:31.096877575 │     1 │
│ ddm_router:originated_underlay_prefixes │  4365815392617391907 │ 2025-02-21 00:39:31.096613676 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:39:31.095195089 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:39:31.095170217 │     0 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:39:31.094886087 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:39:31.094859586 │     1 │
└─────────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────┘

20 rows in set. Elapsed: 0.005 sec. Processed 8.68 thousand rows, 596.88 KB (1.73 million rows/s., 118.90 MB/s.)
Peak memory usage: 445.41 KiB.

cluster

oximeter_cluster_1 :) select * from oximeter.measurements_u64 order by timestamp desc limit 20

SELECT *
FROM oximeter.measurements_u64
ORDER BY timestamp DESC
LIMIT 20

Query id: ba8370ac-d8d3-4619-8107-dfa4bdb6c8dc

┌─timeseries_name─────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:imported_tunnel_endpoints   │  1773657864559831261 │ 2025-02-21 00:40:12.095907025 │     0 │
│ ddm_session:imported_underlay_prefixes  │  2630938741805735703 │ 2025-02-21 00:40:12.095826699 │     5 │
│ ddm_router:originated_tunnel_endpoints  │  7317262939213848358 │ 2025-02-21 00:40:12.094774287 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:40:12.094598095 │     0 │
│ ddm_router:originated_underlay_prefixes │  4365815392617391907 │ 2025-02-21 00:40:12.094557112 │     1 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:40:12.094512779 │     0 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:40:12.093621308 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:40:12.093540682 │     1 │
│ ddm_router:originated_tunnel_endpoints  │  9039498691245586535 │ 2025-02-21 00:40:12.092587776 │     0 │
│ ddm_router:originated_underlay_prefixes │ 17399952546143151644 │ 2025-02-21 00:40:12.092373402 │     5 │
│ mg_lower:routes_blocked_by_link_state   │ 12636176154868661118 │ 2025-02-21 00:40:12.090290578 │     0 │
│ switch_rib:active_routes                │  6539613388865110263 │ 2025-02-21 00:40:12.090229234 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  1773657864559831261 │ 2025-02-21 00:40:11.095234329 │     0 │
│ ddm_session:imported_underlay_prefixes  │  2630938741805735703 │ 2025-02-21 00:40:11.095210238 │     5 │
│ ddm_router:originated_tunnel_endpoints  │  7317262939213848358 │ 2025-02-21 00:40:11.094714633 │     1 │
│ ddm_router:originated_underlay_prefixes │  4365815392617391907 │ 2025-02-21 00:40:11.094500629 │     1 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:40:11.094465846 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:40:11.094367639 │     0 │
│ ddm_session:imported_tunnel_endpoints   │  9708227563491034517 │ 2025-02-21 00:40:11.092786060 │     0 │
│ ddm_session:imported_underlay_prefixes  │  8797923826375925960 │ 2025-02-21 00:40:11.092730977 │     1 │
└─────────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────┘

20 rows in set. Elapsed: 0.003 sec. Processed 12.13 thousand rows, 835.31 KB (4.36 million rows/s., 300.38 MB/s.)
Peak memory usage: 457.06 KiB.

Like you mention, the timeseries have slightly different timestamps, but otherwise they are the same.

@bnaecker

Copy link
Copy Markdown
Collaborator

By "different databases" I meant different ClickHouse installation types

That's helpful thanks. Just to check, were these running at the same time? That is, were the zones for both installation type running.

I considered this, but wouldn't that mean all timeseries would be inserted in one of single-node or cluster, but not both? I would expect that the timeseries stored in each one would be different no?

I'm not sure this follows. Each time oximeter inserts samples, it will potentially check out a different connection from qorb. If our theory is that qorb has a pool with connections to both installation, then each batch of samples might be directed to either installation.

This theory should be easy to disprove. If you create the same environment again, let's see what's in DNS under the ClickhouseNative SRV record. You'll get a bunch of AAAA records, and they should all point to the same installation. If that's true, then this theory can be discarded. But if we see AAAA records from both kinds, then that's at least consistent with what we're seeing.

If that turns out to be the case, then next thing would be to look at either the logs for oximeter or the DTrace probes for the qorb pool it's using. The handle-claimed probe seems like the most relevant, since that fires with the address whenever we check out a connection from the pool.

To be clear, I don't know that this is the issue! But it seems like a theory consistent with the data we have so far. I definitely think we should get to the bottom of why its happening before moving further -- I don't think either of us intended for this to be the case!

@karencfv

karencfv commented Feb 21, 2025

Copy link
Copy Markdown
Contributor Author

I added a bit more debugging info to the logs. Before each insertion is made, the following information will be printed

debug!(
    self.log,
    "inserting data to {:#?}", handle.server_info(); // This prints information about the clickhouse server oximeter is talking to
    "task_id" => ?task_id, // This is a UUID that is created each time `insert_unrolled_samples` is called.
                           // it will help us determine whether the same insertion is being made to several
                           // clickhouse servers
    "sql" => sql,
    "n_rows" => block.n_rows(),
    "n_columns" => block.n_columns(),
);

When running a single-node and cluster side by side, we get the following results:

Oximeter is inserting data to the single node and cluster indiscriminately. It's inserting data to all of oximeter_cluster_3,
oximeter_cluster_2, oximeter_cluster_1 and oxz_clickhouse_02a72c04-8322-42d5-84e7-29be8d8a9f20.local which is the single node.

03:00:51.465Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_3",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 11436262184387220048,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 4
    n_rows = 108
    sql = INSERT INTO oximeter.fields_string FORMAT Native
    task_id = 60d268a0-08c3-4c02-918c-dce0ec33ee54
03:00:51.479Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_rows = 108
    table_name = fields_string
03:00:51.479Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oxz_clickhouse_02a72c04-8322-42d5-84e7-29be8d8a9f20.local",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 13690808761680913153,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 4
    n_rows = 26
    sql = INSERT INTO oximeter.fields_u16 FORMAT Native
    task_id = 5369599f-f60a-42b5-9377-2a093aba0917
03:00:51.481Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_rows = 26
    table_name = fields_u16
03:00:51.481Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_1",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 9218049220290969329,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 4
    n_rows = 90
    sql = INSERT INTO oximeter.fields_uuid FORMAT Native
    task_id = 50c0f687-2815-4e5d-92da-c8e2591a3a76
03:00:51.541Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_rows = 90
    table_name = fields_uuid
03:00:51.541Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_2",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 16769546681719372887,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 5
    n_rows = 160
    sql = INSERT INTO oximeter.measurements_cumulativeu64 FORMAT Native
    task_id = f37b6256-52c9-4850-af69-38246e4ea367
03:00:51.601Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_rows = 160
    table_name = measurements_cumulativeu64
03:00:51.601Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_3",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 11436262184387220048,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 19
    n_rows = 26
    sql = INSERT INTO oximeter.measurements_histogramu64 FORMAT Native
    task_id = 268466fe-101c-40d7-a2cf-4d18f6b9780b

When I filter by task_id I only retrieve a single item. That makes me think that the data between the cluster and the single-node should not be the same (but it is as far as I can tell ?!!?!?).

root@oxz_oximeter_722c26c9:~# cat $(svcs -L oximeter) | grep 268466fe-101c-40d7-a2cf-4d18f6b9780b | looker
03:00:51.601Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_3",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 11436262184387220048,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 19
    n_rows = 26
    sql = INSERT INTO oximeter.measurements_histogramu64 FORMAT Native
    task_id = 268466fe-101c-40d7-a2cf-4d18f6b9780b

When filtering by table, interesting things start to pop up. These two entries that were inserted at almost the same time have the same amount of rows, but inserting to two different places. It could be possible that the data is the same.

root@oxz_oximeter_722c26c9:~# cat $(svcs -L oximeter) | grep "INSERT INTO oximeter.measurements_cumulativeu64" | looker
03:29:21.436Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oximeter_cluster_3",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 11436262184387220048,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 5
    n_rows = 160
    sql = INSERT INTO oximeter.measurements_cumulativeu64 FORMAT Native
    task_id = 353561af-dfbe-4c26-b764-fa949309aa25
03:29:26.426Z DEBG oximeter (oximeter-agent): inserting data to Hello {
        name: "ClickHouse",
        version_major: 23,
        version_minor: 8,
        revision: 54465,
        tz: "UTC",
        display_name: "oxz_clickhouse_02a72c04-8322-42d5-84e7-29be8d8a9f20.local",
        version_patch: 7,
        password_complexity_rules: [],
        interserver_secret: 13690808761680913153,
    }
    collector_id = 722c26c9-b512-4ec3-95d7-87d08da42758
    collector_ip = fd00:1122:3344:101::d
    id = e4b76b18-9e71-4841-a410-483527d39154
    n_columns = 5
    n_rows = 160
    sql = INSERT INTO oximeter.measurements_cumulativeu64 FORMAT Native
    task_id = 5fb7ceeb-1c26-4dbd-b15c-b1acb74f0977

I went to each of the databases and filtered with a range starting with the timestamp of each log entry relative to the server it was inserting to (the entries should somewhat match right?)

single-node

oxz_clickhouse_02a72c04-8322-42d5-84e7-29be8d8a9f20.local :) select * from oximeter.measurements_cumulativeu64 where timeseries_key == '14344414001964427869' and timestamp >= '2025-02-21 03:29:26.000000000' and timestamp <= '2025-02-21 03:30:30.000000000' limit 30

SELECT *
FROM oximeter.measurements_cumulativeu64
WHERE (timeseries_key = '14344414001964427869') AND (timestamp >= '2025-02-21 03:29:26.000000000') AND (timestamp <= '2025-02-21 03:30:30.000000000')
LIMIT 30

Query id: 43c9a946-8941-48cd-97d9-e8c8cb361199

┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:46.723434017 │  1334 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:46.724113904 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:47.723863186 │  1334 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:47.724290426 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:48.723864976 │  1335 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:48.724329318 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:49.724373650 │  1335 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:49.725372949 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:50.724091220 │  1336 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:50.724608016 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:06.722571028 │  1344 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:06.723229033 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:07.723631700 │  1344 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:07.724749987 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:08.722600869 │  1345 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:08.723732457 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:09.722739598 │  1345 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:09.723136925 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:10.722354581 │  1346 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:10.722971763 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:21.722322033 │  1351 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:21.722707749 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:22.722148570 │  1352 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:22.722637934 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:23.722629173 │  1352 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:23.723252896 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:24.722455020 │  1353 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:24.722925032 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:25.721864759 │  1353 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:30:25.722423507 │     0 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘

30 rows in set. Elapsed: 0.010 sec. Processed 25.29 thousand rows, 756.61 KB (2.43 million rows/s., 72.61 MB/s.)
Peak memory usage: 588.93 KiB.

cluster

oximeter_cluster_2 :) select * from oximeter.measurements_cumulativeu64 where timeseries_key == '14344414001964427869' and timestamp >= '2025-02-21 03:29:21.000000000' and timestamp <= '2025-02-21 03:30:30.000000000' limit 30

SELECT *
FROM oximeter.measurements_cumulativeu64
WHERE (timeseries_key = '14344414001964427869') AND (timestamp >= '2025-02-21 03:29:21.000000000') AND (timestamp <= '2025-02-21 03:30:30.000000000')
LIMIT 30

Query id: 54148a70-67cb-40f6-998c-4fdfce3b4d69

┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:26.724438726 │  1324 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:26.724846904 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:27.725135393 │  1324 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:27.725510249 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:28.724899427 │  1325 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:28.725888415 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:29.724434174 │  1325 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:29.725008654 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:30.724982401 │  1326 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:30.725625556 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:31.724797288 │  1326 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:31.725180385 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:32.724719642 │  1327 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:32.725156042 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:33.724849601 │  1327 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:33.725960147 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:34.724869582 │  1328 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:34.725658126 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:35.724028603 │  1328 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:35.724694189 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:36.724673907 │  1329 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:36.725325082 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:37.724501265 │  1329 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:37.725552967 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:38.724446481 │  1330 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:38.725465541 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:39.723871550 │  1330 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:39.724367855 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:40.723720900 │  1331 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 03:29:40.724124297 │     0 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘

30 rows in set. Elapsed: 0.010 sec. Processed 9.91 thousand rows, 348.77 KB (994.43 thousand rows/s., 34.99 MB/s.)
Peak memory usage: 1.04 MiB.

The entries are not similar at all. Also worth noting that the amount of rows on the cluster is much less. I expect the number of rows in the cluster to be less since it takes a couple of minutes to get the cluster started after the initial oximeter deployment is up, but this seems like a significant difference in size.

I also ran a query that shows the last 30 rows filtered by timeseries_key and the results are not similar at all

single-node

oxz_clickhouse_02a72c04-8322-42d5-84e7-29be8d8a9f20.local :) select * from oximeter.measurements_cumulativeu64 where timeseries_key == '14344414001964427869' order by timestamp desc limit 30

SELECT *
FROM oximeter.measurements_cumulativeu64
WHERE timeseries_key = '14344414001964427869'
ORDER BY timestamp DESC
LIMIT 30

Query id: 17c887af-1140-4473-94c2-ba89d57e4835

┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:20.608040313 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:20.607426171 │  2490 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:19.607244639 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:19.606841631 │  2489 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:18.607646187 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:18.607181905 │  2489 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:17.608236549 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:17.607859753 │  2488 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:16.607804969 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:16.607257332 │  2488 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘
┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:00.608683488 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:08:00.608245168 │  2480 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:59.609416389 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:59.608798497 │  2479 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:58.610087786 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:58.609063415 │  2479 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:57.610348634 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:57.609155012 │  2478 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:56.608717953 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:56.608314415 │  2478 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘
┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:45.609763410 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:45.609357573 │  2473 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:44.609628552 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:44.608997478 │  2472 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:43.609399246 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:43.609042211 │  2472 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:42.611323769 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:42.610241404 │  2471 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘
┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:41.610068953 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:41.609622792 │  2471 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘

30 rows in set. Elapsed: 0.005 sec. Processed 93.67 thousand rows, 1.33 MB (17.21 million rows/s., 244.14 MB/s.)
Peak memory usage: 547.87 KiB.

cluster

oximeter_cluster_2 :) select * from oximeter.measurements_cumulativeu64 where timeseries_key == '14344414001964427869' order by timestamp desc limit 30

SELECT *
FROM oximeter.measurements_cumulativeu64
WHERE timeseries_key = '14344414001964427869'
ORDER BY timestamp DESC
LIMIT 30

Query id: 3df9b5c3-bf17-42c8-b998-15d44ad50219

┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:55.609476586 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:55.608998133 │  2477 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:54.609486337 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:54.608997543 │  2477 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:53.609554152 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:53.608904357 │  2476 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:52.609478037 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:52.609089860 │  2476 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:51.609192228 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:51.608631349 │  2475 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘
┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:50.609455237 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:50.608873827 │  2475 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:49.610714104 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:49.609693724 │  2474 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:48.609771559 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:48.609193950 │  2474 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:47.610702874 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:47.609693905 │  2474 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:46.609364492 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:46.608894560 │  2473 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘
┌─timeseries_name─────────────────────┬───────timeseries_key─┬────────────────────start_time─┬─────────────────────timestamp─┬─datum─┐
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:40.609394797 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:40.608966347 │  2470 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:39.610676926 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:39.609613312 │  2470 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:38.609548358 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:38.609002681 │  2469 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:37.609726381 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:37.609297941 │  2469 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:36.609868001 │     0 │
│ ddm_session:advertisements_received │ 14344414001964427869 │ 2025-02-21 02:44:58.186343264 │ 2025-02-21 04:07:36.609528897 │  2468 │
└─────────────────────────────────────┴──────────────────────┴───────────────────────────────┴───────────────────────────────┴───────┘

30 rows in set. Elapsed: 0.009 sec. Processed 182.41 thousand rows, 2.08 MB (21.05 million rows/s., 239.66 MB/s.)
Peak memory usage: 1022.27 KiB.

At this point I think it's safe to say that the data between the single node and the cluster is not the same. The samples I took earlier that seemed to suggest the same data was being saved in both installations were most likely a red herring (the values were all 0, 1 and 5 so it's easy for the data to look the same).

With all of this, I think we can conclude that the existence of ClickhouseNative in both single-node and cluster is causing oximeter to write to either one.

@karencfv

Copy link
Copy Markdown
Contributor Author

To be clear, I don't know that this is the issue! But it seems like a theory consistent with the data we have so far. I definitely think we should get to the bottom of why its happening before moving further -- I don't think either of us intended for this to be the case!

Sorry @bnaecker ! Didn't see this before. I was writing that big chunk above and missed your comment. With everything above, I believe you are right with your assessment and each batch of samples is being directed to one or the other.

@karencfv

Copy link
Copy Markdown
Contributor Author

I have a feeling this should be an easy fix by having ClickhouseNative only live in the single-node zone

something like

if http_service == ServiceName::Clickhouse {
    self.service_backend_zone(
         ServiceName::ClickhouseNative,
         &zone,
         CLICKHOUSE_TCP_PORT,
     )?;
}

in https://github.com/oxidecomputer/omicron/blob/main/internal-dns/types/src/config.rs#L436-L453

@karencfv

Copy link
Copy Markdown
Contributor Author

I'll test that fix on Monday as I'm getting ready to head out now (it's Friday in NZ!). Thanks for the help @bnaecker!

@bnaecker

Copy link
Copy Markdown
Collaborator

Thanks @karencfv that's pretty conclusive. To confirm, do we see this behavior when running on main? We don't have the cluster installation yet today on a rack like Dogfood, but we'd expect to see this issue if we were to start that up? If so, I think we need an issue to track this, rather than continuing here.

@karencfv

Copy link
Copy Markdown
Contributor Author

To confirm, do we see this behavior when running on main?

Yes, I did all of the debugging using the latest commit from main.

We don't have the cluster installation yet today on a rack like Dogfood, but we'd expect to see this issue if we were to start that up? If so, I think we need an issue to track this, rather than continuing here.

Yeah, we would definitely see the issue if we were to set the ClickHouse policy to spin up both ClickHouse installations. Sure, I'll write up an issue to track this bug.

@andrewjstone

Copy link
Copy Markdown
Contributor

Thanks @karencfv @bnaecker. Great debugging and teamwork! I can jump in and help next week as well.

@karencfv

Copy link
Copy Markdown
Contributor Author

@andrewjstone @bnaecker, after fixing the bug we found I've updated this PR. I ran all of the manual tests in the PR's description and working as expected on a local omicron deployment. Would appreciate a review!

@andrewjstone andrewjstone left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @karencfv. Thanks again for taking this on. Approval pending Ben's feedback about attempting to ping clickhouse cluster in production.

Comment thread internal-dns/types/src/names.rs Outdated
ClickhouseNative,
/// The native TCP interface to a ClickHouse server.
///
/// NOTE: This is a temporary service name for replicated cluster.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is temporary necessarily. We may end up keeping this name.

Related, I think we need to change the comment above ClickhouseNative to say that it only points to single-node servers.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with both. I think this should be the name of the DNS record for the cluster, even if we completely remove the single-node.

And yes, the comment on L36 should change too.

Comment thread oximeter/collector/src/results_sink.rs Outdated

@andrewjstone andrewjstone left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks @karencfv!

@karencfv karencfv disabled auto-merge February 27, 2025 03:50
@karencfv

karencfv commented Feb 27, 2025

Copy link
Copy Markdown
Contributor Author

So, the moment I removed the ping() the integration_tests::metrics::test_mgs_metrics test started failing.

I spun up a local omicron environment with the latest commit and things looked pretty weird:

  • When I run tail -F $(svcs -L oximeter) | looker on the oximeter zone the logs are whirling like crazy (much faster than I normally see)

    03:40:11.165Z DEBG oximeter (oximeter-agent): retrieving timeseries schema from database
      collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
      collector_ip = fd00:1122:3344:101::d
      id = fe406af5-eabe-47ec-b408-333280c15ed9
    03:40:11.267Z DEBG oximeter (oximeter-agent): retrieving timeseries schema from database
      collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
      collector_ip = fd00:1122:3344:101::d
      id = fe406af5-eabe-47ec-b408-333280c15ed9
    03:40:11.368Z DEBG oximeter (oximeter-agent): retrieving timeseries schema from database
      collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
      collector_ip = fd00:1122:3344:101::d
      id = fe406af5-eabe-47ec-b408-333280c15ed9
    03:40:11.469Z DEBG oximeter (oximeter-agent): retrieving timeseries schema from database
      collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
      collector_ip = fd00:1122:3344:101::d
      id = fe406af5-eabe-47ec-b408-333280c15ed9
  • Inside the clickhouse (single-node) zone, I went to check the table I normally check to make sure everything is fine (measurements_u64) and it had an unusually small number of records, 430. Usually this table gets filled up pretty fast and within a few minutes its number of rows is in the thousands. Now, it got stuck in 430 for a while and only increasing very slowly (it took a couple of minutes to reach 484 rows).

This appears to do with how Qorb's claim_timeout on the pool policy works.

https://github.com/oxidecomputer/qorb/blob/master/src/pool.rs#L313-L328

    async fn claim_or_enqueue(&mut self, tx: oneshot::Sender<Result<claim::Handle<Conn>, Error>>) {
        let result = self.claim().await;
        if result.is_ok() {
            let _ = tx.send(result);
            return;
        }
        // If we fail to claim a request...
        //
        // - Keep this request in a queue. A backend may gain slots or come
        // online later.
        // - Start the clock on a timeout.
        self.request_queue.push_back(ClaimRequest {
            tx,
            deadline: tokio::time::Instant::now() + self.policy.claim_timeout,
        });
    }

This means that instead of failing and dropping the connection, we were just enqueuing the claim indefinitely and clogging up oximeter.

To be certain I ran a manual test with ping() back in place, and sure enough, when on single-node mode only the row count was in the thousands within a few minutes

oxz_clickhouse_68b6dbbe-4c7e-4c3d-943c-a9ab1f89caf0.local :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: 0c9eb8f6-ea44-42ce-be1c-155e9bf26227

┌─count()─┐
│    5356 │
└─────────┘

I then set the policy to both single node and cluster again and results as expected:

single-node

oxz_clickhouse_68b6dbbe-4c7e-4c3d-943c-a9ab1f89caf0.local :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: 8124a48c-a085-44ea-af8a-fe33851eea42

┌─count()─┐
│   11278 │
└─────────┘

cluster

oximeter_cluster_3 :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: 0e2b3c7b-037a-47ef-a0da-bfe8e378f374

┌─count()─┐
│    4484 │
└─────────┘

Conclusion

ping() lives to see yet another day!

@karencfv

Copy link
Copy Markdown
Contributor Author

Also, without the ping we call insert_samples directly, and that method unrolls the samples before even trying to insert anything. We really shouldn't be doing any of that work if we can know ahead of time that we won't be able to insert to the cluster.

@bnaecker

Copy link
Copy Markdown
Collaborator

So, the moment I removed the ping() the integration_tests::metrics::test_mgs_metrics test started failing.

Interesting! This behavior is pretty surprising to me, and something I think we probably want to understand in more detail.

the oximeter zone the logs are whirling like crazy

These log messages are at almost exactly 100ms apart, which is the claim timeout interval, right? Is that a coincidence?

This means that instead of failing and dropping the connection, we were just enqueuing the claim indefinitely and clogging up oximeter.

I think this part is the main source of my confusion. Both ping() and insert_samples() must claim a connection from the qorb pool -- why would that checkout process behave differently in those two methods? There are DTrace probes in qorb that might help illuminate those claims, if that's useful.

We really shouldn't be doing any of that work if we can know ahead of time that we won't be able to insert to the cluster.

I don't think we know this, regardless of whether we call ping() or not, right? ping() could succeed, and then the database could become unreachable by the time we call insert_samples(). The converse could also happen, where ping() fails but the database would become reachable immediately after. This is the main motivation for just calling insert_samples() directly, in my opinion.

@jgallagher

Copy link
Copy Markdown
Contributor

This appears to do with how Qorb's claim_timeout on the pool policy works.

https://github.com/oxidecomputer/qorb/blob/master/src/pool.rs#L313-L328

    async fn claim_or_enqueue(&mut self, tx: oneshot::Sender<Result<claim::Handle<Conn>, Error>>) {
        let result = self.claim().await;
        if result.is_ok() {
            let _ = tx.send(result);
            return;
        }
        // If we fail to claim a request...
        //
        // - Keep this request in a queue. A backend may gain slots or come
        // online later.
        // - Start the clock on a timeout.
        self.request_queue.push_back(ClaimRequest {
            tx,
            deadline: tokio::time::Instant::now() + self.policy.claim_timeout,
        });
    }

This means that instead of failing and dropping the connection, we were just enqueuing the claim indefinitely and clogging up oximeter.

I don't think this description of what qorb is doing is right. If we don't get a successful claim on the first try at the beginning of the function, the claim request is enqueued but not indefinitely: it's specifically enqueued for policy.claim_timeout, which on this PR is only 100ms. If we land there, the Oximeter claim().await hasn't even returned yet; it will either get a claim before claim_timeout expires, or qorb will pop it off the queue after claim_timeout and return an error.

A bit of further evidence that there's no clogging happening: calling ping() itself requires claiming a connection from qorb! So if something is wrong on the qorb side where failed claims clog things up, adding more claims via calling ping() would only make that problem worse.

@karencfv

Copy link
Copy Markdown
Contributor Author

Interesting! This behavior is pretty surprising to me, and something I think we probably want to understand in more detail.

Yeah, I agree. I didn't merge this PR because I wasn't 100% confident in my conclusions.

I think this part is the main source of my confusion. Both ping() and insert_samples() must claim a connection from the qorb pool -- why would that checkout process behave differently in those two methods? There are DTrace probes in qorb that might help illuminate those claims, if that's useful.

Thanks! Reading this makes me wonder if perhaps this behaviour isn't due to qorb, but rather oximeter somehow?

I don't think we know this, regardless of whether we call ping() or not, right? ping() could succeed, and then the database could become unreachable by the time we call insert_samples(). The converse could also happen, where ping() fails but the database would become reachable immediately after. This is the main motivation for just calling insert_samples() directly, in my opinion.

Since the cluster installation won't be the source of truth for now in our customer installations, I'm not really worried about some samples dropping off on our dogfood rack. What I don't love is that for our customer racks, where we know for certain that the cluster won't be running, we're going to be unrolling samples.

I don't think this description of what qorb is doing is right. If we don't get a successful claim on the first try at the beginning of the function, the claim request is enqueued but not indefinitely: it's specifically enqueued for policy.claim_timeout, which on this PR is only 100ms. If we land there, the Oximeter claim().await hasn't even returned yet; it will either get a claim before claim_timeout expires, or qorb will pop it off the queue after claim_timeout and return an error.

hmmmmm... yeah that makes sense.

Thanks both for the feedback! I'll take a look again to be certain about what's happening.

I have a feeling the culprit is unroll_samples() which is called by insert_samples(). For each sample we call verify_or_cache_sample_schema

for sample in samples.iter() {
    match self.verify_or_cache_sample_schema(sample).await {
        Err(_) => {
            // Skip the sample, but otherwise do nothing. The error is logged in the above
            // call.
            continue;
        }
        Ok(None) => {}
        Ok(Some(schema)) => {
            debug!(
                self.log,
                "new timeseries schema";
                "timeseries_name" => %schema.timeseries_name,
                "schema" => ?schema,
            );
            new_schema.insert(schema.timeseries_name.clone(), schema);
        }
    }
}

Which in turn calls get_schema_locked() -> execute_with_block() -> pool.claim(). Which means that every time we call insert_samples(), oximeter tries to make a claim per sample in quick succession. The number of samples can vary greatly as seen in these logs from a previous deployment:

03:31:37.795Z DEBG oximeter (oximeter-agent): inserting 28 samples into database
    collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
    collector_ip = fd00:1122:3344:101::d
<...>
03:31:41.467Z DEBG oximeter (oximeter-agent): inserting 6666 samples into database
    collector_id = 8653a6ee-2561-4757-a7d8-8542ebd80c39
    collector_ip = fd00:1122:3344:101::d

With this information, it would appear that the system gets clogged up with all of these failed claims, but does not get clogged up with ping() because it is only attempting to make a single claim, and moving on if it can't.

I'll test out my assumptions on my local environment to see if I'm correct

@andrewjstone

Copy link
Copy Markdown
Contributor

@bnaecker, @jgallagher and I spent a while on a call discussing this and found a couple bugs in oximeter and a semantic change that we want to make to how multinode and single node clickhouse work with oximeter when both are enabled.

As I just discussed with @karencfv in chat, she nailed the first bug which is that we are taking a claim per each sample. Fixing that would solve our immediate problem here but is not a robust enough solution. John and Ben are writing up separate issues and then I'm going to discuss with @karencfv the fix for this PR in how we talk to both multinode and singlenode once those other issues are resolved.

@karencfv

Copy link
Copy Markdown
Contributor Author

Converting this PR to draft so it doesn't get merged by mistake or something

@karencfv karencfv marked this pull request as ready for review March 6, 2025 02:48
@karencfv

karencfv commented Mar 6, 2025

Copy link
Copy Markdown
Contributor Author

Thanks everyone with all the help with this one. I've updated the PR with @jgallagher's channel wrapper idea, and @andrewjstone's help with the implementation.

I've tested the PR on my local Helios machine with the following results:

Single node only

root@oxz_oximeter_b5c7eee8:~# tail -F $(svcs -L oximeter) | looker
02:31:52.928Z DEBG oximeter (oximeter-agent): inserting 326 samples into database
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
02:31:52.928Z DEBG oximeter (oximeter-agent): inserting 326 samples into database
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
02:31:52.928Z DEBG oximeter (oximeter-agent): unrolling 326 total samples
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
02:31:52.943Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 212
    table_name = fields_i16
02:31:52.946Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 617
    table_name = fields_string
02:31:52.948Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 106
    table_name = fields_u16
02:31:52.949Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 113
    table_name = fields_u32
02:31:52.951Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 112
    table_name = fields_u8
02:31:52.953Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 382
    table_name = fields_uuid
02:31:52.954Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 6
    table_name = measurements_bool
02:31:52.955Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 237
    table_name = measurements_cumulativeu64
02:31:52.957Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 1
    table_name = measurements_f64
02:31:52.958Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 82
    table_name = measurements_u64
02:31:53.030Z WARN oximeter (oximeter-agent): failed to insert some results into metric DB
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    error = Failed to check out connection to database: No backends found for this service
    file = oximeter/collector/src/results_sink.rs:92
oxz_clickhouse_e6f81622-3e69-4025-8691-c0b86368e1ed.local :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: ecb1575f-7e26-4c65-829b-f341c9e7e9a6

┌─count()─┐
│    2260 │
└─────────┘

Both single node and replicated cluster running

root@oxz_oximeter_b5c7eee8:~# tail -F $(svcs -L oximeter) | looker
02:38:43.425Z DEBG oximeter (oximeter-agent): inserting 1053 samples into database
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
02:38:43.425Z DEBG oximeter (oximeter-agent): unrolling 1053 total samples
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
02:38:43.452Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 1
    table_name = measurements_f64
02:38:43.465Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 12
    table_name = fields_ipaddr
02:38:43.467Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 1472
    table_name = fields_string
02:38:43.469Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 39
    table_name = fields_u16
02:38:43.471Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 276
    table_name = fields_u32
02:38:43.474Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 627
    table_name = fields_uuid
02:38:43.476Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 1018
    table_name = measurements_cumulativeu64
02:38:43.478Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 27
    table_name = measurements_histogramu64
02:38:43.480Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = c409b9d8-daea-4dd6-97f4-4cfbb106dd88
    n_rows = 8
    table_name = measurements_u64
02:38:43.527Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 82
    table_name = measurements_u64
02:38:43.527Z DEBG oximeter (oximeter-agent): inserting 1051 samples into database
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
02:38:43.527Z DEBG oximeter (oximeter-agent): unrolling 1051 total samples
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
02:38:43.628Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 8
    table_name = fields_ipaddr
02:38:43.697Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 1469
    table_name = fields_string
02:38:43.769Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 35
    table_name = fields_u16
02:38:43.854Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 276
    table_name = fields_u32
02:38:43.922Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 623
    table_name = fields_uuid
02:38:44.068Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 1016
    table_name = measurements_cumulativeu64
02:38:44.222Z DEBG oximeter (oximeter-agent): inserted rows into table
    collector_id = b5c7eee8-68a1-4f31-b929-ccb69bc4cc1a
    collector_ip = fd00:1122:3344:101::d
    id = d9b4b40e-4860-40a7-8d2d-c01983267bbe
    n_rows = 27
    table_name = measurements_histogramu64

Single node

oxz_clickhouse_e6f81622-3e69-4025-8691-c0b86368e1ed.local :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: f1bea008-df6e-4bff-bac2-c1b36df779d4

┌─count()─┐
│    8602 │
└─────────┘

1 row in set. Elapsed: 0.002 sec. 

Replicated cluster

oximeter_cluster_1 :) select count(*) from oximeter.measurements_u64

SELECT count(*)
FROM oximeter.measurements_u64

Query id: 18161053-b9d7-4801-a156-1c19cd1fc676

┌─count()─┐
│    4598 │
└─────────┘

1 row in set. Elapsed: 0.006 sec. 

I think I've addressed all comments, but please let me know if anything's missing!

@bnaecker bnaecker left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for updating this! I think the type used to fanout messages to the two kinds of databases seems OK, thanks. I have a few small comments, let me know if you have any questions!

Comment thread oximeter/collector/src/agent.rs Outdated
Comment thread oximeter/collector/src/agent.rs Outdated
Comment thread oximeter/collector/src/agent.rs
Comment thread oximeter/collector/src/collection_task.rs Outdated
"failed to send value from the collection task to channel for single node: {e:?}"
);
};
if let Err(e) = result_cluster {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, we always create the cluster_tx, right? That means we will log all these messages, even when we will never start a clustered database. Is that OK? It's "harmless", I'm really asking how hard it will be to filter out those log messages.

One idea that might help is splitting out the loggers for each kind of database, or somehow adding a key that lets us easily filter them. Right now, I think you'd have to do that by some string matching.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unlikely that there will be many of these messages. They should only happen when one of the tasks is down and the receiver gets dropped. This should only happen on Nexus shutdown or a task crash. Since there's only likely to be one of them, and very infrequently, I'm not sure there needs to be more detail than is given here.

bnaecker

This comment was marked as duplicate.

@andrewjstone andrewjstone left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes @karencfv!

"collector_ip" => address.ip().to_string(),
));

let collection_task_wrapper = CollectionTaskWrapper::new();

@andrewjstone andrewjstone Mar 6, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are only using a single node here, we can probably use the mpsc::channel directly.

@karencfv karencfv Mar 6, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really because OximeterAgent.result_sender is of type CollectionTaskSenderWrapper. Later, it's send() method is used by CollectionTask's run() method. I would have to modify this run() method to make it somehow aware of single node vs cluster real oximeter vs standalone oximeter.

Frankly, I would rather not do that since this code is all going to go away when we no longer run a single node installation, and this standalone oximeter is only used for testing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

"failed to send value from the collection task to channel for single node: {e:?}"
);
};
if let Err(e) = result_cluster {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unlikely that there will be many of these messages. They should only happen when one of the tasks is down and the receiver gets dropped. This should only happen on Nexus shutdown or a task crash. Since there's only likely to be one of them, and very infrequently, I'm not sure there needs to be more detail than is given here.

@karencfv karencfv left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews!

Comment thread oximeter/collector/src/agent.rs
"collector_ip" => address.ip().to_string(),
));

let collection_task_wrapper = CollectionTaskWrapper::new();

@karencfv karencfv Mar 6, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really because OximeterAgent.result_sender is of type CollectionTaskSenderWrapper. Later, it's send() method is used by CollectionTask's run() method. I would have to modify this run() method to make it somehow aware of single node vs cluster real oximeter vs standalone oximeter.

Frankly, I would rather not do that since this code is all going to go away when we no longer run a single node installation, and this standalone oximeter is only used for testing.

@karencfv karencfv enabled auto-merge (squash) March 6, 2025 21:48
@karencfv karencfv merged commit e0ca417 into oxidecomputer:main Mar 6, 2025
@karencfv karencfv deleted the oximeter-writes branch March 7, 2025 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[oximeter] Modify to write to both single node and replicated cluster

4 participants