Skip to content

Proper Materialized views startup dependencies#72123

Merged
serxa merged 58 commits intoClickHouse:masterfrom
ilejn:mv_dependencies
Apr 23, 2025
Merged

Proper Materialized views startup dependencies#72123
serxa merged 58 commits intoClickHouse:masterfrom
ilejn:mv_dependencies

Conversation

@ilejn
Copy link
Copy Markdown
Contributor

@ilejn ilejn commented Nov 20, 2024

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

A materialized view can start too late, e.g. after the Kafka table that streams to it.

@ilejn ilejn marked this pull request as draft November 20, 2024 08:59
@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Nov 20, 2024

Hello @serxa , could you have a look please since you worked on async table loading?

The change really cures the issue, although I am still working on it because I am not sure why

  1. TablesLoader and DataBaseCatalog coexist and keep same or similar dependencies
  2. we need checkDependencies() for Kafka (StorageKafkaUtil.cpp) and other streaming engines, instead of having proper load dependencies. In other words, why "loading dependencies are a subset of referential dependencies". What prevents us from treating referential dependencies as loading dependencies?

@nikitamikhaylov nikitamikhaylov added the can be tested Allows running workflows for external contributors label Nov 20, 2024
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-not-for-changelog This PR should not be mentioned in the changelog label Nov 20, 2024
@robot-clickhouse
Copy link
Copy Markdown
Member

robot-clickhouse commented Nov 20, 2024

This is an automated comment for commit cd1cc0c with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check nameDescriptionStatus
BuildsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS❌ failure
Integration testsThe integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests❌ failure
Successful checks
Check nameDescriptionStatus
AST fuzzerRuns randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help✅ success
BuzzHouse (asan)There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
BuzzHouse (debug)There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
BuzzHouse (msan)There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
BuzzHouse (tsan)There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
BuzzHouse (ubsan)There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
ClickBenchRuns ClickBench with instant-attach table✅ success
Compatibility checkChecks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help✅ success
Docker keeper imageThe check to build and optionally push the mentioned image to docker hub✅ success
Docker server imageThe check to build and optionally push the mentioned image to docker hub✅ success
Docs checkBuilds and tests the documentation✅ success
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here✅ success
Install packagesChecks that the built packages are installable in a clear environment✅ success
Performance ComparisonMeasure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests✅ success
Stateless testsRuns stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc✅ success
Stress testRuns stateless functional tests concurrently from several clients to detect concurrency-related errors✅ success
Style checkRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report✅ success
Unit testsRuns the unit tests for different release types✅ success
Upgrade checkRuns stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts✅ success

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Nov 22, 2024

Hello @serxa , could you have a look, please.

@serxa serxa self-assigned this Nov 27, 2024
@serxa
Copy link
Copy Markdown
Member

serxa commented Nov 27, 2024

  1. TablesLoader and DataBaseCatalog coexist and keep same or similar dependencies

They serve different purposes. DataBaseCatalog is a thing that stores everything consistently, a source of truth if you will. TablesLoader is just a simple helper class that exists to extract common code from two different places: (a) server startup and (b) CREATE/ATTACH TABLE queries. This helper arranges "loading" in a way that is legitimate for the catalog to process without issues. This is the main reason for them to share similar things like dependency graphs. The catalog is just not smart enough to plan the chain (the order) of actions needed to load a database. Previously there was a synchronous loading-by-levels algorithm which was replaced by AsyncLoader. And now, probably, TablesLoader might be a part of DataBaseCatalog, because it is now much easier to track all the dependencies. But still, it would be a large refactoring to do with no clear objectives.

  1. we need checkDependencies() for Kafka (StorageKafkaUtil.cpp) and other streaming engines, instead of having proper load dependencies. In other words, why "loading dependencies are a subset of referential dependencies". What prevents us from treating referential dependencies as loading dependencies?

Any unnecessary load dependency may potentially slow down server startup because more tables will be loaded sequentially instead of in parallel. So we tried our best to separate different kinds of dependencies. IIRC referential dependencies exist only to avoid dropping a table that is mentioned in the definition of another table, but such tables could be loaded in parallel. cc @tavplubix

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Nov 27, 2024

@serxa, thanks for reacting and for clarification.

I have to admit that I am not sure that current solution is optimal/most elegant.
Basically what we do is carefully detect dependencies, then ignore them starting up streaming tables too early (before their dependencies are ready) only to make all streaming tables to call checkDependencies() from their setup methods to wait proper moment to start.

I see following options

  1. limit the scope by fixing the bug, make this PR suitable for merge (cleanup + rename pushDependencies to preSetup() or postLoad()) as it currently is.
  2. move addViewDependency call out of MaterializedView, somewhere to TablesLoader, probably to DDLDependencyVisitor (not sure if it is possible, but should be)
  3. include view dependencies to load dependencies for tables with a streaming engine.

(3) eliminates need of checkDependencies, while these methods are needed for (1) and (2), but since the code is the same, it can be a single method of DataBaseCatalog.

Suggestions are appreciated.

@ilejn ilejn changed the title Proper Materailzied views startup dependencies Proper Materialized views startup dependencies Nov 27, 2024
@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Nov 28, 2024

BTW, if I am not mistaken, check for ErrorCodes::TOO_MANY_MATERIALIZED_VIEWS
is not reliable.

@serxa
Copy link
Copy Markdown
Member

serxa commented Dec 3, 2024

@ilejn

Basically what we do is carefully detect dependencies, then ignore them starting up streaming tables too early (before their dependencies are ready) only to make all streaming tables to call checkDependencies() from their setup methods to wait proper moment to start.

You are completely right, this is nonsense. I think option (3) is almost the right way to do it. We should get rid of checkDependencies(). And add correct dependencies in AsyncLoader for views. Now we have the following order of execution in case of load dependency (note that dependencies point backward to arrows in the diagram):

flowchart TD
    load_table1["load table1"] --> startup_table1["startup table1"]
    load_table2["load table2"] --> startup_table2["startup table2"]
    load_table1 --> load_table2
Loading

Which prevents execution parallel loading. But for views, we could instead use dependencies between startup jobs to make the following order of execution:

flowchart TD
    load_table1["load table1"] --> startup_table1["startup table1"]
    load_table2["load table2"] --> startup_table2["startup table2"]
    startup_table1 --> startup_table2
Loading

This won't prevent parallel execution of load jobs. And startup jobs are not long and it would not be a problem. Whenever a table is accessed by INSERT query it waits (blocks) until the startup job is done. So this logic remains the same and will automagically resolve the issue instead of the checkDependencies() hack.

To implement this we should pass view_dependencies into TablesLoader and use it for job construction in TablesLoader::startupTablesAsync() method. It could be done in a way similar to TablesLoader::loadTablesAsync(). Iteration should be done in view_dependencies.getTablesSortedByDependency() order as well.

Seems not too hard at the first glance. Would you like to give it a try in this PR?

@serxa
Copy link
Copy Markdown
Member

serxa commented Dec 3, 2024

BTW, if I am not mistaken, check for ErrorCodes::TOO_MANY_MATERIALIZED_VIEWS
is not reliable.

What do you mean?

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Dec 3, 2024

BTW, if I am not mistaken, check for ErrorCodes::TOO_MANY_MATERIALIZED_VIEWS
is not reliable.

What do you mean?

We do this check in StorageMaterializedView ctor based on data collected in StorageMaterializedView::startup .
Switch to creating view dependencies in TablesLoader::loadTablesAsync solves this issues among others.

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Dec 3, 2024

To implement this we should pass view_dependencies into TablesLoader

Yes.
Retrieving view dependencies by DDLDependencyVisitorData is not perfect, but ok for me.

and use it for job construction in TablesLoader::startupTablesAsync() method. It could be done in a way similar to TablesLoader::loadTablesAsync(). Iteration should be done in view_dependencies.getTablesSortedByDependency() order as well.

Should we limit the scope of this extra loop by streaming alike tables only?

Seems not too hard at the first glance.

Will see ;)

Would you like to give it a try in this PR?

Yes, sure.

BTW, I have an impression that in case of streaming tables and their MVs we don't have an implicit synchronization point, "Whenever a table is accessed by INSERT query it waits (blocks) until the startup job is done" does not work for them, that's why checkDependencies() are needed. Does it make sense?

@serxa
Copy link
Copy Markdown
Member

serxa commented Dec 3, 2024

I have an impression that in case of streaming tables and their MVs we don't have an implicit synchronization point

I'm not very deep in that streaming code, but yes, looks like that is true.

Should we limit the scope of this extra loop by streaming alike tables only?

No, I believe it is needed for all tables. So it will be not an extra loop, but improvement of existing one that is already present there (current loop just iterates tables in some unspecified order). In #72589 streaming is not mentioned, but async_load_tables was identified as the root cause. It is just because without async loading there was no way to do INSERT in the middle (when the source table is started, but target is not started yet). With async load of tables it is now possible.

So we will fix both issues by introducing these new dependencies directly into AsyncLoader DAG.

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Dec 4, 2024

@serxa, I have several assorted concerns/questions/assumptions (probably not all of them make sense) based on your graphs and explanations.

  1. all loads precede all startups, right? (would be nice to make timeline more verbose on the graphs)
  2. I used to think that startup is much more time consuming than load.
  3. Your second picture illustrates CREATE MATERIALIZED VIEW mv TO table1 AS SELECT * FROM table2 leaving mv out of the scope, right?
  4. view_dependencies we have in master are from view to table (view depends on its target table), I am not sure that current view dependencies by themselves are suitable to track dependencies from MV's target table to source(s) tables(s)
  5. What we probably need - MV' source table(s) depends on target table
  6. The whole point of the suggested change is to add more dependencies for startup (besides current load_table[table_id.getFullTableName()]->goals() we plan to introduce some extra ones that come from view dependencies)
  7. Why do we need extra dependencies for normal tables? Implicit synchronization upon INSERT/SELECT works fine for them. Of course, we have to avoid race condition creating view dependencies, and it should be enough.
  8. For streaming tables we have to have startup of all dependent (child) tables completed at the moment parent table' startup is called. For normal tables A and B that have a MV between, we can run startup methods in parallel. That is why I think that we have to treat streaming tables in a special way.

It might be more convenient to communicate via Telegram. May I write you (I know username)?

@ilejn ilejn force-pushed the mv_dependencies branch 3 times, most recently from 088a82b to 618a436 Compare January 10, 2025 09:41
@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Feb 5, 2025

Looking into test_storage_s3_queue/test.py::test_registry failure.
Despite the fact it is known to be flaky.

Update: matter of delay. I've managed to reproduce this on my laptop, artificially slowing it down.

@kssenii , I suggest increasing timeout.

--- a/tests/integration/test_storage_s3_queue/test.py
+++ b/tests/integration/test_storage_s3_queue/test.py
@@ -2612,7 +2612,7 @@ def test_registry(started_cluster):
         )
 
     expected_rows = files_to_generate
-    for _ in range(20):
+    for _ in range(40):
         if expected_rows == get_count():
             break
         time.sleep(1)

@ilejn ilejn force-pushed the mv_dependencies branch 2 times, most recently from 8e82023 to cd1cc0c Compare February 6, 2025 13:11
@serxa
Copy link
Copy Markdown
Member

serxa commented Mar 27, 2025

I resolved new conflicts. It seems now mergeable again. Tests are green except the flaky one.

PR / Bugfix validation (pull_request)Failing after 60m

This thing I do not understand:

Run action failed for: [Bugfix validation] with exit code [-15]
Job timed out: [Bugfix validation] exit code [-15]
ERROR: Run action failed with timeout and did not generate JobReport - update dummy report with execution time
Command `python3 /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/tests/ci/bugfix_validate_check.py` has failed, timeout 2400s is exceededRun action done for: [Bugfix validation]
ERROR: Job was killed - generate evidence
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
Posting slack message, dry_run [False]
{}
ERROR: Run failed with exit code [241]

@serxa
Copy link
Copy Markdown
Member

serxa commented Mar 27, 2025

This thing I do not understand

It seems to be the same problem of long tests, but it showed an empty report, so I did not realize it at the first glance. Let's override mergeable check one more time and wait for private CI to finish.

@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Mar 27, 2025

This thing I do not understand

It seems to be the same problem of long tests, but it showed an empty report, so I did not realize it at the first glance. Let's override mergeable check one more time and wait for private CI to finish.

Seems reasonable unfortunately.
The test behaves like flaky check does.

@serxa serxa enabled auto-merge March 27, 2025 15:29
@ilejn
Copy link
Copy Markdown
Contributor Author

ilejn commented Apr 7, 2025

Hello @serxa ,
what can I do to have it merged?

@serxa
Copy link
Copy Markdown
Member

serxa commented Apr 7, 2025

I think I messed up while resolving conflicts in private branch. And now there are failed tests. I was distracted, but I'll try to fix it now.

if (!table_id.table_name.empty())
{
mv_to_dependency = table_id;
if (mv_to_dependency->getDatabaseName().empty())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two places where getDatabaseName().empty() is called. But getDatabaseName() throws an exception on an empty database. I dont know why it has not been found by public CI, but some private tests found this problem.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great finding, I'll fix it by EOD.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serxa , could you share anything that facilitates reproduction?
Cannot manage yet.

Probably this code is 'inspired' by

if (qualified_name.database.empty())

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I believe the problem is specific to private code because it uses catalog for tracking dependencies and got exception because it uses uuid instead of db.table DB::Exception: Received from 172.16.2.10:9000. DB::Exception: Database name is empty: While processing CREATE MATERIALIZED VIEW _ UUID 'ddfb88c6-d551-4467-bc53-b9244f3a8ef0' REFRESH EVERY 1 YEAR SETTINGS all_replicas = 1 APPEND TO INNER UUID '8e252499-6086-443b-8b65-a466db04ccb6' (x Int64) ENGINE = SharedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') ORDER BY x SETTINGS index_granularity = 8192 DEFINER = default SQL SECURITY DEFINER AS SELECT rand() AS x

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just make it similar to what it was. I think it should do the trick

QualifiedTableName target_name{table_id.database_name, table_id.table_name};
if (target_name.database.empty())
    target_name.database = current_database;

auto-merge was automatically disabled April 11, 2025 23:08

Head branch was pushed to by a user without write access

@serxa
Copy link
Copy Markdown
Member

serxa commented Apr 16, 2025

Private tests should be fixed now. At least local check does not reveal any problems. But public CI is in bad shape at the moment...

@serxa serxa enabled auto-merge April 16, 2025 13:53
@serxa
Copy link
Copy Markdown
Member

serxa commented Apr 17, 2025

@ilejn Let's merge master into this branch one more time

@serxa
Copy link
Copy Markdown
Member

serxa commented Apr 23, 2025

Private Sync is finally green, but for some reason, CI here does not recognize this fact here.

@serxa serxa added this pull request to the merge queue Apr 23, 2025
Merged via the queue into ClickHouse:master with commit c7357e5 Apr 23, 2025
113 of 122 checks passed
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 23, 2025
@serxa
Copy link
Copy Markdown
Member

serxa commented Apr 23, 2025

I'm so glad it is finally merged. Thanks to everybody involved!

@evillique
Copy link
Copy Markdown
Member

For visibility: #82222

@bobelev
Copy link
Copy Markdown

bobelev commented Nov 4, 2025

Any chance of backporting it to 25.3?

@serxa
Copy link
Copy Markdown
Member

serxa commented Nov 4, 2025

Any chance of backporting it to 25.3?

No chances, this rework is too big

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.