Fix profiling not recording data for remote SQL and metadata queries#530
Merged
Conversation
_remote_sql() and _execute_metadata_query() called chdb.query() directly, bypassing the profiler. This caused profiler.report() to return "No profiling data recorded" when users used session.sql() on remote connections or called databases()/tables().
UNION ALL does not guarantee row order in ClickHouse. The bare ORDER BY after UNION ALL only applied to the last SELECT, not the full result. Wrap all UNION ALL queries in a subquery so ORDER BY applies to the combined result.
- Rename test methods to reflect exactly what is verified - Verify specific step names and exact counts instead of just len() > 0 - Validate time_ms is non-negative and sql preview is non-empty - Add missing _remote_describe() profiling coverage - Add len(profiler.steps) == 0 assertion for disabled state - Add descriptive error messages to all assertions
Column assignment is an order-preserving operation — adding sort_values in tests masks potential row-ordering bugs in DataStore. The fixture already provides ordered data via ORDER BY, so tests should verify DataStore preserves that order without extra sorting. Also verify actual values in test_json_array_extraction_in_pure_chdb instead of only checking len().
TestRemoteClickHouseIntegration was hitting the public demo server (sql-clickhouse.clickhouse.com) which has a 200 queries/hour quota. CI runs across multiple platforms easily exceed this, causing flaky failures unrelated to code changes. Switch to the local ClickHouse test server managed by conftest.py (clickhouse_server fixture), which is already used by test_remote_connection_integration.py.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_remote_sql()calledchdb.query()directly, bypassing the profiler entirely. Users callingsession.sql("SELECT ...")on remote ClickHouse connections got "No profiling data recorded" fromprofiler.report()._execute_metadata_query()had the same issue —databases(),tables(), and_remote_describe()were invisible to the profiler.profiler.step()with timing and SQL preview metadata to both methods, consistent with the existing pattern inConnection.execute()/Connection.query_df().Test plan
datastore/tests/test_profiling.py— all passingunittest.mock.patch("chdb.query")so no real server neededsql,time_ms)disable_profiling()is active