Add instrumentation using XRay#77153
Conversation
|
Glad you're already working on this @htuall 🎉! Please next time ping me directly or even assign the PR to me so that I'm aware. Somehow this went off under my radar till now 👀 Have you checked how the llvm-xray tool is doing this same thing? I assumed we could get the mapping the same way and our part would be to find the proper function addresses based on non-mangled function names, which is what I think Also, I'm afraid I can't yet understand Russian comments that well even after @nikitamikhaylov's efforts to teach me the basics. So, please use English whenever possible so that I don't have to be going back and forth translating 🙏 |
… funcIds to function names
…h custom handler for xray + started working on adding SYSTEM-statements to add/remove instrumentation(wip)
pamarcos
left a comment
There was a problem hiding this comment.
You're going in the right direction :)
I merged master against your branch and also removed the unnecessary binary artifacts that I think you added by mistake.
Please, make sure that ClickHouse is compiling correctly. Also some test even if very basic would be nice. Even just to ensure that the new system statement works.
…atements using it + attempted to add system.instrumentation table(not sure if it's coorrect
642f177 to
175386a
Compare
Let's build with ENABLE_XRAY=1 only for the Debug build on AMD for now. This is just to ensure on the CI that it builds correctly. There are some USE_XRAY missing in other files, but it is left on purpose as an excercise to the reader :)
…(needs check) + includes to cpp from header
|
Creating a Chrome trace format from Ask to profile and sleep a bit for every query entry to have some data: SYSTEM INSTRUMENT ADD `QueryMetricLog::startQuery` PROFILE
SYSTEM INSTRUMENT ADD `QueryMetricLog::startQuery` SLEEP ENTRY 0.5Send 1000 parallel queries using 24 parallel jobs: yes "clickhouse client --query 'select 1'" | head -n 1000 | parallel -j24The SELECT
format(
'{{"traceEvents": [{}]}}',
arrayStringConcat(
groupArray(
concat(
-- Begin Event
format(
'\n{{"name": "{}", "cat": "{}", "ph": "B", "ts": {}, "pid": 1, "tid": {}, "args": {{"query_id": "{}"}}}}',
name,
'clickhouse',
toString(event_time_microseconds),
toString(tid),
query_id
),
',',
-- End Event
format(
'\n{{"name": "{}", "cat": "{}", "ph": "E", "ts": {}, "pid": 1, "tid": {}, "args": {{"query_id": "{}"}}}}',
name,
'clickhouse',
toString(event_time_microseconds + duration_microseconds),
toString(tid),
query_id
)
)
),
','
)
) AS trace_json
FROM system.xray_instrumentation_profiling_log
WHERE event_date >= today();clickhouse client --query "$(cat chrome_trace.sql)" > trace.jsonWe can see in Perfetto
|
|
Comparing the binary size: Random PR:
This PR:
For x86_64, there's a ~4.5% of binary size increase. For aarch64, there's a ~7% binary size increase. |
|
It seems we cannot use XRay and any other sanitizer at the same time as per https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=77153&sha=23c6ed27502189ca2df435a42c653cb5ed18896c&name_0=PR Even though I haven't found on the internet any evidence of them being incompatible, both sanitizers and XRay seem to use the same underlying allocator and runtime infrastructure from sanitizer_common, which is why you get "multiple definition" errors for symbols like __sanitizer::internal_allocator(). |

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Add instrumentation using XRay to parse xray_instr_map section generated by xray and map entry addresses to function names. Resolves #74249
Details
Created parse_elf.cpp designated to parse xray_instr_map section generated by xray to then map entry addresses to function names. Currently not finished, needs further work, submitting as a checkpoint to track progress. Current problem: function address that I get from xray_instr_map does not match to the expected address of the function, which I also get from elf-binary. Possible reasons: 1) incorrect parsing of the section; 2) some relocations in elf-binary.
Resolves #74249