Work Fake PR: MySQL 8.4.0 - Bug#106645: Slow query log is not logging database/schema name.#8
Work Fake PR: MySQL 8.4.0 - Bug#106645: Slow query log is not logging database/schema name.#8jfg956 wants to merge 64 commits intomysql-8.4.0_for_fake_prsfrom
Conversation
(I should have tested the test before comitting)
Add MariaDB output. Reorg in sections. Add testing notes.
sql/log.cc
Outdated
| } else { | ||
| log_db_change = false; /* Do not log db change if db is in the comment line. */ | ||
| db[0] = 0; /* Reset db triggers logging db change after disabling log_slow_extra_db. */ | ||
| if (thd->db().str) { |
There was a problem hiding this comment.
Honestly I'd just do this bit and not worry about the extra system variable.
adjust scripts/mysqldumpslow.pl.in
When finished add a sample: https://github.com/percona/percona-toolkit/tree/3.x/t/lib/samples/slowlogs
There was a problem hiding this comment.
Thanks for the feedback Daniel.
I'd just do this bit and not worry about the extra system variable.
If I do this, I might break script parsing the slow log, which I would like to avoid. With the variable and a default of ON, people will see the brreakage and they will have the liberty to fix their script of disable the feature. I prefer to give them this libeerty. Also, with a variable, this patch can be back-ported in 8.0 and 8.4, with a default of OFF. So I think I will keep it.
adjust scripts/mysqldumpslow.pl.in
No need, it already swallows everything after Id: ....
When finished add a sample [...]
Thanks for bringing this to my attention, I will look into this.
regex
Thanks for the pointer. Cheers, J-F
There was a problem hiding this comment.
Percona notified: https://perconadev.atlassian.net/browse/PT-2370
… for connection xxx'.
The new iterator based explains are not impacted.
The issue here is a race condition. More than one thread is using the
query term iterator at the same time (whoch is neithe threas safe nor
reantrant), and part of its state is in the query terms being visited
which leads to interference/race conditions.
a) the explain thread
uses an iterator here:
Sql_cmd_explain_other_thread::execute
is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:
for (auto qt : query_terms<>()) {
if (qt->query_block() == qb) {
return qt;
}
}
the above search fails to find qb due to the interference of the
thread b), see below, and then tries to access a nullpointer:
* thread mysql#36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
* frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
frame mysql#20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
frame mysql#21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15
b) the query thread being explained is itself performing LEX::cleanup
and as part of the iterates over the query terms, but still allows
EXPLAIN of the query plan since
thd->query_plan.set_query_plan(SQLCOM_END, ...)
hasn't been called yet.
20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
66:frame: thread_start (in libsystem_pthread.dylib) + 15
Solution:
This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector. The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.
A unit test case is added to check reentrancy.
One can also manually verify that we have no remaining race condition
by running two client connections files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.
EXPLAIN FORMAT=json FOR CONNECTION 8\G
in the other. The actual connection number would need to verified
in connection one, of course.
Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
In my patch for Bug#106645, I am allowing myself to remove the usage of the buff_len variable for administrator commandan. IMHO, it is better cose as it avoids re-using this variable, but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to change the format of Id. This avoids a snprintf and is more consistent with Thread_id below (which looks better). IMHO, addind Db is the right time to improve on the formatting of this line, but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to inline the declaration of buff_len. IMHO, this makes things better as after removing the usage of buff_len further down below, here is the only place where it is used, but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to change the initial assigment of end from buff to NULL and assigning it to buff when used. IMHO it makes more clear that there are many independant usages of buff in this function, but feel free to revert if you do not like it.
The Cherry-pick details are moved to the brain dump.
|
Contributed in below, closing, but keeping the branch. |
This is a Work Fake PR, with a lot of noisy commit. Please "thumbs-up" this PR if you think this work is important (other encouragement welcome).
This work is about implementint (not fixing, because it is a feature request) Bug#106645: Slow query log is not logging database/schema name.
More details about this work in the brain dump.
This PR is mostly for my own usage to see a clean diff of the whole work. You can subscribe to it to follow my work, and if you have preliminary comments, feel free to add them in here.