Skip to content

Work Fake PR: MySQL 8.4.0 - Bug#106645: Slow query log is not logging database/schema name.#8

Closed
jfg956 wants to merge 64 commits intomysql-8.4.0_for_fake_prsfrom
mysql-8.4.0_bug106645
Closed

Work Fake PR: MySQL 8.4.0 - Bug#106645: Slow query log is not logging database/schema name.#8
jfg956 wants to merge 64 commits intomysql-8.4.0_for_fake_prsfrom
mysql-8.4.0_bug106645

Conversation

@jfg956
Copy link
Owner

@jfg956 jfg956 commented May 9, 2024

This is a Work Fake PR, with a lot of noisy commit. Please "thumbs-up" this PR if you think this work is important (other encouragement welcome).

This work is about implementint (not fixing, because it is a feature request) Bug#106645: Slow query log is not logging database/schema name.

More details about this work in the brain dump.

This PR is mostly for my own usage to see a clean diff of the whole work. You can subscribe to it to follow my work, and if you have preliminary comments, feel free to add them in here.

@jfg956 jfg956 changed the title Init branch and brain dump. MySQL 8.4.0 Bug#106645: Slow query log is not logging database/schema name. May 13, 2024
@jfg956 jfg956 changed the title MySQL 8.4.0 Bug#106645: Slow query log is not logging database/schema name. MySQL 8.4.0 - Bug#106645: Slow query log is not logging database/schema name. May 13, 2024
sql/log.cc Outdated
} else {
log_db_change = false; /* Do not log db change if db is in the comment line. */
db[0] = 0; /* Reset db triggers logging db change after disabling log_slow_extra_db. */
if (thd->db().str) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I'd just do this bit and not worry about the extra system variable.

adjust scripts/mysqldumpslow.pl.in

When finished add a sample: https://github.com/percona/percona-toolkit/tree/3.x/t/lib/samples/slowlogs

regex:
https://github.com/percona/percona-toolkit/blob/83ba470afe5008276a7656102b8abe0cf40a31e6/lib/SlowLogParser.pm#L45

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback Daniel.

I'd just do this bit and not worry about the extra system variable.

If I do this, I might break script parsing the slow log, which I would like to avoid. With the variable and a default of ON, people will see the brreakage and they will have the liberty to fix their script of disable the feature. I prefer to give them this libeerty. Also, with a variable, this patch can be back-ported in 8.0 and 8.4, with a default of OFF. So I think I will keep it.

adjust scripts/mysqldumpslow.pl.in

No need, it already swallows everything after Id: ....

When finished add a sample [...]

Thanks for bringing this to my attention, I will look into this.

regex

Thanks for the pointer. Cheers, J-F

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jfg956 pushed a commit that referenced this pull request Jul 4, 2024
… for connection xxx'.

The new iterator based explains are not impacted.

The issue here is a race condition. More than one thread is using the
query term iterator at the same time (whoch is neithe threas safe nor
reantrant), and part of its state is in the query terms being visited
which leads to interference/race conditions.

a) the explain thread

uses an iterator here:

   Sql_cmd_explain_other_thread::execute

is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:

   for (auto qt : query_terms<>()) {
       if (qt->query_block() == qb) {
           return qt;
       }
   }

the above search fails to find qb due to the interference of the
thread b), see below, and then tries to access a nullpointer:

    * thread mysql#36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
  frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
  frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
  frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
  frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
  frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
  frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
  frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
  frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
  frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
  frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
  frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
  frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
  frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
  frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
  frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
  frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
  frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
  frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
  frame mysql#20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
  frame mysql#21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15

b) the query thread being explained is itself performing LEX::cleanup
and as part of the iterates over the query terms, but still allows
EXPLAIN of the query plan since

   thd->query_plan.set_query_plan(SQLCOM_END, ...)

hasn't been called yet.

     20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
     21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
     22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
     30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
     31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
     32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
     33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
     34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
     57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
     58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
     65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
     66:frame: thread_start (in libsystem_pthread.dylib) + 15

Solution:

This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector.  The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.

A unit test case is added to check reentrancy.

One can also manually verify that we have no remaining race condition
by running two client connections files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.

    EXPLAIN FORMAT=json FOR CONNECTION 8\G

in the other. The actual connection number would need to verified
in connection one, of course.

Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
jfg956 added 17 commits July 4, 2024 14:05
In my patch for Bug#106645, I am allowing myself to remove the usage of the buff_len
  variable for administrator commandan.  IMHO, it is better cose as it avoids re-using this variable,
  but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to change the format of Id.
  This avoids a snprintf and is more consistent with Thread_id below (which looks better).
  IMHO, addind Db is the right time to improve on the formatting of this line,
  but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to inline the declaration of buff_len.
  IMHO, this makes things better as after removing the usage of buff_len further down below,
  here is the only place where it is used, but feel free to revert if you do not like it.
In my patch for Bug#106645, I am allowing myself to change the initial assigment of end
  from buff to NULL and assigning it to buff when used.  IMHO it makes more clear
  that there are many independant usages of buff in this function,
  but feel free to revert if you do not like it.
The Cherry-pick details are moved to the brain dump.
@jfg956
Copy link
Owner Author

jfg956 commented Nov 22, 2024

Contributed in below, closing, but keeping the branch.

@jfg956 jfg956 closed this Nov 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants