
Make auth_socket more flexible. #1

Closed
dveeden wants to merge 1 commit into mysql:5.6 from dveeden:auth_sock_flex

Conversation


@dveeden dveeden commented Oct 27, 2014

Now supports "AS '<user>'" so it doesn't have to be a 1-on-1 mapping.

Example:
CREATE USER 'foo'@'localhost' IDENTIFIED WITH auth_socket AS 'myuser';
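The effect of the patch can be sketched as follows (an illustrative Python model, not the plugin's actual C implementation; the function and parameter names are hypothetical): with an empty authentication string the classic 1-on-1 mapping applies, otherwise the OS user is compared against the string.

```python
def auth_socket_check(os_user, account_user, auth_string):
    """Decide whether a Unix-socket peer may log in.

    os_user:      login name of the OS user on the other end of the socket
                  (the server obtains this via SO_PEERCRED).
    account_user: the MySQL account name ('foo' in the example above).
    auth_string:  the value given with AS '...'; empty means the classic
                  1-on-1 mapping where OS user and MySQL user must match.
    """
    expected = auth_string if auth_string else account_user
    return os_user == expected
```

With the example account above, the OS user 'myuser' can log in as the MySQL user 'foo' even though the two names differ.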

roel666 commented Oct 27, 2014

Thanks


dveeden commented Oct 27, 2014

The MySQL Bug ID for this:
http://bugs.mysql.com/74586


MyDanny commented Oct 29, 2014

Thank you. Unfortunately, we can't accept pull requests on github. I see you've already submitted your patch via the bug tracker, so I'm closing this pull request.

@MyDanny MyDanny closed this Oct 29, 2014
MyDanny pushed a commit that referenced this pull request Mar 10, 2015
…rows.

Background 1:

When binlog_format = row, CREATE ... SELECT is logged in two pieces,
like:

Anonymous_Gtid
query_log_event(CREATE TABLE without SELECT)
Anonymous_Gtid
query_log_event(BEGIN)
...row events...
query_log_event(COMMIT) (or Xid_log_event)

Internally, there is a call to MYSQL_BIN_LOG::commit after the table has
been created and before the rows are selected.

When gtid_next='ANONYMOUS', we must not release anonymous ownership for
the commit occurring in the middle of the statement (since that would
allow a concurrent client to set gtid_mode=on, making it impossible to
commit the rest of the statement). Also, the commit in the middle of the
statement should not decrease the counter of ongoing GTID-violating
transactions, since that would allow a concurrent client to set
ENFORCE_GTID_CONSISTENCY=ON even if there is an ongoing transaction that
violates GTID-consistency.

The logic to skip releasing anonymous ownership and skip decreasing the
counter is as follows. Before calling mysql_bin_log.commit, it sets the
flag thd->is_commit_in_middle_of_statement. Eventually,
mysql_bin_log.commit calls gtid_state->update_on_commit, which calls
gtid_state->update_gtids_impl, which reads the
thd->is_commit_in_middle_of_statement and accordingly decides to skip
releasing anonymous ownership and/or skips decreasing the counter.

Problem 1:

When thd->is_commit_in_middle_of_statement has been set, it is crucial
that there is another call to update_gtids_impl when the transaction
ends (otherwise the session will keep holding anonymous ownership and
will not decrease the counters). Normally, this happens because
mysql_bin_log.commit is always called, and mysql_bin_log.commit normally
invokes ordered_commit, which calls update_gtids_impl. However, in case
the SELECT part of the statement does not find any rows,
mysql_bin_log.commit skips the call to ordered_commit, so
update_gtids_impl does not get called. This is the first problem we fix
in this commit.

Fix 1:

We fix this problem as follows. After calling mysql_bin_log.commit to
log the CREATE part of CREATE...SELECT, the CREATE...SELECT code sets
thd->pending_gtid_state_update=true (this is a new flag that we
introduce in this patch). If the flag is set, update_gtids_impl clears
it. At the end of mysql_bin_log.commit, we check the flag to see if
update_gtids_impl has been called by any function invoked by
mysql_bin_log.commit. If not, i.e., if the flag is still true at the end
of mysql_bin_log.commit, it means we have reached the corner case where
update_gtids_impl was skipped. Thus we call it explicitly from
mysql_bin_log.commit.
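The flag handshake described above can be modeled in a few lines of illustrative Python (a simplification of the sql/binlog.cc logic; the class and argument names are only loosely based on the patch):

```python
class Thd:
    """Session state: just the flag introduced by the patch."""
    def __init__(self):
        self.pending_gtid_state_update = False

class GtidState:
    """Toy model counting how many times update_gtids_impl runs."""
    def __init__(self):
        self.updates = 0

    def update_gtids_impl(self, thd):
        self.updates += 1
        thd.pending_gtid_state_update = False  # mark the update as done

def binlog_commit(thd, gtid_state, rows_found):
    # The CREATE part of CREATE...SELECT sets the flag before committing.
    thd.pending_gtid_state_update = True
    if rows_found:
        # Normal path: ordered_commit runs and calls update_gtids_impl.
        gtid_state.update_gtids_impl(thd)
    # Corner case: the SELECT found no rows and ordered_commit was
    # skipped, so the flag is still set; call update_gtids_impl explicitly.
    if thd.pending_gtid_state_update:
        gtid_state.update_gtids_impl(thd)
```

In both paths update_gtids_impl runs exactly once, so anonymous ownership is released and the counters are decreased.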

Background 2:

GTID-violating DDL (CREATE...SELECT and CREATE TEMPORARY) is detected in
is_ddl_gtid_compatible, called from gtid_pre_statement_checks, which is
called from mysql_parse just before the implicit pre-commit.
is_ddl_gtid_compatible determines whether an error or warning or nothing
is to be generated, and whether to increment the counters of GTID-
violating transactions. In case an error is generated, it is important
that the error happens before the implicit commit, so that the statement
fails before it commits the ongoing transaction.

Problem 2:

In case a warning is to be generated, and there is an ongoing
transaction, the implicit commit will write to the binlog, and thus it
will call gtid_state->update_gtids_impl, which will decrease the
counters of GTID-violating transactions. Thus, the counters will be zero
for the duration of the transaction.

Fix 2:

We call is_ddl_gtid_compatible *both* before the implicit commit and
after the implicit commit. If an error is to be generated, the error is
generated before the commit. If a warning is to be generated and/or the
counter of GTID-violating transactions is to be increased, then this
happens after the commit.
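The before/after split can be condensed into an illustrative Python sketch (parameter names follow the patch notes; the state dictionary and return values are hypothetical simplifications):

```python
def is_ddl_gtid_compatible(state, handle_error, handle_nonerror):
    """Toy model of the split check. Returns 'error', 'warning' or None."""
    if not state['violates']:
        return None
    if state['needs_error'] and handle_error:
        return 'error'                   # raised before the implicit commit
    if not state['needs_error'] and handle_nonerror:
        state['violation_count'] += 1    # counter bumped after the commit
        return 'warning'
    return None

def execute_gtid_violating_ddl(state):
    # First call, *before* the implicit commit: only errors are handled,
    # so a failing statement does not commit the ongoing transaction.
    if is_ddl_gtid_compatible(state, handle_error=True,
                              handle_nonerror=False) == 'error':
        return 'failed before implicit commit'
    # ... the implicit commit happens here ...
    # Second call, *after* the commit: warnings and the counter.
    is_ddl_gtid_compatible(state, handle_error=False, handle_nonerror=True)
    return 'committed'
```

The counter is thus only incremented after the commit, so it can never be decremented by the commit itself while the statement is still GTID-violating.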

Code changes #1:

@sql/binlog.cc
- Move MYSQL_BIN_LOG::commit to a new function
MYSQL_BIN_LOG::write_binlog_and_commit_engine. Make
MYSQL_BIN_LOG::commit call this function, and after the return check
thd->pending_gtid_state_update to see if another call to
gtid_state->update_on_[commit|rollback] is needed.
- Simplify MYSQL_BIN_LOG::write_binlog_and_commit_engine; remove the
useless local variable 'error' that would never change its value.

@sql/binlog.h:
- Declaration of new function.

@sql/rpl_gtid_state.cc:
- Set thd->pending_gtid_state_update to false at the end of
update_gtids_impl.

Code changes #2:
@sql/binlog.cc:
- Add two parameters to handle_gtid_consistency and is_ddl_gtid_compatible:
handle_error is true in the call to is_ddl_gtid_compatible that happens
*before* the implicit commit and false in the call to
is_ddl_gtid_compatible that happens *after* the implicit commit. It
tells the function to generate the error, if an error is to be
generated. The other parameter, handle_nonerror, is true in the call to
is_ddl_gtid_compatible that happens *after* the implicit commit and
false in the call that happens *before* the implicit commit. It tells
the function to generate the warnings and increment the counter, if that
needs to be done.

@sql/rpl_gtid_execution.cc:
- Call is_ddl_gtid_compatible after the implicit commit. Pass the two
new parameters to the function.

@sql/sql_class.h:
- Update prototype for is_ddl_gtid_compatible.

@sql/sql_insert.cc:
- Set thd->pending_gtid_state_update = true after committing the CREATE
part of a CREATE...SELECT.

Misc changes:

@sql/binlog.cc
- Add DEBUG_SYNC symbol end_decide_logging_format used in tests.

@sql/rpl_gtid_state.cc:
- For modularity, move out parts of update_gtids_impl to a new function,
end_gtid_violating_transaction.
- Move the lock/unlock of global_sid_lock into update_gtids_impl.
- Make update_gtids_impl release global_sid_lock before the call to
end_gtid_violating_transaction, so that the lock is held as briefly as possible.

@sql/rpl_gtid.h
- Because we release the locks earlier in update_gtids_impl in
rpl_gtid_state.cc, we need to acquire the lock again in
end_[anonymous|automatic]_gtid_violating_transaction, in order to do
some debug assertions.
- Add DBUG_PRINT for the counters.

Test changes:
- Split binlog_enforce_gtid_consistency into six tests, depending on the
type of scenarios it tests:
Three classes of GTID-violation:
    *_create_select_*: test CREATE ... SELECT.
    *_tmp_*: test CREATE TEMPORARY/DROP TEMPORARY inside a transaction.
    *_trx_nontrx_*: test combinations of transactional and
    non-transactional updates in the same statement or in the same
    transaction.
For each class of GTID-violation, one positive and one negative test:
    *_consistent.test: Cases which are *not* GTID-violating
    *_violation.test: Cases which *are* GTID-violating.
- The general logic of these tests is:
- extra/binlog_tests/enforce_gtid_consistency.test iterates over all
    values of GTID_MODE, ENFORCE_GTID_CONSISTENCY, and GTID_NEXT. For
    each case, it sources file; the name of the sourced file is
    specified by the file that sources
    extra/binlog_tests/enforce_gtid_consistency.test
- The top-level file in suite/binlog/t invokes
    extra/binlog_tests/enforce_gtid_consistency.test, specifying one
    of the filenames
    extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp|
    trx_nontrx]_[consistent|violation].test.
- Each of the files
    extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp|
    trx_nontrx]_[consistent|violation].test
    sets up a number of test scenarios. Each test scenario is executed
    by sourcing
    extra/binlog_tests/enforce_gtid_consistency_statement.inc.
- extra/binlog_tests/enforce_gtid_consistency_statement.inc executes
    the specified statement, checks that warnings are generated and
    counters incremented/decremented as specified by the caller.
- Since the tests set GTID_MODE explicitly, it doesn't make sense to run
the test in both combinations GTID_MODE=ON/OFF. However, for the
*_trx_nontrx_* cases, it is important to test that it works with both
binlog_direct_non_transactional_updates=on and off, and the suite is
never run with those combinations. To reuse the existing GTID_MODE=ON/OFF
combinations, we run the test with
binlog_direct_non_transactional_updates=on if GTID_MODE=ON, and with
binlog_direct_non_transactional_updates=off if GTID_MODE=OFF.
bkandasa pushed a commit that referenced this pull request Aug 4, 2015
Added the extra scope argument to the status variables.
bkandasa pushed a commit that referenced this pull request Aug 4, 2015
Two places in replication code were causing Valgrind errors:

 1. Memory leak in Relay_log_info::wait_for_gtid_set, since it passed
    the return value from Gtid_set::to_string() directly to
    DBUG_PRINT, without storing it anywhere so that it can be freed.

 2. MYSQL_BIN_LOG::init_gtid_sets would pass a bad pointer to
    DBUG_PRINT in some cases.

In problem #1, an underlying problem was that to_string returns newly
allocated memory and this was easy to miss when reading the code that
calls the function.  It would be better to return the value through a
parameter, since that forces the caller to store it in a variable, and
then it is more obvious that the value must be freed.  And in fact
such a function existed already, so we fix the problem by removing the
(redundant) no-args version of Gtid_set::to_string and using the one-
or two-argument function instead.

In problem #2, print an empty string if we detect that the pointer
will be bad.

These bugs were found when adding some debug printouts to
read_gtids_from_binlog.  These debug printouts never made it into the
server code through any other bug report, but would be useful to have
for future debugging, so they are included in this patch.

Additionally, removed the call to global_sid_lock->rdlock() used
before Previous_gtids_log_event::get_str().  This is not needed since
Previous_gtids_log_event doesn't use shared resources.
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
…TO SELF

Problem: If a multi-column update statement fails when updating one of
the columns in a row, it will go on and update the remaining columns
in that row before it stops and reports an error. If the failure
happens when updating a JSON column, and the JSON column is also
referenced later in the update statement, new and more serious errors
can happen when the update statement attempts to read the JSON column,
as it may contain garbage at this point.

The fix is twofold:

1) Field_json::val_str() currently returns NULL if an error happens.
This is correct for val_str() functions in the Item class hierarchy,
but not for val_str() functions in the Field class hierarchy. The
val_str() functions in the Field classes instead return a pointer to
an empty String object on error. Since callers don't expect it to
return NULL, this caused a crash when a caller unconditionally
dereferenced the returned pointer. The patch makes
Field_json::val_str() return a pointer to an empty String on error to
avoid such crashes.

2) Whereas #1 fixes the immediate crash, Field_json::val_str() may
still read garbage when this situation occurs. This could lead to
unreliable behaviour, and both valgrind and ASAN warn about it. The
patch therefore also makes Field_json::store() start by clearing the
field, so that it will hold an empty value rather than garbage after
an error has happened.

Fix #2 is sufficient to fix the reported problems. Fix #1 is included
for consistency, so that Field_json::val_str() behaves the same way as
the other Field::val_str() functions.

The query in the bug report didn't always crash. Since the root cause
was that it had read garbage, it could be lucky and read something
that looked like a valid value. In that case, Field_json::val_str()
didn't return NULL, and the crash was avoided.

The patch also makes these changes:

- It removes the Field_json::store_dom() function, since it is only
  called from one place. It is now inlined instead.

- It corrects information about return values in the comment that
  describes the ensure_utf8mb4() function.
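Both fixes can be illustrated with a small Python model (a sketch only; the real code lives in the C++ Field_json class, and the trivial _parse() below merely stands in for JSON parsing):

```python
class FieldJson:
    """Toy model of the two fixes: val_str() returns an empty string
    (never None) on error, and store() clears the field before parsing,
    so a failed store leaves no garbage behind."""

    def __init__(self):
        self.value = ''

    def store(self, text):
        self.value = ''            # fix 2: clear first; errors leave ''
        parsed = self._parse(text)
        if parsed is None:
            return 'error'
        self.value = parsed
        return 'ok'

    def val_str(self):
        # fix 1: like the other Field::val_str() implementations, return
        # an empty string rather than None on error.
        return self.value if self.value else ''

    @staticmethod
    def _parse(text):
        # stand-in for real JSON parsing: accept only object-like input
        return text if text.startswith('{') else None
```

After a failed store, a later read of the column sees a well-defined empty value instead of garbage, and callers can dereference the result unconditionally.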
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
Background:

WAIT_FOR_EXECUTED_GTID_SET waits until a specified set of GTIDs is
included in GTID_EXECUTED. SET GTID_PURGED adds GTIDs to
GTID_EXECUTED. RESET MASTER clears GTID_EXECUTED.

There were multiple issues:

 1. Problem:

    The change in GTID_EXECUTED implied by SET GTID_PURGED did
    not cause WAIT_FOR_EXECUTED_GTID_SET to stop waiting.

    Analysis:

    WAIT_FOR_EXECUTED_GTID_SET waits for a signal to be sent.
    But SET GTID_PURGED never sent the signal.

    Fix:

    Make GTID_PURGED send the signal.

    Changes:
    - sql/rpl_gtid_state.cc:Gtid_state::add_lost_gtids
    - sql/rpl_gtid_state.cc: removal of #ifdef HAVE_GTID_NEXT_LIST
    - sql/rpl_gtid.h: removal of #ifdef HAVE_GTID_NEXT_LIST

 2. Problem:

    There was a race condition where WAIT_FOR_EXECUTED_GTID_SET
    could miss the signal from a commit and go into an infinite
    wait even if GTID_EXECUTED contains all the waited-for GTIDs.

    Analysis:

    In the bug, WAIT_FOR_EXECUTED_GTID_SET took a lock while
    taking a copy of the global state. Then it released the lock,
    analyzed the copy of the global state, and decided whether it
    should wait.  But if the GTID to wait for was committed after
    the lock was released, WAIT_FOR_EXECUTED_GTID_SET would miss
    the signal and go to an infinite wait even if GTID_EXECUTED
    contains all the waited-for GTIDs.

    Fix:

    Refactor the code so that it holds the lock all the way from
    before it reads the global state until it goes to the wait.

    Changes:

    - sql/rpl_gtid_state.cc:Gtid_state::wait_for_gtid_set:
      Most of the changes in this function are to fix this bug.

    Note:

    When the bug existed, it was possible to create a test case
    for this by placing a debug sync point in the section where
    it does not hold the lock.  However, after the bug has been
    fixed this section does not exist, so there is no way to test
    it deterministically.  The bug would also cause the test to
    fail rarely, so a way to test this is to run the test case
    1000 times.

 3. Problem:

    The function would take global_sid_lock.wrlock every time it has
    to wait, and while holding it takes a copy of the entire
    gtid_executed (which implies allocating memory).  This is not very
    optimal: it may process the entire set each time it waits, and it
    may wait once for each member of the set, so in the worst case it
    is O(N^2) where N is the size of the set.

    Fix:

    This is fixed by the same refactoring that fixes problem #2.  In
    particular, it does not re-process the entire Gtid_set for each
    committed transaction. It only removes all intervals of
    gtid_executed for the current sidno from the remainder of the
    wait-for-set.

    Changes:
    - sql/rpl_gtid_set.cc: Add function remove_intervals_for_sidno.
    - sql/rpl_gtid_state.cc: Use remove_intervals_for_sidno and remove
      only intervals for the current sidno. Remove intervals
      incrementally in the innermost while loop, rather than recompute
      the entire set each iteration.

 4. Problem:

    If the client that executes WAIT_FOR_EXECUTED_GTID_SET owns a
    GTID that is included in the set, then there is no chance for
    another thread to commit it, so it will wait forever.  In
    effect, it deadlocks with itself.

    Fix:

    Detect the situation and generate an error.

    Changes:
    - sql/share/errmsg-utf8.txt: new error code
      ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID
    - sql/item_func.cc: check the condition and generate the new error

 5. Various simplifications.

    - sql/item_func.cc:Item_wait_for_executed_gtid_set::val_int:
      - Pointless to set null_value when generating an error.
      - add DBUG_ENTER
      - Improve the prototype for Gtid_state::wait_for_gtid_set so
        that it takes a Gtid_set instead of a string, and also so that
        it requires global_sid_lock.
    - sql/rpl_gtid.h:Mutex_cond_array
      - combine wait functions into one and make it return bool
      - improve some comments
    - sql/rpl_gtid_set.cc:Gtid_set::remove_gno_intervals:
      - Optimize so that it returns early if this set becomes empty

@mysql-test/extra/rpl_tests/rpl_wait_for_executed_gtid_set.inc
- Move all wait_for_executed_gtid_set tests into
  mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test

@mysql-test/include/kill_wait_for_executed_gtid_set.inc
@mysql-test/include/wait_for_wait_for_executed_gtid_set.inc
- New auxiliary scripts.

@mysql-test/include/rpl_init.inc
- Document undocumented side effect.

@mysql-test/suite/rpl/r/rpl_wait_for_executed_gtid_set.result
- Update result file.

@mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test
- Rewrote the test to improve coverage and cover all parts of this bug.

@sql/item_func.cc
- Add DBUG_ENTER
- No point in setting null_value when generating an error.
- Do the decoding from text to Gtid_set here rather than in Gtid_state.
- Check for the new error
  ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID

@sql/rpl_gtid.h
- Simplify the Mutex_cond_array::wait functions in the following ways:
  - Make them one function since they share most code. This also allows
    calling the three-argument function with NULL as the last
    parameter, which simplifies the caller.
  - Make it return bool rather than 0/ETIME/ETIMEOUT, to make it more
    easy to use.
- Make is_thd_killed private.
- Add prototype for new Gtid_set::remove_intervals_for_sidno.
- Add prototype for Gtid_state::wait_for_sidno.
- Un-ifdef-out lock_sidnos/unlock_sidnos/broadcast_sidnos since we now
  need them.
- Make wait_for_gtid_set return bool.

@sql/rpl_gtid_mutex_cond_array.cc
- Remove the now unused check_thd_killed.

@sql/rpl_gtid_set.cc
- Optimize Gtid_set::remove_gno_intervals, so that it returns early
  if the Interval list becomes empty.
- Add Gtid_set::remove_intervals_for_sidno. This is just a wrapper
  around the already existing private member function
  Gtid_set::remove_gno_intervals.

@sql/rpl_gtid_state.cc
- Rewrite wait_for_gtid_set to fix problems 2 and 3. See code
  comment for details.
- Factor out wait_for_sidno from wait_for_gtid.
- Enable broadcast_sidnos/lock_sidnos/unlock_sidnos, which were ifdef'ed out.
- Call broadcast_sidnos after updating the state, to fix issue #1.

@sql/share/errmsg-utf8.txt
- Add error message used to fix issue #4.
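The lock discipline that fixes the race in problem #2 is the standard condition-variable idiom: hold the lock from reading the state all the way into the wait, and re-check the predicate in a loop. A minimal Python sketch (not the server's code; the names are illustrative):

```python
import threading

executed = set()              # stands in for GTID_EXECUTED
cond = threading.Condition()  # stands in for the per-sidno mutex/cond pair

def commit_gtid(gtid):
    with cond:
        executed.add(gtid)
        cond.notify_all()     # the signal the waiter can no longer miss

def wait_for_gtid_set(wanted, timeout=5.0):
    with cond:                # hold the lock from reading the state ...
        while not wanted <= executed:
            if not cond.wait(timeout):  # ... all the way into the wait
                return False  # timed out
        return True
```

Because the commit path must take the same lock before signaling, a commit can no longer slip in between the waiter's read of the state and its wait.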
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
An assert failure is seen in some queries which have a semijoin and
use the materialization strategy.

The assertion fails if either the length of the key is zero or the
number of key parts is zero. This could indicate two different
problems.

1) If the length is zero, there may not be a problem, as it can
legitimately be zero if, for example, the key is a zero-length string.

2) If the number of key parts is zero, there is a bug, as a key must
have at least one part.

The patch fixes issue #1 by removing the length check in the
assertion.

Issue #2 happens if JOIN::update_equalities_for_sjm() doesn't
recognize the expression selected from a subquery, and fails to
replace it with a reference to a column in a temporary table that
holds the materialized result. This causes it to not recognize it as a
part of the key later, and keyparts could end up as zero. The patch
fixes it by calling real_item() on the expression in order to see
through Item_refs that may wrap the expression if the subquery reads
from a view.
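The real_item() idea used in the fix can be sketched like this (illustrative Python; the actual Item/Item_ref hierarchy is C++):

```python
class Item:
    """Base item: real_item() returns the item itself."""
    def real_item(self):
        return self

class ItemRef(Item):
    """A reference wrapper, like the Item_refs a view can add around an
    expression; real_item() sees through arbitrarily many wrappers."""
    def __init__(self, inner):
        self.inner = inner

    def real_item(self):
        return self.inner.real_item()
```

Comparing real_item() results instead of the wrappers lets the key-matching code recognize the same underlying expression regardless of how many reference layers a view has introduced.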
bjornmu pushed a commit that referenced this pull request Feb 5, 2016
Problem:

The binary log group commit sync is failing when committing a group of
transactions into a non-transactional storage engine while another
thread is rotating the binary log.

Analysis:

The binary log group commit procedure (ordered_commit) acquires LOCK_log
during the #1 stage (flush). As it holds the LOCK_log, a binary log
rotation will have to wait for this flush stage to finish before
actually rotating the binary log.

For the #2 stage (sync), the binary log group commit only holds the
LOCK_log if sync_binlog=1. In this case, the rotation has to wait also
for the sync stage to finish.

When sync_binlog>1, the sync stage releases the LOCK_log (to let other
groups enter the flush stage), holding only the LOCK_sync. In this
case, the rotation can acquire the LOCK_log in parallel with the sync
stage.

For commits into a transactional storage engine, the binary log rotation
checks a counter of "flushed but not yet committed" transactions,
waiting until this counter reaches zero before closing the current
binary log file.  As the commit of the transactions happens in the #3
stage of the binary log group commit, the sync of the binary log in
stage #2 always succeeds.

For commits into a non-transactional storage engine, the binary log
rotation also checks the "flushed but not yet committed" transactions
counter, but it is zero because the counter only counts transactions
that contain XIDs. So, the rotation is allowed to take place in parallel
with the #2 stage of the binary log group commit. When the sync is
called at the same time that the rotation has closed the old binary log
file but didn't open the new file yet, the sync is failing with the
following error: 'Can't sync file 'UNOPENED' to disk (Errcode: 9 - Bad
file descriptor)'.

Fix:

For non-transactional only workload, binary log group commit will keep
the LOCK_log when entering #2 stage (sync) if the current group is
supposed to be synced to the binary log file.
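The fixed decision can be condensed into one predicate (a hypothetical sketch; the real logic is inside MYSQL_BIN_LOG::ordered_commit):

```python
def keep_lock_log_for_sync(sync_binlog, group_is_nontrx_only, group_will_sync):
    """Should the group keep holding LOCK_log when entering the sync stage?

    sync_binlog == 1 already kept the lock before the fix; the fix adds
    the case of a group containing only non-transactional changes that is
    about to fsync, so a rotation cannot close the log file underneath
    the sync (the 'UNOPENED' error described above).
    """
    if sync_binlog == 1:
        return True                              # pre-fix behaviour
    return group_is_nontrx_only and group_will_sync  # the fix
```

Transactional groups with sync_binlog>1 still release LOCK_log as before, so the fix does not reduce group-commit concurrency for the common case.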
hramilison pushed a commit that referenced this pull request Jun 2, 2016
Description: 
For ngshell query
Crud.Find({ name:$.name, count:count(*) }).GroupBy($.name);
an error was raised:
"Expression #1 of SELECT list is not in GROUP BY clause and contains
nonaggregated column 'test.coll.doc' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by"

Reviewed-by: Lukasz Kotula <lukasz.kotula@oracle.com>
RB:12283
bjornmu pushed a commit that referenced this pull request Apr 10, 2017
Author: Andrei Elkin <andrei.elkin@oracle.com>
Date:   Fri Nov 25 15:17:17 2016 +0200

    WL#9175 Correct recovery of DDL statements/transactions by binary log

    The patch consists of two parts implementing the WL agenda, which is
    to provide crash-safety for DDL.
    That is, a server (a general one, master or slave) must be able to
    recover from a crash by committing or rolling back every DDL command
    that was in progress at the time of the crash.

    The commit decision is made for commands that had reached
    engine-prepared status and were successfully logged into the binary
    log. Otherwise they are rolled back.

    In order to achieve the goal, some refinements are done to the
    binlogging mechanism, a minor addition is made to the server recovery
    module, and some changes are applied to the slave side.

    The binary log part includes Query-log-event, which is made to contain
    the xid that is a key item at server recovery. Recovery is now
    concerned with it along with its standard location in Xid_log_event.

    The first part deals with the ACL DDL sub-class and TRIGGER-related
    queries, which are made fully 2pc-capable. It constructs the WL's
    framework, which is proved on these subclasses.
    It also specifies how to cover the rest of the DDLs with the WL's framework.
    For the DDL cases that are not yet 2pc-ready, "stub" tests are sometimes
    prepared, to be refined by the responsible worklogs.

    Take a few notes to the low-level details of implementation.

    Note #1.

    Tagging by xid number is done for the exact 2pc-capable DDL subclass.
    For DDLs that will be made ready for xiding in the future, there is a
    tech specification for how to do so.

    Note #2.

    By virtue of existing mechanisms, the slave applier augments the DDL
    transaction incorporating the slave info table update and the
    Gtid-executed table (either one optionally) at time of the DDL is
    ready for the final commit.
    When for filtering reason the DDL skips committing at its regular
    time, the augmented transaction would still be not empty consisting of
    only the added statements, and it would have to be committed by
    top-level slave specific functions through Log_event::do_update_pos().

    To aid this process Query_log_event::has_committed is introduced.

    Note #3 (QA, please read this.)

    The replication System_table interface that is employed by the handler
    of TABLE-type slave info had to be refined in a few places.

    Note #4 (runtime code).

    While trying to lessen the footprint on the runtime server code, a few
    concessions had to be made. These include changes to ha_commit_trans()
    to invoke the new pre_commit(), post_commit(), and post_rollback()
    hooks needed for the extra slave statements.

    -------------------------------------------------------------------

    The 2nd part patch extends the basic framework,
    xidifies the rest of DDL commands that are
    (at least) committable at recovery. At the moment those include
    all Data Definition Statements except ones related to
    VIEWs, STORED Functions and Procedures.

    DDL Query is recoverable for these subclasses when it has been
    recorded into the binary log and was discovered there at the server
    restart, quite compatible with the DML algorithm.
    However a clean automatic rollback can't be provided for some
    of the commands and the user would have to complete recovery
    manually.
pobrzut pushed a commit that referenced this pull request May 8, 2017
… WITH DEVELOPER STUDIO

When we do a release type build of the server (with both optimized and
debug enabled server/plugins) with Developer Studio, some MTR tests
when run with --debug-server will fail in one of two ways:

1. Tests which try to load a plugin into the mysql client fail with
   missing symbols. This is caused by the plugin having references to
   functions which do not exist in the non-debug client.

2. Some tests on sparc fail with Thread stack overrun.

Fix for issue #1: mtr will have appended /debug to the plugin dir part
when running with --debug-server and if there actually is such a
directory. The fix is to remove any trailing /debug from the
env. variable within the test. This will affect the client only, not
the server. Developer builds will not have put the plugins in a
subdirectory /debug so it makes no difference to those.
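The trimming described in fix #1 amounts to the following (illustrative Python; mtr itself is Perl and the helper name is hypothetical):

```python
def strip_debug_suffix(plugin_dir):
    """Remove a trailing '/debug' that mtr appends for --debug-server
    runs, so the non-debug client does not try to load debug plugins."""
    suffix = '/debug'
    if plugin_dir.endswith(suffix):
        return plugin_dir[:-len(suffix)]
    return plugin_dir
```

Paths without the suffix, such as those from developer builds, pass through unchanged.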

Fix for issue #2: apparently this thread stack overrun is not feasible
to avoid, so just skip the test if running with debug server on sparc;
there is already an include file to do that.

Also added not_sparc_debug.inc to the "white list" so the tests are
skipped even when running mtr --no-skip.

(cherry picked from commit 9c79e477261ab252e38def436bca3336ef597603)
pobrzut pushed a commit that referenced this pull request May 8, 2017
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request Aug 25, 2017
In WL-included builds, an ASAN run witnessed a missed ~Query_log_event
invocation. The destructor was not called due to the WL's changes in the
error propagation that specifically affect LC MTS.
The failure is exposed in particular by rpl_trigger as the following
stack:

  #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98)
  mysql#1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12
  mysql#2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72
  mysql#3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46
  mysql#4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17
  mysql#5 0x293b96f in Log_event::read_log_event()
  mysql#6 0x2a2a1c9 in next_event(Relay_log_info*)

Previously before the WL
Mts_submode_logical_clock::wait_for_workers_to_finish() had not
returned any error even when Coordinator thread is killed.

The WL patch needed to refine such behavior, but at doing so
it also had to attend log_event.cc::schedule_next_event() to register
an error to follow an existing pattern.
Because my_error() was not called, the killed Coordinator continued
scheduling, though ineffectively: no Worker gets engaged (a legal case
of deferred scheduling), and it does not notice its killed status up to
the point when it resets the event pointer in
apply_event_and_update_pos():

  *ptr_ev= NULL; // announcing the event is passed to w-worker

The reset was intended for an assigned Worker to perform the event
destruction, or for the Coordinator itself when the event is deferred.
As neither applies in the current case, the event is left unattended and
never destroyed.

In contrast in the pre-WL sources the killed Coordinator does find a Worker.
However such Worker could be already down (errored out and exited), in
which case apply_event_and_update_pos() reasonably returns an error and executes

  delete ev

in exec_relay_log_event() error branch.

**Fixed** with deploying my_error() call in log_event.cc::schedule_next_event()
error branch which fits to the existing pattern.
THD::is_error() has been always checked by Coordinator before any attempt to
reset *ptr_ev= NULL. In the errored case Coordinator does not reset and
destroys the event itself in the exec_relay_log_event() error branch pretty similarly to
how the pre-WL sources do.

Tested against rpl_trigger and rpl suites to pass.

Approved on rb#15667.
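The ownership rule restored by the fix can be modeled as follows (an illustrative Python sketch of the Coordinator's error branch; the names are simplified from log_event.cc and the MTS code):

```python
def schedule_next_event(thd, ev):
    """Fixed version: register an error (my_error() in the server) when
    the Coordinator is killed, instead of silently returning."""
    if thd['killed']:
        thd['is_error'] = True
        return False
    return True

def exec_relay_log_event(thd, ev):
    """Who destroys the event? The pointer is handed off (*ptr_ev = NULL
    in the server) only when no error is flagged; otherwise the
    Coordinator keeps ownership and deletes the event itself."""
    scheduled = schedule_next_event(thd, ev)
    if thd['is_error']:
        return ('error', 'coordinator deletes ev')
    if scheduled:
        return ('ok', 'worker deletes ev')
    return ('deferred', 'coordinator deletes ev later')
```

With the error registered, the killed Coordinator takes the error branch and destroys the event, matching the pre-WL behavior and closing the leak ASAN reported.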
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request Aug 25, 2017
Some character sets are designated as MY_CS_STRNXFRM, meaning that sorting
needs to go through my_strnxfrm() (implemented by the charset), and some are
not, meaning that a client can do the strnxfrm itself based on
cs->sort_order. However, most of the logic related to the latter has been
removed already (e.g. filesort always uses my_strnxfrm() since 2003), and now
it's mostly in the way. The three main uses left are:

 1. A microoptimization for constructing sort keys in filesort.
 2. A home-grown implementation of Boyer-Moore for accelerating certain
    LIKE patterns that should probably be handled through FTS.
 3. Some optimizations to MyISAM prefix keys.

Given that our default collation (utf8mb4_0900_ai_ci) now is a strnxfrm-based
collation, the benefits of keeping these around for a narrow range of
single-byte locales (like latin1_swedish_ci, cp850 and a bunch of more
obscure locales) seems dubious. We seemingly can't remove the flag entirely,
as #3 affects the on-disk MyISAM structure, but we can remove
the code for #1 and #2.

Change-Id: If974e490d451b7278355e33ab1fca993f446b792
bjornmu pushed a commit that referenced this pull request Sep 21, 2017
Patch #1:

When find_child_doms() evaluates an ellipsis path leg, it adds each
matching child twice: first once for every immediate child in the
top-level call to find_child_doms(), and then again in the recursive
call to find_child_doms() on each of the children. This doesn't cause
any wrong results, since duplicate elimination prevents them from
being added to the result, but it is unnecessary work. This patch
makes find_child_doms() stop calling add_if_missing() on the children
before it recurses into them, so that they are only added once.

Additionally:

Make find_child_doms() a free, static function instead of a member of
Json_dom. It is a helper function for Json_dom::seek(), and doesn't
use any non-public members of Json_dom, so it doesn't have to be part
of Json_dom's interface.

Remove unnecessary checks for only_need_one in the ellipsis
processing. Json_dom::seek() sets the flag to true only in the last
path leg, and the JSON path parser rejects paths that end with an
ellipsis, so this cannot happen. Added an assert instead.
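The fix can be illustrated with a toy DOM (Node and the helpers below are illustrative, not the actual sql/json_dom.h types): in an ellipsis search, each node is appended exactly once, by the call that visits it, instead of once by the parent's loop and again by the recursion.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a JSON DOM node.
struct Node {
  std::string key;
  std::vector<Node> children;
};

// Fixed scheme sketched from the description: for an ellipsis leg, a node is
// pushed once when visited; the recursive call no longer re-adds it.
static void collect_ellipsis(const Node &node, std::vector<const Node *> *hits) {
  for (const Node &child : node.children) {
    hits->push_back(&child);        // added exactly once, here
    collect_ellipsis(child, hits);  // recursion does not re-add 'child'
  }
}

static std::size_t count_matches(const Node &root) {
  std::vector<const Node *> hits;
  collect_ellipsis(root, &hits);
  return hits.size();
}
```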

Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3):

BM_JsonDomSearchEllipsis              79880 ns/iter [+32.0%]
BM_JsonDomSearchEllipsis_OnlyOne      75872 ns/iter [+35.3%]
BM_JsonDomSearchKey                     129 ns/iter [ -1.6%]
BM_JsonBinarySearchEllipsis          320920 ns/iter [+11.8%]
BM_JsonBinarySearchEllipsis_OnlyOne  315458 ns/iter [+12.6%]
BM_JsonBinarySearchKey                   86 ns/iter [ -2.3%]

Change-Id: I865af789b90b820a6e180ad822f2fb68f411516b
bjornmu pushed a commit that referenced this pull request Jan 23, 2018
…_ROW()' FAILED

Analysis:

When a window with buffering follows an equijoin on a unique index
(JT_EQ_REF), we can get into trouble because windowing modifies the
input record, presuming that once the windowing has been handed the
record, next time control passes back to the join code a (new) record
will be read to the input record.

However, this does not hold with JT_EQ_REF, cf. the caching done in
join_read_key:

From its Doxygen:

  "Since the eq_ref access method will always return the same row, it
   is not necessary to read the row more than once, regardless of how
   many times it is needed in execution.  This cache element is used
   when a row is needed after it has been read once, unless a key
   conversion error has occurred, or the cache has been disabled."

Fix:

We solve this problem by reinstating the input record before handing
control back from end_write_wf. We optimize: only do this if the
window in question follows after such a JOIN, i.e. window #1, and it
has actually clobbered the input record. This can only happen if
the last qep_tab has type JT_EQ_REF.

Another, perhaps better, approach is to refactor so as to never touch the
input record and instead keep the copying between the out record and the
frame table record. Left for future refactoring.

Added some missing Window method "const"s, and folded a couple of
one-liners into window.h (from .cc).

Repro added.

Change-Id: I33bc43cd99ff79303b17d181abc3805ce226fb85
bjornmu pushed a commit that referenced this pull request Jan 23, 2018
…TABLE_UPGRADE_GUARD

To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1
./mtr --mem --sanitize main.dd_upgrade_error

A few dd tests fail with:
==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38
READ of size 8 at 0x7000063bf5e8 thread T2
    #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186
    #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560
    #2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d)
    #3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557)
    #4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85)
    #5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216
    #6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336

Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b
Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'
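The lifetime bug pattern behind this fix can be reduced to a small sketch (the names are illustrative, not the real dd::upgrade types): a guard stores a pointer to an object and dereferences it in its destructor, so the object must be declared in a scope that outlives the guard.

```cpp
#include <cassert>

// Toy stand-ins for LEX and Table_upgrade_guard.
struct Lex {
  bool ended = false;
};

struct TableUpgradeGuard {
  Lex *lex;
  explicit TableUpgradeGuard(Lex *l) : lex(l) {}
  ~TableUpgradeGuard() { lex->ended = true; }  // touches *lex on destruction
};

// Fixed ordering: 'lex' is declared before the guard, so it outlives the
// guard and the guard's destructor reads a live object. Declaring 'lex' in a
// narrower scope than the guard is exactly the stack-use-after-scope ASAN
// reported.
static bool run_fixed_ordering() {
  Lex lex;  // longer lifetime
  {
    TableUpgradeGuard guard(&lex);  // shorter lifetime, destroyed first
  }
  return lex.ended;  // guard's destructor ran while 'lex' was alive
}
```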
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request Feb 4, 2018
Fix misc UBSAN warnings in unit tests.
To repeat:
export UBSAN_OPTIONS="print_stacktrace=1"

./runtime_output_directory/merge_large_tests-t --gtest_filter='-*DeathTest*' > /dev/null

unittest/gunit/gis_algos-t.cc:78:70:
runtime error: downcast of address 0x000012dc0be8 which does not point to an object of type 'Gis_polygon_ring'

include/sql_string.h:683:35: runtime error: null pointer passed as argument 2, which is declared to never be null
    #1 0x373e7af in histograms::Value_map<String>::add_values(String const&, unsigned long long) sql/histograms/value_map.cc:149
    #2 0x294fcf2 in dd_column_statistics_unittest::add_values(histograms::Value_map<String>&) unittest/gunit/dd_column_statistics-t.cc:62

runtime_output_directory/merge_keyring_file_tests-t --gtest_filter='-*DeathTest*' > /dev/null

plugin/keyring/common/keyring_key.cc:82:57: runtime error: null pointer passed as argument 2, which is declared to never be null

Change-Id: I2651362e3373244b72e6893f0e22e67402b49a52
(cherry picked from commit 1fe3f72561994da1d912a257689e1b18106f8828)
hramilison pushed a commit that referenced this pull request Apr 19, 2018
              AUTH_COMMON.H

Description:- Server crashes due to a NULL pointer
de-reference.

Analysis:- Server encounters a NULL pointer de-reference
during "acl_load()".

Fix:- A check is introduced to avoid the NULL pointer
de-reference.
This issue is already prevented in 8.0 through Bug#27225806
fix. Therefore, this patch is applicable only for 5.7.
sreedhars pushed a commit that referenced this pull request Jul 27, 2018
Group Replication implements conflict detection in multi-primary
mode to avoid write errors on parallel operations.
Conflict detection is also engaged in single-primary mode in the
particular case of a primary change where the new primary still has a
backlog to apply. Until the backlog is flushed, conflict detection
is enabled to prevent write errors between the backlog and incoming
transactions.

The conflict detection data, which we name certification info, is
also used to detect dependencies between accepted transactions,
dependencies which determine the transaction schedule on the
parallel applier.

In order to keep the certification info from growing forever,
all members periodically exchange their GTID_EXECUTED sets, whose
full intersection provides the set of transactions that are
applied on all members. Future transactions cannot conflict with
this set since all members are operating on top of it, so we can
safely remove from the certification info all write-sets that
belong to those transactions.
More details at WL#6833: Group Replication: Read-set free
Certification Module (DBSM Snapshot Isolation).

However, a corner case was found in which the garbage collection
purged more data than it should.
The scenario is:
 1) Group with 2 members;
 2) Member1 executes:
      CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
      INSERT INTO t1 VALUE(1, 1);
    Both members have a GTID_EXECUTED= UUID:1-4
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID:1-4
 3) member1 executes TA
      UPDATE t1 SET b=10 WHERE a=1;
    and blocks immediately before sending the transaction to the group.
    This transaction has snapshot_version: UUID:1-4
 4) member2 executes TB
      UPDATE t1 SET b=10 WHERE a=1;
    This transaction has snapshot_version: UUID:1-4
    It goes through the complete path and it is committed.
    This transaction has GTID: UUID:1000002
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID:1-4:1000002
 5) member2 becomes extremely slow in processing transactions, we
    simulate that by holding the transaction queue to the GR
    pipeline.
    Transaction delivery is still working, but the transaction will
    be blocked before certification.
 6) member1 is able to send its TA transaction; let's recall that
    this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member1, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.
    On member2 the transaction will be delivered and will be put on
    hold before conflict detection.
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset(intersection)
    We have
       "UUID:1-4:1000002".is_subset("UUID:1-4:1000002)
    which is true, so we remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       <empty>
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection, since the certification info is empty,
    the transaction will be allowed to proceed, which is incorrect:
    it must roll back (as on member1) since it conflicts with TB.

The problem is in the certification garbage collection, more
precisely in the condition used to purge data: we cannot leave the
certification info empty, otherwise this situation can happen.
The condition must be changed to
       snapshot_version.is_subset_not_equals(intersection)
which always leaves a placeholder with which delayed conflicting
transactions can be detected.

So a trace of the solution is (starting at step 7):
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset_not_equals(intersection)
    We have
       "UUID:1-4:1000002".is_subset_not_equals("UUID:1-4:1000002)
    which is false, so we do not remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID:1-4:1000002
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member2, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.

This is the same scenario that we see in this bug, though here the
pipeline is being blocked by the distributed recovery procedure;
that is, while the joining member is applying the missing data
through the recovery channel, the incoming data is being queued.
Meanwhile the certification info garbage collection kicks in and
purges more data than it should, with the result that conflicts are
not detected.
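The purge condition can be sketched with std::set<int> standing in for a Gtid_set (the helpers and the toy certification map are illustrative only; the real code uses Gtid_set::is_subset and friends): a proper-subset test purges strictly older entries but keeps an entry equal to the stable intersection as the placeholder.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <set>
#include <string>

using GtidSet = std::set<int>;  // toy stand-in for a Gtid_set

static bool is_subset(const GtidSet &a, const GtidSet &b) {
  // std::includes requires sorted ranges; std::set is always sorted.
  return std::includes(b.begin(), b.end(), a.begin(), a.end());
}

static bool is_subset_not_equals(const GtidSet &a, const GtidSet &b) {
  return is_subset(a, b) && a != b;
}

// Purge entries whose snapshot version is a *proper* subset of the stable
// intersection. An entry equal to the intersection stays behind as a
// placeholder, so a delayed transaction with an older snapshot still
// conflicts with it. Returns the number of surviving entries.
static std::size_t purge(std::map<std::string, GtidSet> *cert_info,
                         const GtidSet &intersection) {
  for (auto it = cert_info->begin(); it != cert_info->end();) {
    if (is_subset_not_equals(it->second, intersection))
      it = cert_info->erase(it);
    else
      ++it;
  }
  return cert_info->size();
}
```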
bjornmu pushed a commit that referenced this pull request Jul 27, 2018
Use void* for function arguments, and cast Item_field* in function body.

compare_fields_by_table_order(Item_field*, Item_field*, void*) through
pointer to incorrect function type 'int (*)(void *, void *, void *)'
sql/sql_optimizer.cc:3904: note: compare_fields_by_table_order(Item_field*,
Item_field*, void*) defined here
    #0 0x34ac57d in base_list::sort(int (*)(void*, void*, void*), void*)
sql/sql_list.h:278:13
    #1 0x49ad303 in substitute_for_best_equal_field(Item*, COND_EQUAL*,

Change-Id: I4f5417304a24201682f32fc7631034de7aa62589
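The fixed pattern can be shown in a few lines ('Item_field' here is a toy stand-in for the real class): the comparator is declared with the exact generic signature the list's sort expects, and the concrete casts happen inside the body. Calling a function through a pointer of a different type is undefined behavior even when the raw pointers are compatible, which is what UBSAN flagged.

```cpp
// Matches the generic comparator type 'int (*)(void *, void *, void *)'
// expected by the sort routine, so no function-pointer cast is needed.
struct Item_field {
  int table_order;
};

static int compare_fields_by_table_order(void *a, void *b, void * /*arg*/) {
  auto *fa = static_cast<Item_field *>(a);  // cast inside the body,
  auto *fb = static_cast<Item_field *>(b);  // not in the declared signature
  return fa->table_order - fb->table_order;
}
```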
bjornmu pushed a commit that referenced this pull request Jul 27, 2018
…ENERATED_READ_FIELDS

It's a SELECT with WHERE "(-1) minus 0x4d".
This operation has a result type of "unsigned" (because 0x4d is an
unsigned integer) and the result (-78) doesn't fit in an unsigned type.
This WHERE is evaluated by InnoDB in index condition pushdown:
#0 my_error
#1 Item_func::raise_numeric_overflow
...
#7 Item_cond_and::val_int
#8 innobase_index_cond
...
#12 handler::index_read_map
...
#15 handler::multi_range_read_next
...
#20 rr_quick
#21 join_init_read_record
As val_int() has no "error" return code, the execution continues until
frame #12; there we call update_generated_read_fields(), which has
an assertion about thd->is_error() which fails.

Fix: it would be nice to detect the error as soon as it happens, i.e. in
InnoDB code right after it calls val_bool(). But InnoDB's index condition
pushdown functions only have found / not found return codes, so they cannot
signal "error" to the upper layers. The same is true for MyISAM. Moreover,
"thd" isn't easily accessible there.
Adding detection a bit higher in the stack (the handler::* functions
which do index reads) is also possible but would require fixing ~20
functions.
The chosen fix here is to change update_generated_*_fields()
to return error if thd->is_error() is true.
Note that the removed assertion was already one cause of
bug 27041382.
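The chosen fix reduces to a simple rule that can be sketched with toy types ('Session' and the return codes stand in for THD and the real handler error codes; the names are illustrative): the update helper surfaces an already-flagged error instead of asserting that none can be pending.

```cpp
// Toy stand-in for THD: evaluation deep in the stack (e.g. the numeric
// overflow raised during index condition pushdown) sets the error flag.
struct Session {
  bool error_flagged = false;
  bool is_error() const { return error_flagged; }
};

enum { OK = 0, ERR = 1 };

// Sketch of the fixed update_generated_*_fields() contract: return an error
// when the session already has one, rather than asserting !thd->is_error().
static int update_generated_read_fields(Session *thd) {
  if (thd->is_error()) return ERR;  // surface the pending error
  // ... would refresh the generated columns here ...
  return OK;
}
```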
bjornmu pushed a commit that referenced this pull request Oct 22, 2018
Post push fix

sql/opt_range.cc:14196:22: runtime error: -256 is outside the range of representable values of type 'unsigned int'
    #0 0x4248a9d in cost_skip_scan(TABLE*, unsigned int, unsigned int, unsigned long long, Cost_estimate*, unsigned long long*, Item*, Opt_trace_object*) sql/opt_range.cc:14196:22
    #1 0x41c524c in get_best_skip_scan(PARAM*, SEL_TREE*, bool) sql/opt_range.cc:14086:5
    #2 0x41b7b65 in test_quick_select(THD*, Bitmap<64u>, unsigned long long, unsigned long long, bool, enum_order, QEP_shared_owner const*, Item*, Bitmap<64u>*, QUICK_SELECT_I**) sql/opt_range.cc:3352:23
    #3 0x458fc08 in get_quick_record_count(THD*, JOIN_TAB*, unsigned long long) sql/sql_optimizer.cc:5542:17
    #4 0x458a0cd in JOIN::estimate_rowcount() sql/sql_optimizer.cc:5290:25

The fix is to handle REC_PER_KEY_UNKNOWN explicitly, to avoid using
-1.0 in computations later.

Change-Id: Ie8a81bdf7323e4f66abcad0a9aca776de8acd945
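The shape of the fix can be sketched as follows (REC_PER_KEY_UNKNOWN mirrors the real sentinel; the helper and its fallback parameter are illustrative assumptions, not the skip-scan code itself): the sentinel is handled before the value ever reaches an unsigned conversion, since converting a negative float such as -1.0 to an unsigned integer is undefined behavior, exactly what the sanitizer reported.

```cpp
// Sentinel meaning "statistics unknown", as in the server's handler code.
constexpr float REC_PER_KEY_UNKNOWN = -1.0f;

// Handle the sentinel explicitly; never let -1.0 flow into unsigned math.
// 'fallback' is an illustrative default used when statistics are unknown.
static unsigned int records_estimate(float rec_per_key, unsigned int fallback) {
  if (rec_per_key == REC_PER_KEY_UNKNOWN) return fallback;
  return static_cast<unsigned int>(rec_per_key);
}
```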
sjmudd pushed a commit to sjmudd/mysql-server that referenced this pull request Jan 24, 2019
piki added a commit to planetscale/mysql-server that referenced this pull request Apr 27, 2022
Track rows read and return them as part of the `info` field in OK/EOF packets
rahulmalik87 pushed a commit to rahulmalik87/mysql-server that referenced this pull request Jun 16, 2022
…NSHIP WITH THE BUFFER SIZE

Bug #33501541: Unmanageable Sort Buffer Behavior in 8.0.20+

Implement direct disk-to-disk copies of large packed addons during the
filesort merge phase; if a single row is so large that its addons do not
fit into its slice of the sort buffer during merging (even after
emptying that slice of all other rows), but the sort key _does_ fit,
simply sort the truncated row as usual, and then copy the rest of the
addon incrementally from the input to the output, 4 kB at a time, when
the row is to be written to the merge output. This is possible because
the addon itself doesn't need to be in RAM for the row to be compared
against other rows; only the sort key must.

This greatly relaxes the sort buffer requirements for successful merging,
especially when it comes to JSON rows or small blobs (which are
typically used as packed addons, not sort keys). The rules used to be:

 1. During initial chunk generation: The sort buffer must be at least
    as large as the largest row to be sorted.
 2. During merging: Merging is guaranteed to pass if the sort buffer is
    at least 15 times as large as the largest row (sort key + addons),
    but one may be lucky and pass with only the demands from #1.

Now, for sorts implemented using packed addons (which is the common case
for small blobs and JSON), the new rules are:

 1. Unchanged from #1 above.
 2. During merging: Merging is guaranteed to pass if the sort buffer is
    at least 15 times as large as the largest _sort key_ (plus 4-byte
    length marker), but one may be lucky and pass with only the demands
    from #1.

In practice, this means that filesort merging will almost never fail
due to insufficient buffer space anymore; the query will either fail
because a single row is too large in the sort step, or it will pass
nearly all of the time.

However, do note that while such merges will work, they will not always
be very performant, as having lots of 1-row merge chunks will mean many
merge passes and little work being done during the initial in-memory
sort. Thus, the main use of this functionality is to be able to do sorts
where there are a few rows with large JSON values or similar, but where
most fit comfortably into the buffer. Also note that since requirement
 #1 is unchanged, one still cannot sort e.g. 500 kB JSON values using the
default 256 kB sort buffer.
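The old and new merge-phase guarantees can be put as back-of-the-envelope arithmetic (the 15x factor and 4-byte length marker come from the text above; the helper functions themselves are illustrative, not functions in filesort):

```cpp
#include <cstddef>

constexpr std::size_t kMergeFanoutFactor = 15;  // the "15 times" rule
constexpr std::size_t kLengthMarker = 4;        // 4-byte length marker

// Old rule: a guaranteed merge needs 15 x (sort key + addons).
static std::size_t old_merge_guarantee(std::size_t max_sort_key,
                                       std::size_t max_addons) {
  return kMergeFanoutFactor * (max_sort_key + max_addons);
}

// New rule for packed addons: only the sort key (plus marker) must fit;
// oversized addons are streamed disk-to-disk 4 kB at a time.
static std::size_t new_merge_guarantee(std::size_t max_sort_key) {
  return kMergeFanoutFactor * (max_sort_key + kLengthMarker);
}
```

For 100-byte sort keys carrying 64 kB JSON addons, the guaranteed-merge buffer bound drops from 15 x (100 + 65536) bytes to 15 x 104 bytes, which is why merges now almost never fail on buffer space.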

Older recommendations to keep sort buffers small at nearly any cost are
no longer valid, and have not been for a while. Sort buffers should be
sized to as much RAM as one can afford without interfering with other
tasks (such as the buffer pool, join buffers, or other concurrent
sorts), and small sorts are not affected by the maximum sort buffer size
being set to a larger value, as the sort buffer is incrementally
allocated.

Change-Id: I85745cd513402a42ed5fc4f5b7ddcf13c5793100
rahulmalik87 pushed a commit to rahulmalik87/mysql-server that referenced this pull request Jun 16, 2022
…ysql#1)

This reduces the number of issues of type "Single-argument constructor
may inadvertently be used as a type conversion constructor"
as reported by the Flint++ tool.
Some of the reported issues were not addressed:
 - lines explicitly marked with NOLINT (like plugin/x/src/prepare_param_handler.h)
 - false positives (tool reports alignas() calls as constructors)
 - issues where objects were intended to be used with type conversion
 - constructors with "const std::string &" param, to accept literal strings as well

Change-Id: I7eb6fabea7137fc7e81143f06ec7636b22f6ea97
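The lint being fixed can be shown in a couple of lines ('Port' is an illustrative type, not one of the classes touched by the commit): marking a single-argument constructor 'explicit' stops the compiler from using it as a silent conversion at call sites.

```cpp
// With 'explicit', passing a bare int where a Port is expected no longer
// compiles; the caller must write Port{3306} and state the conversion.
struct Port {
  explicit Port(int number) : value(number) {}
  int value;
};

static bool is_default(const Port &p) { return p.value == 3306; }
// is_default(3306)       -> compile error (implicit conversion rejected)
// is_default(Port{3306}) -> OK, conversion is intentional
```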
rahulmalik87 pushed a commit to rahulmalik87/mysql-server that referenced this pull request Jun 16, 2022
Includes a partial implementation of span from C++20.

Change-Id: Ibae9a4aeed95135f4ef35a7ce7b095e6930c1d66
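A minimal sketch of what "a partial implementation of span from C++20" can look like (illustrative, not the actual ndb version): a non-owning view over contiguous memory with just the members a caller typically needs.

```cpp
#include <cstddef>

// Non-owning view over a contiguous sequence, in the spirit of std::span.
template <class T>
class Span {
 public:
  Span(T *data, std::size_t size) : m_data(data), m_size(size) {}
  T *data() const { return m_data; }
  std::size_t size() const { return m_size; }
  bool empty() const { return m_size == 0; }
  T &operator[](std::size_t i) const { return m_data[i]; }
  T *begin() const { return m_data; }          // enables range-for
  T *end() const { return m_data + m_size; }

 private:
  T *m_data;
  std::size_t m_size;
};
```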
rahulmalik87 pushed a commit to rahulmalik87/mysql-server that referenced this pull request Jun 16, 2022
Post push fix.

Remove include of unused header files "ndb_global.h" and <cassert>.

Use std::size_t instead of size_t.

Change-Id: I2c718d0889965ce5967d575172da8df4aa55b1d7
nawazn pushed a commit that referenced this pull request Jul 26, 2022
* PROBLEM

The test "ndb.ndb_bug17624736" was constantly failing in
[daily|weekly]-8.0-cluster branches in PB2, whether on `ndb-ps` or
`ndb-default-big` profile test runs. The high-level reason for the
failure was the installation of a duplicate entry in the Data
Dictionary with respect to the `engine`-`se_private_id` pair, even when
the previous table definition should have been dropped.

* LOW-LEVEL EXPLANATION

NDB reuses the least available ID for the dictionary table ID. The ID
is then used by the NDB plugin to install as SE private ID field of
the MySQL Data Dictionary table definition. If a problem occurs during
the synchronization of NDB table definitions in the Data
Dictionary (i.e., a previous definition was not successfully removed),
then an attempt to install a table using an already installed SE
private ID can occur. If that ID was inadvertently cached as `missing`,
then the function `acquire_uncached_table_by_se_private_id` will
return fast without retrieving the table definition. Therefore, the
old table definition on that ID cannot be retrieved ever for that Data
Dictionary client instance, the new one won't be installed, and errors
will be raised.

* SOLUTION

For NDB plugin to query a table definition, using the SE private
ID (without causing a missing entry to be cached forever for that
client instance), this patch adds a flag argument to the function to
allow the caller to request it to skip the fast cache.

Change-Id: I45eef594ee544000fe6b30b86977e5e91155dc80
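The caching hazard and the flag-based fix can be sketched with a toy client (the store, the names and the flag are illustrative stand-ins for the DD client API): a lookup that caches "missing" keeps returning a stale miss even after the definition appears, so the caller can ask to bypass the miss cache.

```cpp
#include <map>
#include <string>

// Toy dictionary client: looking up an absent id caches it as "missing",
// which then poisons every later lookup from this client instance. The
// skip_cache flag models the new argument that bypasses the fast miss.
struct DictClient {
  std::map<int, std::string> store;  // backing dictionary
  std::map<int, bool> miss_cache;    // se_private_id -> known-missing

  bool acquire_by_se_private_id(int id, std::string *out,
                                bool skip_cache = false) {
    if (!skip_cache && miss_cache.count(id)) return false;  // fast (stale) miss
    auto it = store.find(id);
    if (it == store.end()) {
      miss_cache[id] = true;  // this is what poisons later lookups
      return false;
    }
    *out = it->second;
    return true;
  }
};
```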
nawazn pushed a commit that referenced this pull request Jul 26, 2022
-- Patch #1: Persist secondary load information --

Problem:
We need a way of knowing which tables were loaded to HeatWave after
MySQL restarts due to a crash or a planned shutdown.

Solution:
Add a new "secondary_load" flag to the `options` column of mysql.tables.
This flag is toggled after a successful secondary load or unload. The
information about this flag is also reflected in
INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS.

-- Patch #2 --

The second patch in this worklog triggers the table reload from InnoDB
after MySQL restart.

The recovery framework recognizes that the system restarted by checking
whether tables are present in the Global State. If there are no tables
present, the framework will access the Data Dictionary and find which
tables were loaded before the restart.

This patch introduces the "Data Dictionary Worker" - a MySQL service
recovery worker whose task is to query the INFORMATION_SCHEMA.TABLES
table from a separate thread and find all tables whose secondary_load
flag is set to 1.

All tables that were found in the Data Dictionary will be appended to
the list of tables that have to be reloaded by the framework from
InnoDB.

If an error occurs during restart recovery we will not mark the recovery
as failed. This is done because the types of failures that can occur
when the tables are reloaded after a restart are less critical compared
to previously existing recovery situations. Additionally, this code will
soon have to be adapted for the next worklog in this area so we are
proceeding with the simplest solution that makes sense.

A Global Context variable m_globalStateEmpty is added which indicates
whether the Global State should be recovered from an external source.

-- Patch #3 --

This patch adds the "rapid_reload_on_restart" system variable. This
variable is used to control whether tables should be reloaded after a
restart of mysqld or the HeatWave plugin. This variable is persistable
(i.e., SET PERSIST RAPID_RELOAD_ON_RESTART = TRUE/FALSE).

The default value of this variable is set to false.

The variable can be modified in OFF, IDLE, and SUSPENDED states.

-- Patch #4 --

This patch refactors the recovery code by removing all recovery-related
code from ha_rpd.cc and moving it to separate files:

  - ha_rpd_session_factory.h/cc:
  These files contain the MySQLAdminSessionFactory class, which is used
to create admin sessions in separate threads that can be used to issue
SQL queries.

  - ha_rpd_recovery.h/cc:
  These files contain the MySQLServiceRecoveryWorker,
MySQLServiceRecoveryJob and ObjectStoreRecoveryJob classes which were
previously defined in ha_rpd.cc. This file also contains a function that
creates the RecoveryWorkerFactory object. This object is passed to the
constructor of the Recovery Framework and is used to communicate with
the other section of the code located in rpdrecoveryfwk.h/cc.

This patch also renames rpdrecvryfwk to rpdrecoveryfwk for better
readability.

The include relationship between the files is shown on the following
diagram:

        rpdrecoveryfwk.h◄──────────────rpdrecoveryfwk.cc
            ▲    ▲
            │    │
            │    │
            │    └──────────────────────────┐
            │                               │
        ha_rpd_recovery.h◄─────────────ha_rpd_recovery.cc──┐
            ▲                               │           │
            │                               │           │
            │                               │           │
            │                               ▼           │
        ha_rpd.cc───────────────────────►ha_rpd.h       │
                                            ▲           │
                                            │           │
            ┌───────────────────────────────┘           │
            │                                           ▼
    ha_rpd_session_factory.cc──────►ha_rpd_session_factory.h

Other changes:
  - In agreement with Control Plane, the external Global State is now
  invalidated during recovery framework startup if:
    1) Recovery framework recognizes that it should load the Global
    State from an external source AND,
    2) rapid_reload_on_restart is set to OFF.

  - Addressed review comments for Patch #3, rapid_reload_on_restart is
  now also settable while plugin is ON.

  - Provide a single entry point for processing external Global State
  before starting the recovery framework loop.

  - Change when the Data Dictionary is read. Now we will no longer wait
  for the HeatWave nodes to connect before querying the Data Dictionary.
  We will query it when the recovery framework starts, before accepting
  any actions in the recovery loop.

  - Change the reload flow by inserting fake global state entries for
  tables that need to be reloaded instead of manually adding them to a
  list of tables scheduled for reload. This method will be used for the
  next phase where we will recover from Object Storage so both recovery
  methods will now follow the same flow.

  - Update secondary_load_dd_flag added in Patch #1.

  - Increase timeout in wait_for_server_bootup to 300s to account for
  long MySQL version upgrades.

  - Add reload_on_restart and reload_on_restart_dbg tests to the rapid
  suite.

  - Add PLUGIN_VAR_PERSIST_AS_READ_ONLY flag to "rapid_net_orma_port"
  and "rapid_reload_on_restart" definitions, enabling their
  initialization from persisted values along with "rapid_bootstrap" when
  it is persisted as ON.

  - Fix numerous clang-tidy warnings in recovery code.

  - Prevent suspended_basic and secondary_load_dd_flag tests to run on
  ASAN builds due to an existing issue when reinstalling the RAPID
  plugin.

-- Bug#33752387 --

Problem:
A shutdown of MySQL causes a crash in queries fired by DD worker.

Solution:
Prevent MySQL from killing DD worker's queries by instantiating a
DD_kill_immunizer before the queries are fired.

-- Patch #5 --

Problem:
A table can be loaded before the DD Worker queries the Data Dictionary.
This means that table will be wrongly processed as part of the external
global state.

Solution:
If the table is present in the current in-memory global state we will
not consider it as part of the external global state and we will not
process it by the recovery framework.

-- Bug#34197659 --

Problem:
If a table reload after restart causes OOM the cluster will go into
RECOVERYFAILED state.

Solution:
Recognize when the tables are being reloaded after restart and do not
move the cluster into RECOVERYFAILED. In that case only the current
reload will fail and the reload for other tables will be attempted.

Change-Id: Ic0c2a763bc338ea1ae6a7121ff3d55b456271bf0
bjornmu pushed a commit that referenced this pull request Oct 11, 2022
Enh#34350907 - [Nvidia] Allow DDLs when tables are loaded to HeatWave
Bug#34433145 - WL#15129: mysqld crash Assertion `column_count == static_cast<int64_t>(cp_table-
Bug#34446287 - WL#15129: mysqld crash at rapid::data::RapidNetChunkCtx::consolidateEncodingsDic
Bug#34520634 - MYSQLD CRASH : Sql_cmd_secondary_load_unload::mysql_secondary_load_or_unload
Bug#34520630 - Failed Condition: "table_id != InvalidTableId"

Currently, DDL statements such as ALTER TABLE*, RENAME TABLE, and
TRUNCATE TABLE are not allowed if a table has a secondary engine
defined. The statements fail with the following error: "DDLs on a table
with a secondary engine defined are not allowed."

This worklog lifts this restriction for tables whose secondary engine is
RAPID.

A secondary engine hook is called at the beginning (pre-hook) and at the
end (post-hook) of DDL statement execution. If the DDL statement
succeeds, the post-hook will direct the recovery framework to reload the
table in order to reflect that change in HeatWave.

Currently all DDL statements that were previously disallowed will
trigger a reload. This can be improved in the future by checking whether
the DDL operation has an impact on HeatWave or not. However, detecting
all edge cases in this behavior is not straightforward, so this has
been left as a future improvement.

Additionally, if a DDL modifies the table schema in a way that makes it
incompatible with HeatWave (e.g., dropping a primary key column) the
reload will fail silently. There is no easy way to recognize whether the
table schema will become incompatible with HeatWave in a pre-hook.

List of changes:
  1) [MySQL] Add new HTON_SECONDARY_ENGINE_SUPPORTS_DDL flag to indicate
whether a secondary engine supports DDLs.
  2) [MySQL] Add RAII hooks for RENAME TABLE and TRUNCATE TABLE, modeled
on the ALTER TABLE hook.
  3) Define HeatWave hooks for ALTER TABLE, RENAME TABLE, and TRUNCATE
TABLE statements.
  4) If a table reload is necessary, trigger it by marking the table as
stale (WL#14914).
  5) Move all change propagation & DDL hooks to ha_rpd_hooks.cc.
  6) Adjust existing tests to support table reload upon DDL execution.
  7) Extract code related to RapidOpSyncCtx in ha_rpd_sync_ctx.cc, and
the PluginState enum to ha_rpd_fsm.h.

* Note: ALTER TABLE statements related to secondary engine setting and
loading were allowed before:
    - ALTER TABLE <TABLE> SECONDARY_UNLOAD,
    - ALTER TABLE SECONDARY_ENGINE = NULL.

-- Bug#34433145 --
-- Bug#34446287 --

--Problem #1--
Crashes in Change Propagation when the CP thread tries to apply DMLs of
tables with new schema to the not-yet-reloaded table in HeatWave.

--Solution #1--
Remove table from Change Propagation before marking it as stale and
revert the original change from rpd_binlog_parser.cc where we were
checking if the table was stale before continuing with binlog parsing.
The original change is no longer necessary since the table is removed
from CP before being marked as stale.

--Problem #2--
In case of a failed reload, tables are not removed from Global State.

--Solution #2--
Keep track of whether the table was reloaded because it was marked as
STALE. In that case we do not want the Recovery Framework to retry the
reload and therefore we can remove the table from the Global State.

-- Bug#34520634 --

Problem:
Allowing the change of primary engine for tables with a defined
secondary engine hits an assertion in mysql_secondary_load_or_unload().

Example:
    CREATE TABLE t1 (col1 INT PRIMARY KEY) SECONDARY_ENGINE = RAPID;
    ALTER TABLE t1 ENGINE = BLACKHOLE;
    ALTER TABLE t1 SECONDARY_LOAD; <- assertion hit here

Solution:
Disallow changing the primary engine for tables with a defined secondary
engine.

-- Bug#34520630 --

Problem:
A debug assert is being hit in rapid_gs_is_table_reloading_from_stale
because the table was dropped in the meantime.

Solution:
Instead of asserting, just return false if the table is not present in
the Global State.

This patch also changes rapid_gs_is_table_reloading_from_stale to a more
specific check (inlined the logic in load_table()). This check now also
covers the case when a table was dropped/unloaded before the Recovery
Framework marked it as INRECOVERY. In that case, if the reload fails we
should not have an entry for that table in the Global State.

The patch also adjusts the dict_types MTR test, where we no longer
expect tables to be in the UNAVAIL state after a failed reload. Additionally,
recovery2_ddls.test is adjusted to not try to offload queries running on
Performance Schema.

Change-Id: I6ee390b1f418120925f5359d5e9365f0a6a415ee
bjornmu pushed a commit that referenced this pull request Jan 17, 2023
… Signal (get_store_key at sql/sql_select.cc:2383)

These are two related but distinct problems manifested in the shrinkage
of key definitions for derived tables or common table expressions,
implemented in JOIN::finalize_derived_keys().

The problem in Bug#34572040 is that we have two references to one CTE,
each with a valid key definition. The function will first loop over
the first reference (cte_a) and move its used key from position 0 to
position 1. Next, it will attempt to move the key for the second
reference (cte_b) from position 4 to position 2.
However, for each iteration the function recalculates used key
information. On the first iteration the values are correct, but since
key value #1 has been moved into position #0, the old information is
invalid. The problem is thus that subsequent iterations read data that
has been invalidated by earlier key moves.

The best solution to the problem is to move the keys for all references
to the CTE in one operation. This way, we can calculate used keys
information safely, before any move operation has been performed.

The problem in Bug#34634469 is also related to having more than one
reference to a CTE, but in this case the first reference (ref_3) has
a key in position 5 which is moved to position 0, and the second
reference (ref_4) has a key in position 3 that is moved to position 1.

However, the key parts of the first key will overlap with the key parts
of the second key after the first move, thus invalidating the key
structure during the copy. The actual problem is that we move
a higher-numbered key (5) before a lower-numbered key (3), which in
this case makes it impossible to find an empty space for the moved key.

The solution to this problem is to ensure that keys are moved in
increasing key order.

The patch changes the algorithm as follows:

- When identifying a derived table/common table expression, ensure
  to move all its keys in one operation (at least those references
  from the same query block).

- First, collect information about all key uses: hash key,
  unique index keys and actual key references.
  For the key references, also populate a mapping array that enumerates
  table references with key references in order of increasing
  key number.
  Also clear used key information for references that do not use keys.

- For each table reference with a key reference in increasing key order,
  move the used key into the lowest available position. This will ensure
  that used entries are never overwritten.

- When all table references have been processed, remove unused key
  definitions.
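The move-in-increasing-key-order rule above can be sketched as follows (a minimal illustrative model, not the actual JOIN::finalize_derived_keys() code; the name compact_keys is made up):

```python
def compact_keys(used_keys):
    """Assign each used key to the lowest free position.

    used_keys: list of (table_ref, key_no) pairs, one per table
    reference. Processing references in increasing key_no guarantees
    the destination slot is never above the source slot, so no live
    key definition is overwritten during the moves.
    """
    moves = {}
    for new_pos, (table_ref, key_no) in enumerate(
            sorted(used_keys, key=lambda ref_key: ref_key[1])):
        assert new_pos <= key_no  # destination never above source
        moves[table_ref] = (key_no, new_pos)
    return moves

# The Bug#34634469 scenario: ref_3 uses key 5, ref_4 uses key 3.
# Moving key 3 first (-> slot 0) leaves slot 1 free for key 5;
# moving key 5 before key 3 is what made the overlap possible.
moves = compact_keys([("ref_3", 5), ("ref_4", 3)])
assert moves == {"ref_4": (3, 0), "ref_3": (5, 1)}
```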

Change-Id: I938099284e34a81886621f6a389f34abc51e78ba
martinbskold pushed a commit to martinbskold/mysql-server that referenced this pull request May 2, 2023
The shadow tables used by the ndb_binlog thread when handling data
events have no knowledge about whether the table is a foreign key parent
in NDB. This is mainly because the NDB dictionary extra metadata for the
parent table is not updated when creating the child tables. The
information about being a parent table is required in order to properly
detect a transaction conflict while writing changes of parent key tables
to the binlog.

This is fixed by extending the ndb_binlog thread with a metadata cache
that maintains a list of tables in NDB which are foreign key parents.
This makes it possible to update the shadow table's knowledge about
being parent table while handling data events.

Change-Id: Ic551fde3e9460e3668d5aa1ac951ee3bec442cf5
martinbskold pushed a commit to martinbskold/mysql-server that referenced this pull request May 2, 2023
If mysql_bind_param() fails, it crashes.

glibc reports: double free or corruption (fasttop)

ASAN reports: heap-use-after-free

  mysql#1 0x55ad770cda23 in mysql_bind_param libmysql/libmysql.cc:2477:9

freed by thread T0 here:

  mysql#1 0x55ad770cda23 in mysql_bind_param libmysql/libmysql.cc:2477:9

The wrong counter-variable is used which results in the same index being
freed again and again.

Change
------

- use the right index-variable when freeing ->names if
  mysql_bind_param() fails.

Change-Id: I580267d5913c55b00151409c25d90f2ffe1b4119
martinbskold pushed a commit to martinbskold/mysql-server that referenced this pull request May 2, 2023
Compilation warning/error:
storage/ndb/test/src/UtilTransactions.cpp:1544: error: variable 'eof'
may be uninitialized when used here [-Werror,-Wconditional-
uninitialized]

Manual code inspection also shows there are problems with the
retry logic as well as with releasing resources.

Fix by refactoring `verifyTableReplicasWithSource` into functions where
the retry, scan and compare logic are separated. This makes it possible
to make sure that the "eof" variable is always initialized while
fetching the next row during scan.

Change-Id: I111e0a6613622aa2279a5ec5845e48e8ca0e115f
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
Introduce class NdbSocket, which includes both an ndb_socket_t
and an SSL *, and wraps all socket operations that might use
TLS.

Change-Id: I20d7aeb4854cdb11cfd0b256270ab3648b067efa
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
The patch for
    WL#15130 Socket-level TLS patch mysql#1: class NdbSocket
re-introduced a -Wcast-qual warning.

Use const_cast, and reinterpret_cast to fix it, since the corresponding
posix version of ndb_socket_writev() is const-correct.

Change-Id: Ib446a926b4108edf51eda7d8fd27ada560b67a24
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Aug 3, 2023
…etwork

https://bugs.mysql.com/bug.php?id=109668

Description
-----------
GR suffered from problems caused by security probes and network scanner
processes connecting to the group replication communication port. This
usually is not a problem, but poses a serious threat when another member
tries to join the cluster by initiating a connection to a member which
is affected by external processes occupying the port dedicated to group
communication for longer durations.

During such activities by external processes, the SSL-enabled server
stalled forever on the SSL_accept() call waiting for handshake data.
Below is the stacktrace:

    Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)):
    #0 in read ()
    mysql#1 in sock_read ()
    mysql#2 in BIO_read ()
    mysql#3 in ssl23_read_bytes ()
    mysql#4 in ssl23_get_client_hello ()
    mysql#5 in ssl23_accept ()
    mysql#6 in xcom_tcp_server_startup(Xcom_network_provider*) ()

When the server stalled in the above path forever, it prevented other
members from joining the cluster, resulting in the following messages in
the joiner server's logs.

    [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
    [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'

Solution
--------
This patch adds two new variables

1. group_replication_xcom_ssl_socket_timeout

   It is a file-descriptor level timeout in seconds for both accept() and
   SSL_accept() calls when group replication is listening on the xcom port.
   When set to a valid value, say for example 5 seconds, both accept() and
   SSL_accept() return after 5 seconds. The default value has been set to 0
   (waits infinitely) for backward compatibility. This variable is effective
   only when GR is configured with SSL.

2. group_replication_xcom_ssl_accept_retries

   It defines the number of retries to be performed before closing the socket.
   For each retry the server thread calls SSL_accept() with the timeout
   defined by group_replication_xcom_ssl_socket_timeout for the SSL handshake
   process once the connection has been accepted by the first accept() call.
   The default value has been set to 10. This variable is effective only when
   GR is configured with SSL.

Note:
- Both of the above variables are dynamically configurable, but will become
  effective only on START GROUP_REPLICATION.
- This patch is only for the Linux systems.
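As a hedged illustration (the variable names come from the description above; the values are examples, not recommendations), the two options could be set in an option file before START GROUP_REPLICATION:

```ini
[mysqld]
# Fail accept()/SSL_accept() after 5 seconds instead of blocking forever
group_replication_xcom_ssl_socket_timeout = 5
# Retry the SSL handshake up to 10 times before closing the socket
group_replication_xcom_ssl_accept_retries = 10
```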
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Part of WL#15135 Certificate Architecture

This patch introduces a set of C++ classes to implement the creation of
private keys, PKCS#10 signing requests, and X.509 certificates for NDB
cluster.

The TlsSearchPath class provides searching for files over a delimited
list of directories.

The PrivateKey and Certificate classes provide simple wrappers over the
OpenSSL routines to create, free, save, and open keys and certificates.

Classes PendingPrivateKey and PendingCertificate implement file naming
conventions and life cycle for pending key pairs; ActivePrivateKey
and ActiveCertificate implement them for active key pairs. Class
SigningRequest provides the naming conventions and life cycle for
PKCS#10 CSRs.

Class NodeCertificate is the primary in-memory representation of a
node's TLS credentials.

A unit test, NodeCertificate-t, is intended to thoroughly test the
whole suite of classes. It should be possible to run this test under
valgrind with no reported leaks.

Change-Id: I76bf719375ab2a9b6a97245e326158a49dde28c2
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
This is a complete implementation of ndb_sign_keys.

It searches --ndb-tls-search-path for node certificate and key
files, and additionally searches --CA-search-path for CA-related
key and certificate files.

It includes three methods for remote key signing:
  With --remote-CA-host, run ndb_sign_keys remotely, using ssh.
  With --remote-openssl, run openssl on the remote host, using ssh.
  With --CA-tool, run a local signing helper tool.

Change-Id: I5d93b702a667fa98d820ed150631a91e8444b8d7
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Post-push fix for : WL#15166 patch #1 ndb_sign_keys

DWORD is 'unsigned long' not int
Remove an unused local variable.
C-style cast (LPSTR) drops const qualifier [-Wcast-qual]

Change-Id: I059ad8a5a5f6b1cc644456576a8acff9a78331e3
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Add boolean parameter "RequireCertificate" to [DB] section.
Default is false. If true, node will fail at startup time unless
it finds a TLS key and a current valid certificate.

Add boolean parameter "RequireTls" to [DB] section. Default is
false. If true, every transporter link involving the data node
must use TLS.

Add boolean parameter "RequireTls" to [TCP] sections. This is
computed, and not user-settable. If either endpoint of a link
has RequireTls set to true, RequireTls for the link will be
set true.

Add some clarifying comments to ndbinfo_plans test.

Change-Id: I889d9b7563022e2ebb2eaae92c3b26b557180d40
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Add an MGM protocol command to turn a plaintext mgm api session
into a TLS session. Add three new MGM API functions:
  ndb_mgm_set_ssl_ctx()
  ndb_mgm_start_tls()
  ndb_mgm_connect_tls()

Define two client TLS requirement levels:
 CLIENT_TLS_RELAXED, CLIENT_TLS_STRICT

This adds a new test: testMgmd -n StartTls

Change-Id: Ib46faacd9198c474558e46c3fa0538c7e759f3fb
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Post push fix. Remove added C++ dependencies in C header mgmapi.h.

 - forward declare SSL_CTX.
 - add missing struct keyword with ndb_mgm_cert_table and
   ndb_mgm_tls_stats
 - make ndb_mgm_set_ssl_ctx return int instead of bool as other mgmapi
   functions do.

Change-Id: I493b4c4fb1272974e1bb72e35abb08c8cef1a534
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Do not allow ndb_mgm_listen_event to return a socket that uses TLS since
the user cannot access the corresponding SSL object through the public
MgmAPI.

Change-Id: I2a741efe4f80db750419101ecabb03fb5e025346
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Make NdbSocket::ssl_readln return 0 on timeout.

Change-Id: I4cad95abd319883c16f2c28eff5cf2b6761731d6
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Add missing socket close in testMgmd -n StartTls.

Change-Id: Ia446b522ad2698f63d588d3c52122df8735765c7
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Problem
================================

Group Replication ASAN run failing without any symptom of a
leak, but with shutdown issues:

worker[6] Shutdown report from
/dev/shm/mtr-3771884/var-gr-debug/6/log/mysqld.1.err after tests:
 group_replication.gr_flush_logs
group_replication.gr_delayed_initialization_thread_handler_error
group_replication.gr_sbr_verifications
group_replication.gr_server_uuid_matches_group_name_bootstrap
group_replication.gr_stop_async_on_stop_gr
group_replication.gr_certifier_message_same_member
group_replication.gr_ssl_mode_verify_identity_error_xcom

Analysis and Fix
================================

It ended up being a leak on gr_ssl_mode_verify_identity_error_xcom test:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f1709fbe1c7 in operator new(unsigned long)
      ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x7f16ea0df799 in xcom_tcp_server_startup(Xcom_network_provider*)
      (/export/home/tmp/BUG35594709/mysql-trunk/BIN-ASAN/plugin_output_directory
        /group_replication.so+0x65d799)
    #2 0x7f170751e2b2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc2b2)

This happens because we delegated incoming connections
cleanup to the external consumer in incoming_connection_task.
Since it calls incoming_connection() from
Network_provider_manager, in case of a concurrent stop,
a connection could be left orphan in the shared atomic
due to the lack of an Active Provider, thus creating a
memory leak.

The solution is to make this cleanup on
Network_provider_manager, on both stop_provider() and in
stop_all_providers() methods, thus ensuring that no
incoming connection leaks.

Change-Id: I2367c37608ad075dee63785e9f908af5e81374ca
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
BUG#35949017 Schema dist setup lockup
Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#2]
Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#1]
Bug#32550019 Missing check for ndb_schema_result leads to schema dist timeout

Change-Id: I4a32197992bf8b6899892f21587580788f828f34
msprajap pushed a commit that referenced this pull request Jan 16, 2024
… cache [#1]

Problem:
A MySQL Server which has been disconnected from schema distribution
fails to set up event operations since the columns of the table can't be
found in the event.

Analysis:
The ndbcluster plugin uses NDB table definitions which are cached by the
NdbApi. These cached objects are reference counted and there can be
multiple versions of the same table in the cache, the intention is that
it should be possible to continue using the table even though it
changes in NDB. When changing a table in NDB this cache needs to be
invalidated, both on the local MySQL Server and on all other MySQL
Servers connected to the same cluster. Such invalidation is especially
important before installing in DD and setting up event subscriptions.

The local MySQL Server cache is invalidated directly when releasing the
reference from the NdbApi after having modified the table.

The other MySQL Servers are primarily invalidated by using schema
distribution. Since schema distribution is event driven the invalidation
will happen promptly but as with all things in a distributed system
there is a possibility that these events are not handled for some
reason. This means there must be a fallback mechanism which
invalidates stale cache objects.

The reported problem occurs since there is a stale NDB table definition
in the NdbApi, it has the same name but different columns than the
current table in NDB. In most cases the NdbApi continues
to operate on a cached NDB table definition but when setting up events
the "mismatch on version" will be detected inside the NdbApi (due to the
relation between the event and the table). This causes the cache to be
invalidated and the current version to be loaded from NDB. However the
caller is still using the "old" cached table definition, and thus when
trying to subscribe to the columns they cannot be found.

Solution:

1) Invalidate NDB table definition in the schema event handler that
handles new table creation. This covers the case where a table is
dropped directly in NDB using for example ndb_drop_table or ndb_restore
and then subsequently created using SQL. This scenario is covered by the
existing metadata_sync test cases, which would be detected by 4) before
this part of the fix.

2) Invalidate NDB table definition before table schema synchronization
installs tables in DD and sets up event subscriptions. This function
handles the case when schema distribution is reconnecting to the cluster
and a table it knew about earlier has changed while schema distribution
event handlers have not been active. This scenario is tested by the
drop_util_table test case.

3) Invalidate NDB table definition when the schema distribution event
handlers for drop table and cluster failure occur. At this time it's
well known that the table does not exist or its status is unknown.
Earlier this invalidation was only performed if there was a version
mismatch in the event vs. table relation.

4) Detect when the problem occurs by checking that the NDB table
definition has not been invalidated (by NdbApi event functions) in the
function that sets up the event subscription. It's currently not
possible to handle the problem this low down, but at least it can be
detected and a fix added to the callers. This detection is only done in
debug builds.

Change-Id: I4ed6efb9308be0022e99c51eb23ecf583805b1f4