
[pull] 8.0 from mysql:8.0#3

Merged
pull[bot] merged 1275 commits into Mu-L:8.0 from mysql:8.0
Oct 11, 2022

Conversation


@pull pull bot commented Oct 11, 2022

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

kahatlen and others added 30 commits August 8, 2022 17:02
Post-push fix for a spurious unit test failure.

HypergraphOptimizerTest.HashJoinWithSubqueryPredicate expects a join
condition to be (t2.y = t3.y) AND (t2.x = t3.x), but it occasionally
sees the equivalent form (t2.x = t3.x) AND (t2.y = t3.y) and fails.

The choice between the two join conditions is quite arbitrary, and it
does not matter which one is selected for what the unit test is
intended to test. The unit test case is therefore changed to accept
either ordering of the equijoin conditions. (It is also changed to
verify that the join has no non-equijoin conditions, as that is
essential to the test case, even though it does not directly impact
the test failures.)

The source of the instability seems to be this code in
MakeJoinHypergraph(), which removes duplicates from an array of
multiple equalities:

  std::sort(multiple_equalities.begin(), multiple_equalities.end());
  multiple_equalities.erase(
      std::unique(multiple_equalities.begin(), multiple_equalities.end()),
      multiple_equalities.end());

The problem is the sorting of pointers, as it makes the ordering of
the elements dependent on where in the virtual memory area the objects
have been allocated, which is not deterministic.

We might also want to change this code to become deterministic to
avoid similar problems popping up later.

Change-Id: I29ccd477a773a0a2153e39bc296c95e0d596dbcc
An aggregate query on a BIT field returns a value which is formatted
as a bit string, but has the BINARY flag. This is because the
BINARY flag is automatically added to all result items which have a
binary charset. This should not be done for BIT results since they
are always formatted as bit strings.

This is fixed by adding a check to skip setting the BINARY flag for
BIT results.

Change-Id: If73f511cb27a4e63436bf7e48305bf7b7d197859
              optimizer is ignoring the STRAIGHT_JOIN HINT

Contrary to what the bug title says, hypergraph is not ignoring the
straight_join hint. Since straight joins are associative,
it still produces a valid plan as per the directives.
However, it does not pick the ideal plan. When hints are
not present, it does indeed pick a much better plan. For
heatwave this is a problem because of the number of partial
plans that are generated. When straight_join hints are given,
only a few join combinations are possible, and with these
limited combinations, the order in which the join conditions
get pushed down to each of these joins makes the best join
combination impossible in this particular case.
E.g.
For a query of this form (SJ - straight_join):
t1 SJ t2 SJ t3 SJ t4 with 4 join conditions:
t3->t4 has a non equality join condition
t2->t4 has an equijoin condition
t1->t3 has an equijoin condition
t1->t2 has an equijoin condition
The call to optimize_cond() places all the equalities at the end of
the final WHERE/join condition, so the inequality is placed first
in this case.
When the optimizer initially tries to push the conditions down to
joins, it sees the non-equality join condition first and
pushes it down to a join between t3->t4 (two large tables in the case
of the non-performant TPC-DS query).
Then it sees the (t2->t4) equijoin condition. It concludes that
it is better to place this condition on the join between (t2)->(t3t4)
for two reasons: first, the used tables for the join t3->t4 already
include t3 and t4 because of the join condition placed there, and
second, t3 needs to be seen before t4 (because of straight_join).
For the next join condition between (t1->t3), it places it on the
join between (t1,(t2t3t4)) and the same with the last condition.
The same reasoning persists when creating edges resulting
in the join between (t3,t4) which are large tables in the actual
use case.
However, if the optimizer were to see the equalities first,
it would place the join condition in such a way that the following
join was possible ((t1 HJ t2) HJ t3) HJ t4) which is still valid as
per the hints but a much better join.
The same problem does not happen when the hints are not used
because many other joins are possible and therefore it could create
more edges.
It could be argued that the optimizer should have been able to figure
out the valid edge between t1->t3. However, it is easier to fix this
particular case by just sorting the conditions so that the equalities
are seen earlier, so we take the easier route for now; putting
equijoin conditions ahead of others does no harm.
At this point the predicates are not ordered in any way, except
for optimize_cond() doing a bit of sorting for the multiple
equalities. So in this patch, we put all the equijoin conditions
at the beginning, followed by the other types of conditions. We also
make sure that the sorting that was previously done while creating
hash join conditions is taken care of here as well, i.e. putting all
the conditions with subqueries and the expensive conditions at the
end of the list.
This results in a few failing unit tests, and the necessary changes
have been made to those tests.

Change-Id: I6880ffbe6b3fa177a3fbb7612aed3f46eb6440d7
Upgrade the bundled googletest/googlemock source to the most recent version,
which is currently 1.12.0

The workaround for -Werror=maybe-uninitialized for googlemock matchers
can be removed, the bug has been fixed upstream.

git add extra/googletest/googletest-release-1.12.0/googlemock/
git add extra/googletest/googletest-release-1.12.0/googletest/
git rm -rf extra/googletest/googletest-release-1.12.0/googlemock/test
git rm -rf extra/googletest/googletest-release-1.12.0/googletest/test

git rm -rf extra/googletest/googletest-release-1.11.0

Visual Studio fails to compile any usage of testing::WithArgs<0, ...>
gmock-actions.h(1432,59):error C2039: 'type' is not a member of 'std'

The workaround is to add extra arguments to:
  Mock_dd_field_varstring::fake_store()
  Mock_dd_field_varstring::fake_val_str()
so that we can remove WithArgs<0> and WithArgs<1>.

Similarly for
  mock_gcs_xcom_state_exchange_interface::free_xcom_member_state()
  mock_gcs_xcom_state_exchange_interface::free_members_joined()
  suspect_is_me()
  suspect_is_2()
  copy_message_content()

The macro GTEST_DISALLOW_COPY_AND_ASSIGN_ is no longer provided by
googletest headers. The fix is to expand the old definition of the
macro at all places it was used.

Some source files need COMPILE_FLAGS "/bigobj" with Visual Studio.

Change-Id: Id078f63163e672ac6949901c664a712a00ec4d72
(cherry picked from commit 69ef1a1621b85be54dbdb32a26acaee099f40f6c)
Additional patch:
remove obsolete comments for GTEST_DISALLOW_COPY_AND_ASSIGN_

Change-Id: Ice549fff87b3a2a87d961d3e46863e0e38b983b4
(cherry picked from commit 32fdf87cb3380582a5d8faa917101c8d8b7999c5)
…44 to mysql-trunk

Backport from mysql-trunk.

All queries where there are UNION DISTINCT followed by UNION ALL were
not offloaded on mysql-trunk. Queries are of the form
1. (SELECT ...) UNION DISTINCT (SELECT ...) UNION ALL (SELECT ...)
2. CREATE VIEW V1 AS (SELECT ...) UNION
         DISTINCT (SELECT ...) UNION ALL (SELECT ...)

Problem: In queries with UNION ALL, the STREAM access path did not find
the correct JOIN pointer/query block, and hence did not find the correct
compilation context, which led to failing assertions.

Solution:
Since UNION (ALL) is a set operation and query_terms are used in
MySQL to represent set operations, and they are available on mysql-trunk,
we use query_terms to resolve query_blocks for union children.
This solution introduces query_term in the TranslateState to facilitate
this solution and the solution to BUG#34162247. Traversing the query_term
tree additionally helps improve the following:
1. Readability: Fewer corner cases to check for
     when accessing query blocks in the presence of set operations.
2. Maintainability: Fewer single-use code paths necessary to obtain
     correct query blocks. This will be fully accomplished after
     closing BUG#34323639.
3. Speed: Traversing the query term tree is faster than traversing
     the AP tree in some cases, since AP trees can be nested very
     deep, whereas query term trees nest only as much as the
     maximum depth of set operations.
Conceptually, this solution also helps to resolve the following
challenges which arise in translating the access path with
query terms:
1. Find appropriate query term,
   when multiple query terms are nested in 1 Query Expression and
2. Find appropriate Union Child's Join pointer
   when we have nested query terms.

Change-Id: I69f1028a71eff8fd45ff55f73a0ee51273476e75
…Query_terms Printing QueryTerm AST in APTree [1/2]

Backport from mysql-trunk.

* Print Query-term tree along with AccessPath.

* Replaces Query-block with the appropriate Query-terms on an as-needed basis.

* Introduces a parent and child relation between Query-term nodes to represent the Query-term AST.

* Example:
  Query: CREATE VIEW v AS ((SELECT a1, a2 FROM A UNION DISTINCT SELECT b1, b2 FROM B ORDER BY a1 ASC LIMIT 10) ORDER BY a1,a2 DESC LIMIT 7);
  APTree:
  -> (qt_query_block) ptr=0x7f09f4e16e98 parent=(nil) num_children=0 m_block=0x7f09f4e16e98 select_number=1 join=0x7f09f52c8780 master_expression_qt=0x7f09f4e16e98 MATERIALIZE (0x7f09f52da6f8) (derived table)
    -> (table_path) TABLE_SCAN (0x7f09f52da340) (on v)
    -> (qt_unary) ptr=0x7f09f52bced0 parent=(nil) num_children=1 m_block=0x7f09f52bdb60 m_block_id=5 m_block_join=0x7f09f52ca2e8 master_expression_qt=0x7f09f52bced0 limit=7 offset=0 LIMIT_OFFSET (0x7f09f52cb2d8) (limit=7, offset=0)
        -> SORT (0x7f09f52cb1f0)
            -> MATERIALIZE (0x7f09f52d9b18) (temporary table)
                -> (table_path) TABLE_SCAN (0x7f09f52cb160) (on <result temporary>)
                -> (qt_union) ptr=0x7f09f4e14770 parent=0x7f09f52bced0 num_children=2 m_block=0x7f09f4e14bf0 m_block_id=4 m_block_join=0x7f09f52cb368 master_expression_qt=0x7f09f52bced0 limit=10 offset=0 LIMIT_OFFSET (0x7f09f52cc340) (limit=10, offset=0)
                    -> SORT (0x7f09f52cc258)
                        -> MATERIALIZE (0x7f09f52ccd48) (union)
                            -> (table_path) TABLE_SCAN (0x7f09f52cc1c8) (on <union temporary>)
                            -> (qt_query_block) ptr=0x7f09f4e2cca0 parent=0x7f09f4e14770 num_children=0 m_block=0x7f09f4e2cca0 select_number=2 join=0x7f09f52c8c08 master_expression_qt=0x7f09f52bced0 TABLE_SCAN (0x7f09f52c96e8) (on A)
                            -> (qt_query_block) ptr=0x7f09f4e12eb8 parent=0x7f09f4e14770 num_children=0 m_block=0x7f09f4e12eb8 select_number=3 join=0x7f09f52c9778 master_expression_qt=0x7f09f52bced0 TABLE_SCAN (0x7f09f52ca258) (on B)

Change-Id: I592e95f98b1c9b25b07d76a53957aa8bdbaaa7fc
Another case of missing thd error check, this time in
Json_table_column::fill_column()

Change-Id: I06adb5cd685a8076393edcba505210eb22f60e92
(cherry picked from commit 74f95358938e2448068265775c0d652ae4831f5f)
Post-push fix for broken build with gcc 13.0.0

include/hexify.h:47:42: error:
ISO C++ forbids declaration of type name with no type [-fpermissive]
reinterpret_cast<const uint8_t *>

Change-Id: I5b506cd8293975f33ec8d3eebcaafa5aaa434d93
(cherry picked from commit 30de6e96d17fecb412f31e106a771a743876875d)
GROUP CONCAT - (without orderby)
===============================================
Support for Group concat aggregation in Heatwave.
-Group_concat(expr) returns a string result with the concatenated
non-NULL values from a group.

High level Design:
==================
All the input columns are concatenated with a CONCAT expression. The
concatenated string is combined over rows by applying the group_concat
aggregation.

A separator is passed as the second argument to the group concat and
it is added while concatenating the rows. The passing of the separator
string also requires the removal of the limitation of having only one
input to the aggregate operator.

Session parameter group_concat_max_len is not supported beyond maximum
column width.

Current limitations:
-> Group-concat with roll-up queries is not supported.
-> Group-concat functions with ORDER BY are not supported.
-> Dictionary-encoded columns are not supported.

Tests are added as part of func_gconcat.test
Result files are updated with new offloads to Heatwave.

All QA identified bugs are fixed

Change-Id: I77dd330dc4790aeba02e17db50347f1dbd8a3d8b
Avoid test failure due to timeouts.

The timeout limit was set to 20 minutes, while the test seems to take
at least 15 minutes and oftentimes more. The patch doubles the timeout limits.

Change-Id: Id041966092579109560d212d4e4c014307401c30
Change-Id: I3b8ba5235b564b5da4b46450c94fdabb3f9fc544
…ential overhead.

The parallel (lock) queue lists in DbAcc will contain all shared locks granted
to the 'same row' when the row is read multiple times using a unique key in
the same transaction (or query). In particular this may happen when the same
row is joined with other rows in a large join query.

WITH_DEBUG compiled binaries, as well as binaries compiled for ERROR_INSERT,
will 'validate' the lock queues in order to check their consistency,
which involves traversing the entire lock lists.
This adds an exponential overhead to the code, which could
end up totally stalling the data nodes. 'Overslept' warnings, up to
node disconnects and the arbitrator shutting down nodes, may be seen.

The patch adds an upper limit (of 42) on how many queue items we validate
before giving it a 'pass'.

We usually seem to insert, remove and *validate* the lists starting
either from the 'lockOwner' (the head) or the last item, and validate
the lists from this starting position. The list contents in between
the head and the tail change little. Thus, not validating the entire
huge lists should not lose much test coverage.

Change-Id: Ib3554844ecd5c143bca995bbaa83cac2ff2653df
(cherry picked from commit a677c64bf759cd7446005db9f66aa1c1f904d06a)
…ra predicate

A condition like t1.a = t2.a + t3.a is not considered an equi-join condition
because the right side of the condition has more than one table. As a result,
this makes it to the extra predicate of the hash join. However, if the
optimizer knows that "t2.a = t3.a", the above condition could well be
transformed into t1.a = t2.a + t2.a which could be used as an equi-join
condition. This is done in the old optimizer after the join order is decided.
Hence, HeatWave is able to offload the query with the old optimizer but
not with the hypergraph optimizer.
The solution is to propagate such equalities into all conditions that are not
multiple equalities. This results in several unit tests failing, because some
of these fields in non-equalities were getting replaced with another table's
field, or the conditions were written in such a way that the optimizer could
now detect that they are always false. E.g. t1.x = t2.x AND t1.x <> t2.x
would now be transformed into t1.x = t2.x AND t1.x <> t1.x. To retain the
original intention of the test cases, all such cases are changed to use a
different field from the same table.

Change-Id: Iaf9c8b393bb563867c6ac56bc637804780648269
derived table transformation missing

Post-push fix: The calculation of disallowed tables mixed NodeMap and
table_map. The node numbers and the table numbers are often the same,
but not always, so this could cause incorrect behaviour.

Fixed it by using table_map consistently.

Change-Id: I2242c651711c793e769063ec1d1da1351c18f4f8
by Bug#33732907).

This relates to the fact that when running in "normal mode",
we see one COM_QUERY row for each statement, whereas when running
with --ps-protocol, we see a COM_STMT_PREPARE/COM_STMT_EXECUTE
pair for each statement. This affects counters and log file/table
contents.

This test now accounts for the mode it runs in and adjusts row
counts accordingly. Since we no longer show individual log lines
in the result, we now take separate counts of the lines of interest
with, and without, obfuscated passwords.

Change-Id: Id267a87f4ae1df835a2974dfc405d9aeee0ca8fe
…lgorithm used

  Issue:
    Tables with INSTANT ADD/DROP columns do not support
    tablespace import without the .cfg file. The code
    throws an error in the release build (as expected)
    but hits a debug assert in the debug build. The debug
    build should throw a similar error instead of hitting
    an assert. The behavior should be consistent in both
    release and debug builds.

  Fix:
    Raise the error at the appropriate location and handle
    the same.

Change-Id: Ifd15f8a6285183f9ecbba08a8d2b56fb62790c86
               stored functions in a connection

Description:
============
Consider a stored routine having a SELECT as a substatement
that contains multiple AND/OR/XOR conditions. Executing the
stored routine a large number of times leads to high
consumption of virtual memory, eventually leading to OOM.

Analysis:
=========
Query_block is part of the AST that represents the query (a
SELECT), and it is created and populated when the statement
is prepared. However, it is not supposed to be updated after
preparation, for each execution it is supposed to be
read-only. For a regular statement, the whole AST is torn
down after preparation and execution. That's not the case
for stored routines.

The consequence is that one of its attributes
(Query_block::cond_count), which stores the number of
arguments of AND/OR/XOR in the WHERE/HAVING/ON clause(s) of
the SELECT query, is not reset for each execution of the
stored routine. For each execution, the attribute's value is
multiplied further. This attribute's value is used to determine the
memory allocation size for Key_field structure in
update_ref_and_keys(). The memory grows linearly as the
number of calls to this function increases.

Fix:
====
Query_block::cond_count is saved during preparation,
incremented appropriately during optimization, and
restored to the saved value before the next execution of
the stored routine.

Change-Id: I59d38d5015b03111dbe4b633d456bee809942173
               SUBQUERY - REGRESSION

Description:
Mysqlpump may not work in cases where derived tables are generated.
This is due to privileges not being set for those tables.

Fix:
Backport. Derived tables are handled separately and privileges are set.

Gerrit#16933

Change-Id: Ia31d479ed2d7aba47993d2e646efb84651ee784a
…filter condition

Strings with lots of leading zeros were incorrectly marked with
overflow when converting to decimal. The value returned was the
maximum decimal value for the given precision and scale.

The fix is to replace leading zeros with a single '0'.

Also convert the macro FIX_INTG_FRAC_ERROR() to an inline function
fix_intg_frac_error().

Change-Id: I371d0ebc451c5ea77f7e8ebd9e411606f9cba01f
(cherry picked from commit 7ccf63d0636477c2b494bda5262331e467d2c563)
encountering semi join + outer join
Bug#34448530: Hypergraph Offload Issue : Filter condition not pushed down

Some queries that used to be offloaded to HeatWave with the old
optimizer, were not offloaded when using the hypergraph optimizer.
These were queries that had gone through a subquery_to_derived
transformation which ended up attaching a degenerate join condition to
a left outer join.

HeatWave refuses to offload a query if an outer join condition
references only tables on the inner side of the join. Such conditions
are safe to push down on the inner side. In the failing cases,
however, it cannot be done because pushing the condition down to a
join below will just move the problem and make the join condition of
the other join inedible to HeatWave. Pulling it up into the WHERE
clause is also not an option (for correctness). So, for now, the
condition stays in the ON clause.

To work around the issue, ProposeHashJoin() is taught to detect this
case, and manually move the degenerate join condition from the join
predicate to a filter node on top of the right child of the join. This
is only done when optimizing the query for execution in a secondary
storage engine. If HeatWave learns how to handle these join
conditions, the workaround can be removed.

Change-Id: I1f2a2569afa67e1491f265685c3370df759b605b
… should be atomically set

- A new structure `btr_search_prefix_info_t` is created to hold information on the prefix of the row used for AHI caching.
- All AHI-related fields in `buf_block_t` are moved to a new structure `buf_block_t::ahi_t`. It is better documented, including a description of the latching protocol involved. This structure is accessed using `block->ahi`.
- The `btr_search_prefix_info_t` is used in multiple places as an atomic, with a static assertion that it is lock-free.
- A new structure `btr_search_sys_t::search_part_t` is created to contain all state private to each AHI partition.
- `btr_cur_update_in_place()` no longer takes an X-latch on the AHI before doing the modification.
- `btr_search_update_hash_ref()` and `btr_search_update_hash_on_insert()` check whether the AHI is disabled after acquiring the X-latch. This was a race condition that could lead to an incorrect state of the AHI data structures.
- `btr_search_drop_page_hash_index()` is better described with inline comments now. It does not take the S-latch at the beginning at all, and the dereference of block->ahi.index is proved to be valid in all usages. It also correctly handles, without race conditions, the case where the AHI is disabled in the meantime: if it is a private block, in `buf_LRU_free_page()`, we clear the block's reference, and otherwise we do nothing.
- `btr_search_set_block_not_cached()` is introduced and called in places where block->ahi.index was set to null. It now always checks whether the value was non-null and updates the index's reference count.
- `btr_search_build_page_hash_index()` and `btr_search_move_or_delete_hash_entries()` no longer take the S-latch at the beginning of processing.
- `btr_search_disable()` now returns true iff the AHI was enabled and the call changed the status to disabled. This is used in `buf_pool_resize()` to judge whether it should restore the status after the resize completes. A new mutex `btr_search_enabled_mutex` is used to synchronize AHI disabling and enabling, so that enabling is not executed before all AHI clearing operations are completed.
- The heaps backed by buffer pool blocks that are used in the AHI now store a free block in a pointer to an atomic variable. This allows `btr_search_sys_resize()` not to leak a block that could be allocated concurrently in `btr_search_check_free_space_in_heap()`.
- A race condition between `buf_LRU_free_page()` and `btr_search_drop_page_hash_index()` is removed: `buf_LRU_free_page()` now forces page removal from the AHI even if the AHI is (being) disabled.
- Fixed a big bug in the ut0new infrastructure on Solaris: destructors for aligned arrays were not called.
- Added the `ut::unique_ptr_aligned` type to allow naming the type returned by the methods allocating aligned memory.
- `btr_search_disable()` now waits for all blocks to stop referencing indexes through the AHI that may be processed by some thread executing `buf_LRU_free_page()`, before allowing the AHI to be enabled again.
- `rw_lock_t`'s destructor is improved so it can be called directly.

RB#27411
Change-Id: I37bb68e0c48416ca5f578ece33c61e914277c62f

Change-Id: I3e285ea7b6f83c347257ecb791b1c87a03653c4b
@pull pull bot added the ⤵️ pull label Oct 11, 2022
@pull pull bot merged commit a246bad into Mu-L:8.0 Oct 11, 2022
pull bot pushed a commit that referenced this pull request Apr 27, 2023
Missing initializers for m_verbosity and m_util_latest_gci

storage/ndb/test/src/UtilTransactions.cpp:35:24: warning: 2
uninitialized fields at the end of the constructor call [clang-analyzer-
optin.cplusplus.UninitializedObject]

Removed unused member m_defaultClearMethod

Change-Id: I46617df4a1ea3c3c56a8d82e999dcc01019232a7
pull bot pushed a commit that referenced this pull request Apr 27, 2023
  # This is the 1st commit message:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA

  Problem Statement
  -----------------
  Currently customers cannot enable heatwave analytics service to their
  HA DBSystem or enable HA if they are using Heatwave enabled DBSystem.
  In this change, we attempt to remove this limitation and provide
  failover support of heatwave in an HA enabled DBSystem.

  High Level Overview
  -------------------
  To support heatwave with HA, we extended the existing feature of auto-
  reloading of tables to heatwave on MySQL server restart (WL-14396). To
  provide seamless failover functionality for tables loaded to heatwave,
  each node in the HA cluster (group replication) must have the latest
  view of the tables which are currently loaded to the heatwave cluster
  attached to the primary, i.e., the secondary_load flag should always
  be in sync.

  To achieve this, we made following changes -
    1. replicate secondary load/unload DDL statements to all the active
       secondary nodes by writing the DDL into the binlog, and
    2. Control how secondary load/unload is executed when heatwave cluster
       is not attached to node executing the command

  Implementation Details
  ----------------------
  Current implementation depends on two key assumptions -
   1. All MDS DBSystems will have RAPID plugin installed.
   2. No non-MDS system will have the RAPID plugin installed.

  Based on these assumptions, we made certain changes w.r.t. how server
  handles execution of secondary load/unload statements.
   1. If secondary load/unload command is executed from a mysql client
      session on a system without RAPID plugin installed (i.e., non-MDS),
      instead of an error, a warning message will be shown to the user,
      and the DDL is allowed to commit.
   2. If secondary load/unload command is executed from a replication
      connection on an MDS system without heatwave cluster attached,
      instead of throwing an error, the DDL is allowed to commit.
   3. If no error is thrown from secondary engine, then the DDL will
      update the secondary_load metadata and write a binlog entry.

  Writing to the binlog implies that all consumers of the binlog now
  need to handle this DDL gracefully. This has an adverse effect on
  Point-in-time Recovery. If the PITR backup is taken from a DBSystem
  with heatwave, it may contain traces of secondary load/unload
  statements in its binlog. If such a backup is used to restore a new
  DBSystem, the restore will fail while trying to execute statements
  from its binlog because
   a) the DBSystem will not have a heatwave cluster attached at this
      time, and
   b) statements from the binlog are executed from a standard mysql
      client connection, thus making them indistinguishable from
      user-executed commands.
  Customers will be prevented (by the control plane) from using PITR
  functionality on a heatwave-enabled DBSystem until there is a
  solution for this.

  Testing
  -------
  This commit changes the behavior of secondary load/unload statements, so it
   - adjusts existing tests' expectations, and
   - adds a new test validating new DDL behavior under different scenarios

  Change-Id: Ief7e9b3d4878748b832c366da02892917dc47d83

  # This is the commit message #2:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (PITR SUPPORT)

  Problem
  -------
  A PITR backup taken from a heatwave enabled system could have traces
  of secondary load or unload statements in binlog. When such a backup
  is used to restore another system, it can cause failure because of
  following two reasons:

  1. Currently, even if the target system is heatwave enabled, heatwave
  cluster is attached only after PITR restore phase completes.
  2. When entries from binlogs are applied, a standard mysql client
  connection is used. This makes them indistinguishable from other user
  sessions.

  Since secondary load (or unload) statements are meant to throw error
  when they are executed by user in the absence of a healthy heatwave
  cluster, PITR restore workflow will fail if binlogs from the backup
  have any secondary load (or unload) statements in them.

  Solution
  --------
  To avoid PITR failure, we are introducing a new system variable
  rapid_enable_delayed_secondary_ops. It controls how load or unload
  commands are to be processed by rapid plugin.

    - When turned ON, the plugin silently skips the secondary engine
      operation (load/unload) and returns success to the caller. This
      allows secondary load (or unload) statements to be executed by the
      server in the absence of any heatwave cluster.
    - When turned OFF, it follows the existing behavior.
    - The default value is OFF.
    - The value can only be changed when rapid_bootstrap is IDLE or OFF.
    - This variable cannot be persisted.

  In PITR workflow, Control Plane would set the variable at the start of
  PITR restore and then reset it at the end of workflow. This allows the
  workflow to complete without failure even when heatwave cluster is not
  attached. Since metadata is always updated when secondary load/unload
  DDLs are executed, when heatwave cluster is attached at a later point
  in time, the respective tables get reloaded to heatwave automatically.

  Change-Id: I42e984910da23a0e416edb09d3949989159ef707

  # This is the commit message #3:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST CHANGES)

  This commit adds new functional tests for the MDS HA + HW integration.

  Change-Id: Ic818331a4ca04b16998155efd77ac95da08deaa1

  # This is the commit message #4:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA
  BUG#34776485: RESTRICT DEFAULT VALUE FOR rapid_enable_delayed_secondary_ops

  This commit does two things:
  1. Add a basic test for newly introduced system variable
  rapid_enable_delayed_secondary_ops, which controls the behavior of
  alter table secondary load/unload ddl statements when rapid cluster
  is not available.

  2. It also restricts the DEFAULT value setting for the system variable
  So, following is not allowed:
  SET GLOBAL rapid_enable_delayed_secondary_ops = default
  This variable is to be used in restricted scenarios and control plane
  only sets it to ON/OFF before and after PITR apply. Allowing set to
  default has no practical use.

  Change-Id: I85c84dfaa0f868dbfc7b1a88792a89ffd2e81da2

  # This is the commit message #5:

  Bug#34726490: ADD DIAGNOSTICS FOR SECONDARY LOAD / UNLOAD DDL

  Problem:
  --------
  If secondary load or unload DDL gets rolled back due to some error after
  it had loaded / unloaded the table in heatwave cluster, there is no undo
  of the secondary engine action. Only secondary_load flag update is
  reverted and binlog is not written. From User's perspective, the table
  is loaded and can be seen on performance_schema. There are also no
  error messages printed to notify that the ddl didn't commit. This
  creates a problem to debug any issue in this area.

  Solution:
  ---------
  The partial undo of secondary load/unload ddl will be handled in
  bug#34592922. In this commit, we add diagnostics to reveal if the ddl
  failed to commit, and from what stage.

  Change-Id: I46c04dd5dbc07fc17beb8aa2a8d0b15ddfa171af

  # This is the commit message #6:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST FIX)

  Since ALTER TABLE SECONDARY LOAD / UNLOAD DDL statements now write
  to binlog, from Heatwave's perspective, SCN is bumped up.

  In this commit, we are adjusting expected SCN values in certain
  tests which do secondary load/unload and expect SCN to match.

  Change-Id: I9635b3cd588d01148d763d703c72cf50a0c0bb98

  # This is the commit message mysql#7:

  Adding MTR tests for ML in rapid group_replication suite

  Added MTR tests with Heatwave ML queries with in
  an HA setup.

  Change-Id: I386a3530b5bbe6aea551610b6e739ab1cf366439

  # This is the commit message mysql#8:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (MTR TEST ADJUSTMENT)

  In this commit we have adjusted the existing test to work with the
  new MTR test infrastructure, which extends the functionality to the
  HA landscape. With this change, many manual settings have become
  redundant and are therefore removed in this commit.

  Change-Id: Ie1f4fcfdf047bfe8638feaa9f54313d509cbad7e

  # This is the commit message #9:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (CLANG-TIDY FIX)

  Fix clang-tidy warnings found in previous change#16530, patch#20

  Change-Id: I15d25df135694c2f6a3a9146feebe2b981637662

Change-Id: I3f3223a85bb52343a4619b0c2387856b09438265
pull bot pushed a commit that referenced this pull request Jul 18, 2023
Some minor refactoring of function find_item_in_list() before the fix.

- Code is generally aligned with coding standard.

- Error handling is separated out; the return value is now false for
  success and true for error.

- The found item is now an output argument, and a null pointer means
  the item was not found (along with the other two out arguments).

- The report_error argument is removed since it was always used as
  REPORT_EXCEPT_NOT_FOUND.

- A local variable "find_ident" is introduced, since it better
  represents that we are searching for a column reference than having
  separate field_name, table_name and db_name variables.

Item_field::replace_with_derived_expr_ref()

- Redundant tests were removed.

Function resolve_ref_in_select_and_group() has also been changed so
that success/error is now returned as false/true, and the found item
is an out argument.
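
The refactored calling convention can be sketched as follows. The types
and names here are illustrative stand-ins, not the real server classes:

```cpp
#include <string>
#include <vector>

// Stand-in for a resolved expression in the item list.
struct Item {
  std::string db_name, table_name, field_name;
};

// "find_ident" bundles the parts of the column reference being resolved,
// replacing the separate field_name/table_name/db_name variables.
struct FindIdent {
  std::string db_name, table_name, field_name;
};

// New convention: returns true on error, false on success. The found
// item is an out argument; *found == nullptr means "not found", which
// is not by itself an error.
bool find_item_in_list_sketch(const FindIdent &find_ident,
                              const std::vector<Item> &items,
                              const Item **found) {
  *found = nullptr;
  for (const Item &item : items) {
    if (item.field_name == find_ident.field_name &&
        (find_ident.table_name.empty() ||
         item.table_name == find_ident.table_name)) {
      *found = &item;
      return false;  // success, item found
    }
  }
  return false;  // success, but *found stays nullptr
}
```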

Function Item_field::fix_fields()

- The value of thd->lex->current_query_block() is cached in a local
  variable.

- Since single resolving was introduced, a test for "field" equal to
  nullptr was redundant and could be eliminated, along with the indented
  code block that followed.

- A code block for checking bitmaps if the above test was false could
  also be removed.

Change-Id: I3cd4bd6a23dd07faff773bdf11940bcfd847c903
pull bot pushed a commit that referenced this pull request Jul 18, 2023
Two problems were identified for this bug. The first is seen by looking
at the reduced query:

select subq_1.c1 as c1
from (select subq_0.c0 as c0,
             subq_0.c0 as c1,
             90 as c2,
             subq_0.c1 as c3
      from (select (select v2 from table1) as c0,
                   ref_0.v4 as c1
            from table1 as ref_0
           ) as subq_0
      ) as subq_1
where EXISTS (select subq_1.c0 as c2,
                     case
                     when EXISTS (select (select v0 from table1) as c1
                                          from table1 as ref_8
                                          where EXISTS (select subq_1.c2 as c7
                                                        from table1 as ref_9
                                                       )
                                         )
                     then subq_1.c3
                     end as c5
              from table1 as ref_7);

In the innermost EXISTS predicate, a column subq_1.c2 is looked up.
It is erroneously found as the column subq_1.c0 with alias c2 in the
query block of the outermost EXISTS predicate. But this resolution does
not follow the SQL standard: a column alias must be a simple identifier
and cannot be qualified with a table name, so a table-qualified
reference must never match an alias. By changing item_ref->item_name to
item_ref->field_name in a test in find_item_in_list, we ensure that the
match is against a table (view) name and column name, not an alias.
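
A minimal sketch of the matching fix, using a hypothetical stand-in
type (the real comparison lives inside find_item_in_list):

```cpp
#include <string>

// Hypothetical stand-in for the relevant fields of an Item_ref-like node.
struct ItemRefSketch {
  std::string item_name;   // alias, e.g. "c2" for "subq_1.c0 as c2"
  std::string field_name;  // underlying column name, e.g. "c0"
};

// A table-qualified reference such as subq_1.c2 must be matched against
// the column name, never the alias. The bug compared item_name instead.
bool matches_qualified_reference(const ItemRefSketch &item,
                                 const std::string &column_name) {
  return item.field_name == column_name;
}
```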

But there is also another problem. The EXISTS predicate contains a few
selected columns that are resolved and then immediately deleted since
they are redundant in EXISTS processing. But if these columns are
outer references and defined in a derived table, we may actually
de-reference them before the initial reference increment. Thus, those
columns are removed before they are possibly used later. This happens
to subq_1.c2 which is resolved in the outer-most query block and
coming from a derived table. We prevent this problem by incrementing
the reference count of selected expressions from derived tables earlier,
and we try to prevent this problem from recurring by adding an
"m_abandoned" field to class Item, which is set to true when the
reference count is decremented to zero and prevents the reference count
from ever being incremented again.
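
The "m_abandoned" idea can be illustrated with this minimal sketch,
which is not the real Item class: once the reference count reaches
zero the object is marked abandoned, and later increments are ignored,
so an already de-referenced expression can never be resurrected.

```cpp
// Minimal reference-counting sketch with an abandonment flag.
class RefCountedSketch {
 public:
  void increment() {
    if (m_abandoned) return;  // abandoned objects stay at zero
    ++m_ref_count;
  }
  void decrement() {
    // Dropping to zero permanently abandons the object.
    if (m_ref_count > 0 && --m_ref_count == 0) m_abandoned = true;
  }
  int ref_count() const { return m_ref_count; }
  bool abandoned() const { return m_abandoned; }

 private:
  int m_ref_count = 0;
  bool m_abandoned = false;
};
```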

Change-Id: Idda48ae726a580c1abdc000371b49a753e197bc6