VReplication support for non-PRIMARY KEY (Online DDL context) #8364
shlomi-noach merged 58 commits into vitessio:main
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
While this is still WIP, I'm happy to get eyes on the proposed solution. For your convenience, this is the outline:
proto
Online DDL
Streamer
VReplicator / table plan builder
Notes
Question: as mentioned above:
The module that populates these two values is … This way, whoever owns the decision to use a specific unique key can already dictate the columns; the vreplication components don't need to read …
Extra information: I believe I will anyway want to add the names of the columns in the source key (possibly renamed if there's a …
The more I think of it, the more I believe we should just write the names of the columns rather than the name of the index. Thoughts?
EDIT: this change is implemented in #8420
…target_unique_key_columns, values populated by vrepl and no need for information_schema.columns queries on source and target
…e strings instead of string arrays. The value of source_unique_key_columns is a comma-delimited (and escaped) list of column names; similarly for target_unique_key_columns
Merged #8420 into this PR. From #8420's main comment:
Description
A take/twist on #8364, per comment. Mostly exactly the same as #8364, except:
rohit-nayak-ps left a comment
lgtm.
Very nice. I like the fact there are minimal changes to the core workflow due to this!
Just observing here that this feature is only implemented for Online DDL, though internally vreplication supports it via this PR. If we want to extend this to vreplication workflows (movetables/reshard/...) we will need to generate the filter hints that Online DDL generates while creating those workflows.
Correct. There has to be some logic above VReplication to set those fields.
Description
Today, VReplication assumes that both the source and the target table have a PRIMARY KEY, and that it is the same PRIMARY KEY (same columns, same order of columns).
Online DDL requires that this restriction is lifted. A user should be allowed to change a PRIMARY KEY. Online DDL-wise, the requirement is that both source and target tables share at least one non-nullable UNIQUE KEY. A valid example:
with the following change:
In the above scenario both the before and after tables have a non-nullable unique key on the uuid column. In the source table, it is PRIMARY, and in the target table, it is uuid_idx. The fact that the name of the index is different is merely a technicality. VReplication should be able to iterate tables by uuid order on both source and target.
Today, the assumption that we only ever use a PRIMARY KEY goes very deep. The reload() function in vreplication pre-evaluates all tables in a schema according to their PK, irrespective of any VReplication stream. Under today's assumption, when a new VReplication stream kicks in, it just knows which table columns comprise the PRIMARY KEY, because everything is already evaluated beforehand.
However, we now want to choose a unique key per operation. Maybe the source and target share the exact same PRIMARY KEY, as is normally the case, and there's nothing to do. But maybe they don't, and there are multiple options. So we need to evaluate which unique keys are shared between the two tables, pick the "best" shared key, and iterate by that key. But we will only know this once an Online DDL is requested!
Again, the assumption that we iterate on PRIMARY KEY is deeply rooted. I want to avoid rewriting huge sections of code. Instead, I propose an "override". We keep such terms as lastPK or pkColumns, but we understand that they may refer to non-PK unique keys. We introduce unique key hints in VReplication's filter/rule. Online DDL evaluates the shared unique key and writes down its name into the rule. VReplication overrides pkColumns by reading the unique key columns in key order (i.e. how columns are ordered within the unique key, not how they are ordered in the table). Most other information remains intact.
Importing gh-ost suite tests to validate correctness, and adding more tests.
This is work in progress.
Related Issue(s)
#8179