Skip to content

migrations: Support AWS DMS as a source #84505

@ajstorm

Description

@ajstorm

#34766 provided support for CRDB to impersonate PG and act as a target for migrations. There is still work remaining if we ever want to support DMS with CRDB as a source. A working prototype is available in #93404.

Initial Load

To get the initial load to work, we need the following:

Replication

  • Implement the postgres streaming replication slot protocol, including the pg_catalog tables and associated builtins to create a replication slot. In the prototype, we used a global buffer which is populated by CDC using a newly added replication:// source URI to do so using the "normal connection protocol", which is wrong as the replication slot protocol has its own parser.
    • involves aligning on an adequate WAL mapping and some sort of ordering in order to achieve this. see (cdc: support Debezium (via Postgres WAL) #68333). Similar to "sinkless changefeeds", but not quite the same.
    • we only need to implement the test_decoding plugin, but others may work too with DMS. Note this means setting PluginName as an additional parameter on the source endpoint in DMS.
  • Implement database level CDC (changefeedccl: add targets for changefeeds beyond tables #73435)
    • We may be able to get away with a cheat here too. In the prototype we set one up for all tables manually. There is some work required to productize this, namely:
      • Ensuring that as new tables get created we CFs for them
      • Gracefully handle table renames
  • Implement the plpgsql to create event triggers - this is required for DMS to recognise and apply schema changes during replication.
    • Alternatively, add a session variable that fakes the existence of the event trigger and function as we did in the prototype (or instruct the user to create these functions / triggers with nothing in them). The caveat: schema changes don't get properly propagated on the target DB.
  • Correctly populate relreplident in pg_catalog.pg_class.

Requisite plpgsql and event trigger

CREATE OR REPLACE FUNCTION objects_schema.awsdms_intercept_ddl()
  RETURNS event_trigger
LANGUAGE plpgsql
SECURITY DEFINER
  AS $$
  declare _qry text;
BEGIN
  if (tg_tag='CREATE TABLE' or tg_tag='ALTER TABLE' or tg_tag='DROP TABLE') then
         SELECT current_query() into _qry;
         insert into objects_schema.awsdms_ddl_audit
         values
         (
         default,current_timestamp,current_user,cast(TXID_CURRENT()as varchar(16)),tg_tag,0,'',current_schema,_qry
         );
         delete from objects_schema.awsdms_ddl_audit;
end if;
END;
$$$

CREATE EVENT TRIGGER awsdms_intercept_ddl ON ddl_command_end 
EXECUTE PROCEDURE objects_schema.awsdms_intercept_ddl();

Jira issue: CRDB-17695

Epic CC-8911

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-migrationsMigrating to CRDB from another database vendorA-tools-aws-dmsBlocking support for AWS Database Migration ServiceC-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)T-migrations

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions