
cli,sql: cockroach dump cannot be used to export all data #28948

@knz

Description


tl;dr: cockroach dump is a "best effort" tool that is unable to dump some types of data out of CockroachDB. As a result, only deployments that use particular combinations of SQL features can be exported faithfully via dump.

Context: cockroach dump is advertised as a migration tool, not a disaster recovery tool. Disaster recovery is offered as follows:

  • for core deployments (without a license), by switching all nodes off and taking an offline backup of the data directories;
  • for enterprise deployments, by using online BACKUP/RESTORE;
  • or by using Cockroach Cloud, which handles backups transparently as a service.

Limitations in dump

dump is rather brittle.

Not only is there no guarantee that it can migrate data out of CockroachDB; it can even break between versions without tests catching the breakage.

Details

cockroach dump has an "interesting" design in that it performs dumps
as follows:

  • obtains the SQL type names via information_schema.columns;
  • independently, sends a SELECT query to the table over pgwire;
  • when the data comes back, it instantiates Datums by parsing the
    data received from pgwire; which parser to use is determined
    by a switch on the type name string obtained from
    information_schema.columns;
  • it then uses a datum formatter to write the SQL representation
    of the datums into the dump.
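
This flow can be sketched as follows (a toy parser standing in for dump's logic; the real code in the cockroach CLI covers many more types, but the shape — a switch on a type-name string — is the same):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseDatum mimics dump's approach: pick a parser by switching on the
// type-name string reported by information_schema.columns. Any type name
// the switch doesn't recognize cannot be dumped.
func parseDatum(typeName, pgwireText string) (interface{}, error) {
	switch strings.ToUpper(typeName) {
	case "INT", "INT8", "BIGINT":
		return strconv.ParseInt(pgwireText, 10, 64)
	case "STRING", "TEXT":
		return pgwireText, nil
	case "TIMESTAMP":
		return time.Parse("2006-01-02 15:04:05", pgwireText)
	default:
		// An unrecognized type name makes the dump fail (or, worse,
		// silently emit a wrong representation).
		return nil, fmt.Errorf("unknown type %q", typeName)
	}
}

func main() {
	v, err := parseDatum("INT8", "42")
	fmt.Println(v, err)
	_, err = parseDatum("INTERVAL", "1h")
	fmt.Println(err)
}
```

The fragility is visible in the default case: correctness depends entirely on the switch staying in sync with the set of type names the server can report.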

There are irremediable problems with this approach:

  • the pgwire conversion of the data is lossy. That's
    irremediable: because of this, cockroach dump can never fully
    guarantee that it dumps a table accurately. Really, cockroach dump should
    read the K/V pairs in binary format and decode them losslessly as datums.

  • lib/pq, which is used to obtain the values over pgwire, doesn't
    necessarily know about all the types supported by the CockroachDB
    node on the other side. When it does not know a type, it passes the
    pgwire-encoded bytes through as-is as a Go []byte to the cockroach
    dump reader, which has no logic to decode them any further. There
    are two compounded problems here:

    • cockroach dump should really talk over pgwire directly instead
      of using lib/pq.
    • cockroach dump should use the pgwire package (the
      encode/decode functions therein) to ensure it gets the same data
      that was sent. This may be insufficient though because
      we don't guarantee exact roundtrip, see the point above.
  • there is no guarantee whatsoever that information_schema.columns
    contains enough information to decide on an in-memory value type for
    CockroachDB.

    PRs #28945 (sql: fix the reporting of types in
    information_schema.columns) and #28949 (cli,sql: add another crutch
    to the handling of column types in dump) attempt to make cockroach
    dump more resistant here by providing it with a CockroachDB-specific
    extension to information_schema.columns, called crdb_sql_type.
    However, what cockroach dump should really do is fetch the table
    descriptor protobuf, extract the actual sqlbase.ColumnType from it,
    and then use (*ColumnType).ToDatumType().

Metadata

Labels: A-disaster-recovery, C-bug, S-3-ux-surprise
