Storage version 65#15702

Merged
Mytherin merged 3 commits into duckdb:v1.2-histrionicus from carlopi:storage_version_65
Jan 16, 2025

@carlopi
Contributor

@carlopi carlopi commented Jan 14, 2025

This adds the possibility for DuckDB to read (and modify) files of a new storage version. Those files are (at this moment) identical to version 64 files (the default since v0.10.1), with one significant difference: they cannot be opened by previous DuckDB versions.

This makes it possible to guard storage features that are incompatible with previous DuckDB versions, such as improved compression methods.

The meaning of the data in src/storage/version_map.json changes from being the exact storage version produced by a given DuckDB version to the maximum storage version produced. Note that this is compatible with the previous interpretation, where the LOWER and UPPER bounds happened to be the same.

Opening files with storage_version=65 with older DuckDB versions will produce error messages that point to https://duckdb.org/docs/internals/storage.html, where this is documented.

The only change visible at the SQL level is that DuckDB AttachedDatabases now have a storage_version tag.
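
A minimal sketch of how the tag might be inspected from SQL. It assumes the tag is exposed through the `tags` column of the `duckdb_databases()` table function, which the PR description does not state explicitly; `file1.db` is a placeholder file name.

```sql
-- Attach a database file (placeholder name), then list the attached
-- databases together with their tags.
ATTACH 'file1.db';

-- Assumption: the storage_version tag shows up in the `tags` column.
SELECT database_name, tags
FROM duckdb_databases()
WHERE NOT internal;
```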

This PR is to be followed by a reworked version of #14981.

@carlopi carlopi added the Needs Documentation and Ready For Review labels Jan 14, 2025
@carlopi carlopi force-pushed the storage_version_65 branch from 4b0b9bc to 7b4ba5f Compare January 14, 2025 12:54
@carlopi carlopi force-pushed the storage_version_65 branch from 7b4ba5f to a0105ca Compare January 14, 2025 13:28
@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 14, 2025 14:02
@carlopi carlopi marked this pull request as ready for review January 14, 2025 14:17
@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 14, 2025 14:47
@carlopi carlopi marked this pull request as ready for review January 14, 2025 14:50
@carlopi carlopi force-pushed the storage_version_65 branch from a0105ca to 598ff3d Compare January 14, 2025 15:54
@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 14, 2025 17:14
@carlopi carlopi marked this pull request as ready for review January 14, 2025 19:12
@Mytherin Mytherin changed the base branch from main to v1.2-histrionicus January 16, 2025 08:12
@Mytherin Mytherin merged commit 9966a56 into duckdb:v1.2-histrionicus Jan 16, 2025
48 of 49 checks passed
@Mytherin
Collaborator

Thanks!

@carlopi carlopi deleted the storage_version_65 branch January 16, 2025 08:15
Mytherin added a commit that referenced this pull request Jan 20, 2025
…orage version when serializing a database (#15794)

Follow-up from #15702
Supersedes/builds on top of #14981

This PR changes the `storage_compatibility_version` from being a setting
set on every session to being written in the database file.

Previously we would set this setting at run-time, and it would be shared
across all database instances:

```sql
ATTACH 'file1.db';
-- write something, to be serialized targeting version v0.10.0
SET storage_compatibility_version = 'v1.0.0';
ATTACH 'file2.db';
-- write something, to be serialized targeting v1.0.0
```

This has a number of issues:

* The storage compatibility version is shared across all attached
databases
* When restarting the system, the `storage_compatibility_version` would
revert to the default setting (currently `v0.10.0`)
* When reading a database, we did not know which storage compatibility
version was used, which could lead to hard-to-understand errors when
reading databases with an older version
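
The session-level nature of the setting can be observed by reading it back. This is a small sketch, assuming `current_setting` and `RESET` work for this setting as they do for other DuckDB settings:

```sql
-- Read the value the current session would apply.
SELECT current_setting('storage_compatibility_version') AS compat_version;

-- Change it for this session only; per the list above, the change
-- is lost when the system restarts.
SET storage_compatibility_version = 'v1.0.0';
SELECT current_setting('storage_compatibility_version') AS compat_version;

-- Return to the default value.
RESET storage_compatibility_version;
```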

### STORAGE_VERSION parameter

This PR reworks this so that the storage version is instead specified on
`ATTACH`. When none is specified:

* The version set in the `storage_compatibility_version` is used when
creating a new database
* The version stored within the database is used when loading an
existing database

As a result, we can target the desired minimum supported DuckDB version
when creating a new database. When opening an existing database, we keep
writing for the same target version (i.e. we never automatically
"upgrade" the file to a newer DuckDB version). The user can *manually*
upgrade a file by opening an older file while targeting a later storage
version.

For example:

```sql
-- use default `storage_compatibility_version`
ATTACH 'new_file.db';
-- explicitly target versions >= v1.2.0
ATTACH 'new_file.db' (STORAGE_VERSION 'v1.2.0');

-- use the storage version stored within the file
ATTACH 'existing_file.db';
-- use storage version v1.2.0 - if the file uses an older storage version, this upgrades the file
ATTACH 'existing_file.db' (STORAGE_VERSION 'v1.2.0');
```

Note that we cannot *downgrade* a file. If we try to open a file that
targets e.g. version v1.2.0 with an explicit storage version of v1.0.0,
we get an error:

```sql
ATTACH 'database_file.db' (STORAGE_VERSION 'v1.2.0');
DETACH database_file;

ATTACH 'database_file.db' (STORAGE_VERSION 'v1.0.0');
-- Error opening "database_file.db": cannot initialize database with storage version 2 - which is lower than what the database itself uses (4). The storage version of an existing database cannot be lowered.
```
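
If a copy readable by older versions is still needed, one possible workaround (not described in this PR) is to create a *new* file with the older target and copy the contents across. The sketch below assumes `COPY FROM DATABASE` can write everything in the source database at the older storage version; `compat_copy.db` is a hypothetical file name.

```sql
-- Source file, already at the newer storage version.
ATTACH 'database_file.db';

-- New, empty file created with the older target (hypothetical name).
ATTACH 'compat_copy.db' AS compat_copy (STORAGE_VERSION 'v1.0.0');

-- Copy schema and data from one attached database to the other.
COPY FROM DATABASE database_file TO compat_copy;

DETACH compat_copy;
```

This leaves the original file untouched, so the "cannot be lowered" rule is not violated.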

### Opening with DuckDB < v1.1.3

When opening a file that targets `v1.2.0` in an older DuckDB version, we
now get a storage incompatibility error:

```bash
duckdb database_file.db
```

```
Error: unable to open database "database_file.db": IO Error: Trying to read a database file with version number 65, but we can only read version 64.
The database file was created with an newer version of DuckDB.

The storage of DuckDB is not yet stable; newer versions of DuckDB cannot read old database files and vice versa.
The storage will be stabilized when version 1.0 releases.

For now, we recommend that you load the database file in a supported version of DuckDB, and use the EXPORT DATABASE command followed by IMPORT DATABASE on the current version of DuckDB.

See the storage page for more information: https://duckdb.org/internals/storage
```

The description in the error is not entirely correct - but the error is
a lot more descriptive than the previous error that would be thrown in
this scenario (which was `INTERNAL Error: Unsupported compression
function type`).

The error message has also been improved in
#15702 already.
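
The `EXPORT DATABASE` / `IMPORT DATABASE` route recommended in the error text could look roughly like this. It is only a sketch: `exported_db` is a hypothetical directory name, and whether the import succeeds depends on the older version supporting every type and feature used in the exported data.

```sql
-- In a DuckDB version that can read the file
-- (started with the database file open):
EXPORT DATABASE 'exported_db';  -- hypothetical target directory

-- In the other DuckDB version, on a fresh database:
IMPORT DATABASE 'exported_db';
```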