feat(core): make partitions attached via soft link read-only, protected from upserts #2710

Merged
bluestreak01 merged 222 commits into master from ma/access-permission-on-partitions
Jan 18, 2023

marregui (Contributor) commented Nov 2, 2022

TLDR:

  • partition table's size field now carries a mask
  • one bit of the mask makes the partition read-only
  • the bit is set when a partition is attached by soft linking it
  • inserts on read-only partitions are ignored and logged (to offer continuous service on ILP)
  • updates on read-only partitions result in an error
  • directories and soft links to directories work the same way across the codebase; any folder can in reality be a soft link to anywhere in the file system, including other volumes
  • revisited Path class to improve usability

The partition table segment is a block within the _txn file that contains 4 longs per attached partition:

  1. partition timestamp
  2. partition size
  3. partition txn (transaction number/version)
  4. partition column version (data version)

Each group of 4 longs is addressed by partitionIndex. These values are updated on commit. When a partition is detached or dropped, its 4 longs are removed from the partition table segment.
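
As a rough illustration of the layout above, here is a minimal sketch that models the segment as a plain long[] with 4 slots per partition. The class name, the slot constants, and the choice to let partitionIndex count partitions (0, 1, 2, ...) are all assumptions made for this example; the real segment lives in the _txn file and the actual code may address it differently.

```java
// Hypothetical in-memory model of the partition table segment.
// Not QuestDB code: names and indexing convention are assumed for illustration.
final class PartitionTableSketch {
    static final int LONGS_PER_PARTITION = 4;

    // Slot order follows the numbered list in the description.
    static final int TIMESTAMP_SLOT = 0;
    static final int SIZE_SLOT = 1;
    static final int TXN_SLOT = 2;
    static final int COLUMN_VERSION_SLOT = 3;

    // Index of one field of one partition inside the flat array.
    static int slot(int partitionIndex, int field) {
        return partitionIndex * LONGS_PER_PARTITION + field;
    }

    // Detach/drop: return a copy of the table without the partition's 4 longs.
    static long[] remove(long[] table, int partitionIndex) {
        int from = partitionIndex * LONGS_PER_PARTITION;
        int tail = table.length - from - LONGS_PER_PARTITION;
        long[] out = new long[table.length - LONGS_PER_PARTITION];
        System.arraycopy(table, 0, out, 0, from);
        System.arraycopy(table, from + LONGS_PER_PARTITION, out, from, tail);
        return out;
    }

    public static void main(String[] args) {
        long[] table = {
                100L, 10L, 1L, 0L, // partition 0: timestamp, size, txn, column version
                200L, 20L, 2L, 0L  // partition 1
        };
        System.out.println("size of partition 1: " + table[slot(1, SIZE_SLOT)]);
        long[] after = remove(table, 0);
        System.out.println("first timestamp after removal: " + after[slot(0, TIMESTAMP_SLOT)]);
    }
}
```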

The highest value for a partition's size is 0xFFFFFFFFFFFL (2^44 - 1, about 17.6 trillion rows), which fits in the lowest 44 bits and leaves the highest 20 bits (0xFFFFF) of the 64-bit long available to use as a mask. This PR changes the size long in the partition segment to be structured in this way:

reserved   read-only   available bits   partition size
1 bit      1 bit       18 bits          44 bits

When the read-only bit is set, the partition is read-only.

To check a partition's size we obtain maskedSize from the partition table (offset + PARTITION_MASKED_SIZE_OFFSET) and apply a bitwise AND (&) with PARTITION_SIZE_MASK.

To update a partition's size we obtain maskedSize from the partition table and update it like this:

maskedSize = (maskedSize & PARTITION_MASK_MASK) | (partitionSize & PARTITION_SIZE_MASK)

where PARTITION_MASK_MASK is 0x7FFFF00000000000L.
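
The read and update rules above can be sketched as two small helpers. PARTITION_MASK_MASK is the value quoted in the description; the value of PARTITION_SIZE_MASK is an assumption here (the 44-bit maximum size mentioned earlier), and the class and method names are made up for the example.

```java
// Sketch of reading and updating the size inside the masked long.
// Class and method names are hypothetical.
final class MaskedSizeSketch {
    // Assumed value: low 44 bits, matching the maximum partition size 0xFFFFFFFFFFFL.
    static final long PARTITION_SIZE_MASK = 0x00000_FFF_FFFF_FFFFL;
    // From the description: bits 44..62 (everything but the size and the top bit).
    static final long PARTITION_MASK_MASK = 0x7FFFF_000_0000_0000L;

    // Read: keep only the size bits.
    static long sizeOf(long maskedSize) {
        return maskedSize & PARTITION_SIZE_MASK;
    }

    // Update: keep the existing mask bits, replace the size bits.
    static long withSize(long maskedSize, long partitionSize) {
        return (maskedSize & PARTITION_MASK_MASK) | (partitionSize & PARTITION_SIZE_MASK);
    }

    public static void main(String[] args) {
        long flag = 1L << 62;            // some bit in the mask area
        long maskedSize = flag | 10L;    // flag set, size 10
        long updated = withSize(maskedSize, 25L);
        System.out.println(sizeOf(updated));        // size after the update
        System.out.println((updated & flag) != 0);  // the flag survives the update
    }
}
```

Note that, as written in the description, the update drops the top (reserved) bit because PARTITION_MASK_MASK does not include it.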

To check that a partition is read-only we obtain maskedSize and test bit 62 (counting from 0 at the least significant bit), which sits just below the reserved top bit.

The read-only flag is set by the attach partition operation, for partitions attached via soft link. There are no other code paths that set it (although in testing I have hacked a way).
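
A minimal sketch of testing and setting the read-only flag follows. It assumes bit positions are counted from 0 at the least significant bit, so that the read-only bit is bit 62, directly below the reserved top bit; the class and method names are hypothetical.

```java
// Sketch of the read-only flag inside the masked size long.
// Names are hypothetical; bit 62 (zero-based) is assumed from the layout table.
final class ReadOnlyBitSketch {
    static final int READ_ONLY_BIT = 62;
    static final long READ_ONLY_FLAG = 1L << READ_ONLY_BIT;

    static boolean isReadOnly(long maskedSize) {
        return (maskedSize & READ_ONLY_FLAG) != 0;
    }

    // What an attach-by-soft-link code path would do to the stored long.
    static long markReadOnly(long maskedSize) {
        return maskedSize | READ_ONLY_FLAG;
    }

    public static void main(String[] args) {
        long maskedSize = 1_000L; // plain size, no flags
        System.out.println(isReadOnly(maskedSize));
        System.out.println(isReadOnly(markReadOnly(maskedSize)));
    }
}
```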

When read-only is set, these operations FAIL (ReadOnlyViolationException "skipping RW operation on RO partition"):

  • any update

When read-only is set, these operations are NO-OP:

  • any ingests that occur in TableWriter
  • any purge operation such as delete columns or partitions (ColumnPurgeOperator and O3PartitionPurgeJob)

These operations will SUCCEED:

  • detach partition (if via soft link, just unlink and no creation of further .detached partition)
  • drop partition (if via soft link, just unlink)
  • rename column
  • add column
  • drop column (leaves column files, does not remove them)
  • add index
  • drop index
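
The three groups above can be summarised as a small lookup. This is only a restatement of the lists in code form, with hypothetical enum and method names; it is not how the checks are wired into TableWriter or the purge jobs.

```java
// Restates the behaviour lists for a read-only partition as a lookup.
// All names are hypothetical and exist only for this illustration.
final class ReadOnlyPolicySketch {
    enum Op {
        UPDATE, INSERT, PURGE,
        DETACH_PARTITION, DROP_PARTITION,
        RENAME_COLUMN, ADD_COLUMN, DROP_COLUMN,
        ADD_INDEX, DROP_INDEX
    }

    enum Outcome { FAIL, NO_OP, SUCCEED }

    static Outcome onReadOnlyPartition(Op op) {
        switch (op) {
            case UPDATE:
                return Outcome.FAIL;    // ReadOnlyViolationException in the PR
            case INSERT:
            case PURGE:
                return Outcome.NO_OP;   // ignored (inserts are also logged)
            default:
                return Outcome.SUCCEED; // partition and column DDL
        }
    }

    public static void main(String[] args) {
        for (Op op : Op.values()) {
            System.out.println(op + " -> " + onReadOnlyPartition(op));
        }
    }
}
```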

Class Path made me constantly make mistakes, all due to the requirement that all strings need to be null-terminated. Behind the scenes we would add a null char to the buffer and move the write pointer forward (a call to .$()). To later use the path again you would have to .chop() it first, to move the write pointer one position back. Now we do not move the pointer when we add the null char, and .chop() no longer exists. The path is ready for .concat() right after a call to .$() (which btw is now idempotent): the write pointer is pointing at the null, so it will be overwritten.
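
To make the new terminator behaviour concrete, here is a toy buffer that follows the rule described above: $() writes the null at the write pointer without advancing it, so repeated calls are harmless and the next concat() overwrites the null. PathSketch is a hypothetical stand-in, not the real io.questdb.std.str.Path; it uses a fixed heap buffer with no bounds checks, and its concat() appends the text as given rather than handling separators.

```java
import java.nio.charset.StandardCharsets;

// Toy model of the revised terminator semantics. Not the real Path class.
final class PathSketch {
    private final byte[] buf = new byte[256]; // fixed size, no bounds checks in this sketch
    private int len;                          // write pointer: bytes of path, terminator excluded

    PathSketch of(String s) {
        len = 0;
        return concat(s);
    }

    PathSketch concat(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        // Starts writing at buf[len], overwriting a terminator if one is there.
        System.arraycopy(bytes, 0, buf, len, bytes.length);
        len += bytes.length;
        return this;
    }

    // Null-terminate without moving the write pointer; calling it twice changes nothing.
    PathSketch $() {
        buf[len] = 0;
        return this;
    }

    int length() {
        return len;
    }

    byte byteAt(int index) {
        return buf[index];
    }

    String asString() {
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        PathSketch p = new PathSketch();
        p.of("/db/table").$().$();   // terminated twice, length unchanged
        p.concat("/2023-01").$();    // no chop() needed before reuse
        System.out.println(p.asString() + " (" + p.length() + " bytes)");
    }
}
```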

marregui added the New feature (Feature requests) label Nov 2, 2022
marregui self-assigned this Nov 2, 2022
marregui changed the title from "double partition table size from 4 longs to 8, per partition, the 5th is a mask with RO flag" to "feat(core): double partition table size from 4 longs to 8, per partition, the 5th is a mask with RO flag" Nov 2, 2022
marregui force-pushed the ma/access-permission-on-partitions branch from ab2c2c6 to ec0d08c, November 7, 2022 11:50
marregui (Contributor, Author) commented

I have removed the in volume feature from this PR, and will create a follow-up PR when this one is merged.
I miscalculated how much work was involved, apologies.

marregui requested a review from ideoma January 16, 2023 14:38
marregui changed the title from "feat(core): adapt partition segment size to carry both size and partition mask, and introduce partition read-only bit" to "feat(core): make partitions attached via soft link read-only, protected from upserts" Jan 17, 2023
marregui requested a review from ideoma January 17, 2023 15:37
ideoma (Collaborator) commented Jan 17, 2023

[PR Coverage check]

😍 pass : 480 / 538 (89.22%)

file detail

path   covered lines   new lines   coverage
🔵 io/questdb/cairo/ColumnPurgeOperator.java 4 10 40.00%
🔵 io/questdb/Bootstrap.java 1 2 50.00%
🔵 io/questdb/cairo/TableUtils.java 2 3 66.67%
🔵 io/questdb/griffin/SqlCompiler.java 3 4 75.00%
🔵 io/questdb/std/Files.java 19 25 76.00%
🔵 io/questdb/cairo/TableWriter.java 101 127 79.53%
🔵 io/questdb/cairo/CairoEngine.java 4 5 80.00%
🔵 io/questdb/griffin/engine/functions/catalogue/PgAttrDefFunctionFactory.java 4 5 80.00%
🔵 io/questdb/cutlass/Services.java 45 54 83.33%
🔵 io/questdb/std/FilesFacadeImpl.java 14 16 87.50%
🔵 io/questdb/cairo/mig/EngineMigration.java 28 32 87.50%
🔵 io/questdb/griffin/engine/functions/catalogue/TableListFunctionFactory.java 1 1 100.00%
🔵 io/questdb/std/str/Path.java 72 72 100.00%
🔵 io/questdb/cairo/TxWriter.java 34 34 100.00%
🔵 io/questdb/cairo/TableReader.java 50 50 100.00%
🔵 io/questdb/cairo/VacuumColumnVersions.java 2 2 100.00%
🔵 io/questdb/PropServerConfiguration.java 2 2 100.00%
🔵 io/questdb/griffin/engine/table/FwdTableReaderPageFrameCursor.java 5 5 100.00%
🔵 io/questdb/network/Net.java 1 1 100.00%
🔵 io/questdb/cairo/mig/Mig620.java 1 1 100.00%
🔵 io/questdb/griffin/PurgingOperator.java 8 8 100.00%
🔵 io/questdb/cairo/TxReader.java 46 46 100.00%
🔵 io/questdb/cairo/DefaultCairoConfiguration.java 1 1 100.00%
🔵 io/questdb/griffin/DatabaseSnapshotAgent.java 2 2 100.00%
🔵 io/questdb/cairo/O3PartitionJob.java 3 3 100.00%
🔵 io/questdb/griffin/engine/functions/catalogue/PgAttributeFunctionFactory.java 1 1 100.00%
🔵 io/questdb/cairo/mig/Mig607.java 1 1 100.00%
🔵 io/questdb/cairo/O3PartitionPurgeJob.java 8 8 100.00%
🔵 io/questdb/cutlass/text/CsvFileIndexer.java 1 1 100.00%
🔵 io/questdb/griffin/UpdateOperatorImpl.java 5 5 100.00%
🔵 io/questdb/cairo/wal/ApplyWal2TableJob.java 3 3 100.00%
🔵 io/questdb/network/IODispatcherConfiguration.java 1 1 100.00%
🔵 io/questdb/cairo/pool/AbstractMultiTenantPool.java 1 1 100.00%
🔵 io/questdb/std/Os.java 1 1 100.00%
🔵 io/questdb/cairo/FullFwdDataFrameCursor.java 1 1 100.00%
🔵 io/questdb/cutlass/http/DefaultHttpServerConfiguration.java 1 1 100.00%
🔵 io/questdb/cairo/TableNameRegistryFileStore.java 3 3 100.00%

marregui (Contributor, Author) commented

approval!!!! oh!!!!!

bluestreak01 merged commit 0dc22eb into master Jan 18, 2023
bluestreak01 deleted the ma/access-permission-on-partitions branch January 18, 2023 13:33
Labels

New feature Feature requests

10 participants