feat: add orioledb support for vector buckets#897
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Central YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds explicit locking and pre-condition lookups to vector storage transactions. Introduces exported constant VECTOR_BUCKET_COUNT_LOCK and a new VectorLockResourceType ('bucket' | 'index' | 'global'), and updates VectorMetadataDB.lockResource signatures. createBucket acquires the global bucket-count lock and verifies counts under lock; deleteBucket acquires the per-bucket lock then the global bucket-count lock and re-checks emptiness under locks; getBucket and createVectorIndex acquire per-bucket locks and validate existence before proceeding. A new test suite exercises lifecycle, concurrency, and lock ordering. package.json test:oriole now runs jest without the prior testPathIgnorePatterns filter. Sequence Diagram(s)sequenceDiagram
actor Client
participant VectorStoreManager
participant LockManager as Lock Manager
participant VectorMetadataDB as Metadata DB
participant VectorStore
rect rgba(100,150,200,0.5)
Note over Client,VectorStore: createBucket Flow (with Locking)
end
Client->>VectorStoreManager: createBucket(name)
VectorStoreManager->>VectorStoreManager: Begin transaction
VectorStoreManager->>LockManager: Acquire VECTOR_BUCKET_COUNT_LOCK
LockManager-->>VectorStoreManager: Lock acquired
VectorStoreManager->>VectorMetadataDB: findVectorBucket(name) & count buckets
VectorMetadataDB-->>VectorStoreManager: Bucket not found, count validated
VectorStoreManager->>VectorStore: Create bucket in storage
VectorStore-->>VectorStoreManager: Bucket created
VectorStoreManager->>VectorMetadataDB: Persist bucket metadata
VectorMetadataDB-->>VectorStoreManager: Metadata saved
VectorStoreManager->>LockManager: Release VECTOR_BUCKET_COUNT_LOCK
LockManager-->>VectorStoreManager: Lock released
VectorStoreManager->>VectorStoreManager: Commit transaction
VectorStoreManager-->>Client: Bucket created
sequenceDiagram
actor Client
participant VectorStoreManager
participant LockManager as Lock Manager
participant VectorMetadataDB as Metadata DB
participant VectorStore
rect rgba(150,100,200,0.5)
Note over Client,VectorStore: deleteBucket Flow (with Locking — bucket lock before global)
end
Client->>VectorStoreManager: deleteBucket(name)
VectorStoreManager->>VectorStoreManager: Begin transaction
VectorStoreManager->>LockManager: Acquire bucket-specific lock
LockManager-->>VectorStoreManager: Bucket lock acquired
VectorStoreManager->>LockManager: Acquire VECTOR_BUCKET_COUNT_LOCK
LockManager-->>VectorStoreManager: Global lock acquired
VectorStoreManager->>VectorMetadataDB: findVectorBucket(name) & check empty
VectorMetadataDB-->>VectorStoreManager: Bucket exists and empty
VectorStoreManager->>VectorStore: Delete bucket from storage
VectorStore-->>VectorStoreManager: Bucket deleted
VectorStoreManager->>VectorMetadataDB: Remove bucket metadata
VectorMetadataDB-->>VectorStoreManager: Metadata removed
VectorStoreManager->>LockManager: Release bucket-specific lock
LockManager-->>VectorStoreManager: Bucket lock released
VectorStoreManager->>LockManager: Release VECTOR_BUCKET_COUNT_LOCK
LockManager-->>VectorStoreManager: Global lock released
VectorStoreManager->>VectorStoreManager: Commit transaction
VectorStoreManager-->>Client: Bucket deleted
sequenceDiagram
actor Client
participant VectorStoreManager
participant LockManager as Lock Manager
participant VectorMetadataDB as Metadata DB
participant Sharder
rect rgba(200,150,100,0.5)
Note over Client,Sharder: createVectorIndex Flow (with Validation)
end
Client->>VectorStoreManager: createVectorIndex(bucket, index)
VectorStoreManager->>VectorStoreManager: Begin transaction
VectorStoreManager->>LockManager: Acquire bucket-level lock
LockManager-->>VectorStoreManager: Lock acquired
VectorStoreManager->>VectorMetadataDB: findVectorBucket(bucket) — ensure exists
VectorMetadataDB-->>VectorStoreManager: Bucket confirmed
VectorStoreManager->>VectorMetadataDB: Count existing indexes
VectorMetadataDB-->>VectorStoreManager: Index count retrieved
VectorStoreManager->>Sharder: Reserve shard resources
Sharder-->>VectorStoreManager: Shard reserved
VectorStoreManager->>VectorMetadataDB: Create index metadata
VectorMetadataDB-->>VectorStoreManager: Index created
VectorStoreManager->>LockManager: Release bucket-level lock
LockManager-->>VectorStoreManager: Lock released
VectorStoreManager->>VectorStoreManager: Commit transaction
VectorStoreManager-->>Client: Index created
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
b07b49e to
5171214
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/storage/protocols/vector/vector-store.ts`:
- Around line 65-67: The global VECTOR_BUCKET_COUNT_LOCK is held across long
transaction work (db.withTransaction +
reserve()/vectorStore.createVectorIndex()/confirm()), causing unrelated bucket
ops to stall; change the locking order and scope so the per-bucket lock
(lockResource('bucket', bucketName)) is acquired before or instead of holding
the global count lock for the whole transaction, and confine
VECTOR_BUCKET_COUNT_LOCK to the minimal critical section that only updates the
bucket count (i.e., take/release VECTOR_BUCKET_COUNT_LOCK just around the count
increment/decrement outside the long-running
createVectorIndex()/reserve()/confirm() work or move those long ops into a
separate transaction after releasing the global lock), updating calls in
deleteBucket(), createVectorIndex(), and the db.withTransaction blocks
accordingly to avoid holding the advisory global lock during long operations.
- Around line 45-60: The capacity check (tnx.countBuckets() compared to
this.config.maxBucketCount) runs before attempting
tnx.createVectorBucket(bucketName), causing retries for an already-existing
bucket to throw ERRORS.S3VectorMaxBucketsExceeded; fix by making bucket-creation
idempotent under the lock: attempt tnx.createVectorBucket(bucketName) first
inside the withTransaction, catch ConflictException and return immediately, and
only after a successful creation call tnx.countBuckets() and throw
ERRORS.S3VectorMaxBucketsExceeded if the new count exceeds
this.config.maxBucketCount; use the existing symbols (withTransaction,
lockResource, tnx.createVectorBucket, tnx.countBuckets, ConflictException,
ERRORS.S3VectorMaxBucketsExceeded, this.config.maxBucketCount) to locate and
update the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: cdeb213e-fd3e-4f8c-b638-c4740f721793
📒 Files selected for processing (3)
package.jsonsrc/storage/protocols/vector/vector-store.tssrc/test/vector-store-manager.test.ts
There was a problem hiding this comment.
♻️ Duplicate comments (2)
src/storage/protocols/vector/vector-store.ts (2)
45-60:⚠️ Potential issue | 🔴 CriticalPreserve idempotent
createBucket()once the bucket limit is reached.Because the max-count check at Line 48 runs before the
ConflictExceptionfast-path at Lines 53-57, a retry for an already-created bucket starts failing withS3VectorMaxBucketsExceededas soon as the limit is full. The insert needs to happen first under the lock, then the limit should be enforced only if a new row was actually created.Possible fix
await this.db.withTransaction(async (tnx) => { await tnx.lockResource('global', VECTOR_BUCKET_COUNT_LOCK) - const bucketCount = await tnx.countBuckets() - if (bucketCount >= this.config.maxBucketCount) { - throw ERRORS.S3VectorMaxBucketsExceeded(this.config.maxBucketCount) - } - try { await tnx.createVectorBucket(bucketName) } catch (e) { if (e instanceof ConflictException) { return } throw e } + + const bucketCount = await tnx.countBuckets() + if (bucketCount > this.config.maxBucketCount) { + throw ERRORS.S3VectorMaxBucketsExceeded(this.config.maxBucketCount) + } })As per coding guidelines, "Highlight only issues that could cause runtime errors, data loss, or severe maintainability issues."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/protocols/vector/vector-store.ts` around lines 45 - 60, The transaction currently checks bucketCount via tnx.countBuckets() before attempting tnx.createVectorBucket(bucketName), causing retries to throw ERRORS.S3VectorMaxBucketsExceeded even when the bucket already exists; change the logic inside withTransaction/tnx so you first attempt tnx.createVectorBucket(bucketName) and handle ConflictException (return) as a fast-path, then after a successful create check tnx.countBuckets() and throw ERRORS.S3VectorMaxBucketsExceeded only if the create actually added a new row and the count exceeds this.config.maxBucketCount; keep the lockResource('global', VECTOR_BUCKET_COUNT_LOCK) and preserve the ConflictException handling around createVectorBucket to maintain idempotency.
65-67:⚠️ Potential issue | 🟠 MajorDon't let a busy bucket monopolize the global count lock.
deleteBucket()takesVECTOR_BUCKET_COUNT_LOCKat Line 66 before waiting on the per-bucket lock, whilecreateVectorIndex()keeps that bucket lock until the transaction ends afterreserve(), remotecreateVectorIndex(), andconfirm(). One slow index creation on bucket A can therefore block every unrelatedcreateBucket()/deleteBucket()behind the global lock. The lock scheme needs to avoid waiting on a bucket while already holding the global count lock.As per coding guidelines, "Highlight only issues that could cause runtime errors, data loss, or severe maintainability issues."
Also applies to: 137-185
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/protocols/vector/vector-store.ts` around lines 65 - 67, deleteBucket() currently acquires VECTOR_BUCKET_COUNT_LOCK before waiting on the per-bucket lock which can cause the global lock to be held while waiting for slow bucket operations (and createVectorIndex() holds the bucket lock across reserve()/createVectorIndex()/confirm()), so reorder locking to avoid holding the global count lock while waiting on a bucket: change deleteBucket() (and the similar path in createVectorIndex()/reserve()/confirm()) to acquire the per-bucket lock (lockResource('bucket', bucketName)) first, then perform any bucket-specific checks/operations, and only acquire VECTOR_BUCKET_COUNT_LOCK when you actually need to update the global count (release the bucket lock if necessary before obtaining the global lock and re-acquire to finalize if you must modify both); alternatively use a two-phase approach: lock bucket → check state → release bucket → lock global → re-lock bucket to perform count mutation, ensuring no code path holds both locks while blocking on I/O or long operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/storage/protocols/vector/vector-store.ts`:
- Around line 45-60: The transaction currently checks bucketCount via
tnx.countBuckets() before attempting tnx.createVectorBucket(bucketName), causing
retries to throw ERRORS.S3VectorMaxBucketsExceeded even when the bucket already
exists; change the logic inside withTransaction/tnx so you first attempt
tnx.createVectorBucket(bucketName) and handle ConflictException (return) as a
fast-path, then after a successful create check tnx.countBuckets() and throw
ERRORS.S3VectorMaxBucketsExceeded only if the create actually added a new row
and the count exceeds this.config.maxBucketCount; keep the
lockResource('global', VECTOR_BUCKET_COUNT_LOCK) and preserve the
ConflictException handling around createVectorBucket to maintain idempotency.
- Around line 65-67: deleteBucket() currently acquires VECTOR_BUCKET_COUNT_LOCK
before waiting on the per-bucket lock which can cause the global lock to be held
while waiting for slow bucket operations (and createVectorIndex() holds the
bucket lock across reserve()/createVectorIndex()/confirm()), so reorder locking
to avoid holding the global count lock while waiting on a bucket: change
deleteBucket() (and the similar path in createVectorIndex()/reserve()/confirm())
to acquire the per-bucket lock (lockResource('bucket', bucketName)) first, then
perform any bucket-specific checks/operations, and only acquire
VECTOR_BUCKET_COUNT_LOCK when you actually need to update the global count
(release the bucket lock if necessary before obtaining the global lock and
re-acquire to finalize if you must modify both); alternatively use a two-phase
approach: lock bucket → check state → release bucket → lock global → re-lock
bucket to perform count mutation, ensuring no code path holds both locks while
blocking on I/O or long operations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 1f4a4229-f05c-4c59-98ae-d901c079e529
📒 Files selected for processing (4)
package.jsonsrc/storage/protocols/vector/knex.tssrc/storage/protocols/vector/vector-store.tssrc/test/vector-store-manager.test.ts
5171214 to
2f497ca
Compare
Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
2f497ca to
f2d4a8f
Compare
What kind of change does this PR introduce?
Feat
What is the current behavior?
Vector store uses serializable isolation level, which isn't supported by OrioleDB.
What is the new behavior?
Switch into advisory row locks to drop serializable isolation level requirement and to support OrioleDB.
Additional context
Improve test coverage for the possible races.