Skip to content

feat: add orioledb support for vector buckets#897

Open
ferhatelmas wants to merge 1 commit intomasterfrom
ferhat/oriole-vector
Open

feat: add orioledb support for vector buckets#897
ferhatelmas wants to merge 1 commit intomasterfrom
ferhat/oriole-vector

Conversation

@ferhatelmas
Copy link
Member

What kind of change does this PR introduce?

Feat

What is the current behavior?

Vector store uses serializable isolation level, which isn't supported by OrioleDB.

What is the new behavior?

Switch into advisory row locks to drop serializable isolation level requirement and to support OrioleDB.

Additional context

Improve test coverage for the possible races.

@ferhatelmas ferhatelmas requested a review from a team as a code owner March 6, 2026 15:56
@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2658c0c9-a462-4cf8-815b-a298c7fac0e5

📥 Commits

Reviewing files that changed from the base of the PR and between 2f497ca and f2d4a8f.

📒 Files selected for processing (4)
  • package.json
  • src/storage/protocols/vector/knex.ts
  • src/storage/protocols/vector/vector-store.ts
  • src/test/vector-store-manager.test.ts

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Improved concurrency and consistency for vector storage operations to prevent race conditions during bucket/index lifecycle.
  • Tests

    • Added comprehensive tests covering bucket/index lifecycle, lock ordering, and concurrent create/delete scenarios.
    • Vector-related tests are now included in the standard test run.
  • Chores

    • Updated test script to run previously excluded vector tests.

Walkthrough

Adds explicit locking and pre-condition lookups to vector storage transactions. Introduces exported constant VECTOR_BUCKET_COUNT_LOCK and a new VectorLockResourceType ('bucket' | 'index' | 'global'), and updates VectorMetadataDB.lockResource signatures. createBucket acquires the global bucket-count lock and verifies counts under lock; deleteBucket acquires the per-bucket lock then the global bucket-count lock and re-checks emptiness under locks; getBucket and createVectorIndex acquire per-bucket locks and validate existence before proceeding. A new test suite exercises lifecycle, concurrency, and lock ordering. package.json test:oriole now runs jest without the prior testPathIgnorePatterns filter.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant VectorStoreManager
    participant LockManager as Lock Manager
    participant VectorMetadataDB as Metadata DB
    participant VectorStore

    rect rgba(100,150,200,0.5)
    Note over Client,VectorStore: createBucket Flow (with Locking)
    end

    Client->>VectorStoreManager: createBucket(name)
    VectorStoreManager->>VectorStoreManager: Begin transaction
    VectorStoreManager->>LockManager: Acquire VECTOR_BUCKET_COUNT_LOCK
    LockManager-->>VectorStoreManager: Lock acquired
    VectorStoreManager->>VectorMetadataDB: findVectorBucket(name) & count buckets
    VectorMetadataDB-->>VectorStoreManager: Bucket not found, count validated
    VectorStoreManager->>VectorStore: Create bucket in storage
    VectorStore-->>VectorStoreManager: Bucket created
    VectorStoreManager->>VectorMetadataDB: Persist bucket metadata
    VectorMetadataDB-->>VectorStoreManager: Metadata saved
    VectorStoreManager->>LockManager: Release VECTOR_BUCKET_COUNT_LOCK
    LockManager-->>VectorStoreManager: Lock released
    VectorStoreManager->>VectorStoreManager: Commit transaction
    VectorStoreManager-->>Client: Bucket created
Loading
sequenceDiagram
    actor Client
    participant VectorStoreManager
    participant LockManager as Lock Manager
    participant VectorMetadataDB as Metadata DB
    participant VectorStore

    rect rgba(150,100,200,0.5)
    Note over Client,VectorStore: deleteBucket Flow (with Locking — bucket lock before global)
    end

    Client->>VectorStoreManager: deleteBucket(name)
    VectorStoreManager->>VectorStoreManager: Begin transaction
    VectorStoreManager->>LockManager: Acquire bucket-specific lock
    LockManager-->>VectorStoreManager: Bucket lock acquired
    VectorStoreManager->>LockManager: Acquire VECTOR_BUCKET_COUNT_LOCK
    LockManager-->>VectorStoreManager: Global lock acquired
    VectorStoreManager->>VectorMetadataDB: findVectorBucket(name) & check empty
    VectorMetadataDB-->>VectorStoreManager: Bucket exists and empty
    VectorStoreManager->>VectorStore: Delete bucket from storage
    VectorStore-->>VectorStoreManager: Bucket deleted
    VectorStoreManager->>VectorMetadataDB: Remove bucket metadata
    VectorMetadataDB-->>VectorStoreManager: Metadata removed
    VectorStoreManager->>LockManager: Release bucket-specific lock
    LockManager-->>VectorStoreManager: Bucket lock released
    VectorStoreManager->>LockManager: Release VECTOR_BUCKET_COUNT_LOCK
    LockManager-->>VectorStoreManager: Global lock released
    VectorStoreManager->>VectorStoreManager: Commit transaction
    VectorStoreManager-->>Client: Bucket deleted
Loading
sequenceDiagram
    actor Client
    participant VectorStoreManager
    participant LockManager as Lock Manager
    participant VectorMetadataDB as Metadata DB
    participant Sharder

    rect rgba(200,150,100,0.5)
    Note over Client,Sharder: createVectorIndex Flow (with Validation)
    end

    Client->>VectorStoreManager: createVectorIndex(bucket, index)
    VectorStoreManager->>VectorStoreManager: Begin transaction
    VectorStoreManager->>LockManager: Acquire bucket-level lock
    LockManager-->>VectorStoreManager: Lock acquired
    VectorStoreManager->>VectorMetadataDB: findVectorBucket(bucket) — ensure exists
    VectorMetadataDB-->>VectorStoreManager: Bucket confirmed
    VectorStoreManager->>VectorMetadataDB: Count existing indexes
    VectorMetadataDB-->>VectorStoreManager: Index count retrieved
    VectorStoreManager->>Sharder: Reserve shard resources
    Sharder-->>VectorStoreManager: Shard reserved
    VectorStoreManager->>VectorMetadataDB: Create index metadata
    VectorMetadataDB-->>VectorStoreManager: Index created
    VectorStoreManager->>LockManager: Release bucket-level lock
    LockManager-->>VectorStoreManager: Lock released
    VectorStoreManager->>VectorStoreManager: Commit transaction
    VectorStoreManager-->>Client: Index created
Loading

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ferhatelmas ferhatelmas force-pushed the ferhat/oriole-vector branch 2 times, most recently from b07b49e to 5171214 Compare March 6, 2026 16:06
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/storage/protocols/vector/vector-store.ts`:
- Around line 65-67: The global VECTOR_BUCKET_COUNT_LOCK is held across long
transaction work (db.withTransaction +
reserve()/vectorStore.createVectorIndex()/confirm()), causing unrelated bucket
ops to stall; change the locking order and scope so the per-bucket lock
(lockResource('bucket', bucketName)) is acquired before or instead of holding
the global count lock for the whole transaction, and confine
VECTOR_BUCKET_COUNT_LOCK to the minimal critical section that only updates the
bucket count (i.e., take/release VECTOR_BUCKET_COUNT_LOCK just around the count
increment/decrement outside the long-running
createVectorIndex()/reserve()/confirm() work or move those long ops into a
separate transaction after releasing the global lock), updating calls in
deleteBucket(), createVectorIndex(), and the db.withTransaction blocks
accordingly to avoid holding the advisory global lock during long operations.
- Around line 45-60: The capacity check (tnx.countBuckets() compared to
this.config.maxBucketCount) runs before attempting
tnx.createVectorBucket(bucketName), causing retries for an already-existing
bucket to throw ERRORS.S3VectorMaxBucketsExceeded; fix by making bucket-creation
idempotent under the lock: attempt tnx.createVectorBucket(bucketName) first
inside the withTransaction, catch ConflictException and return immediately, and
only after a successful creation call tnx.countBuckets() and throw
ERRORS.S3VectorMaxBucketsExceeded if the new count exceeds
this.config.maxBucketCount; use the existing symbols (withTransaction,
lockResource, tnx.createVectorBucket, tnx.countBuckets, ConflictException,
ERRORS.S3VectorMaxBucketsExceeded, this.config.maxBucketCount) to locate and
update the logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: cdeb213e-fd3e-4f8c-b638-c4740f721793

📥 Commits

Reviewing files that changed from the base of the PR and between bd1edfd and 3342270.

📒 Files selected for processing (3)
  • package.json
  • src/storage/protocols/vector/vector-store.ts
  • src/test/vector-store-manager.test.ts

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/storage/protocols/vector/vector-store.ts (2)

45-60: ⚠️ Potential issue | 🔴 Critical

Preserve idempotent createBucket() once the bucket limit is reached.

Because the max-count check at Line 48 runs before the ConflictException fast-path at Lines 53-57, a retry for an already-created bucket starts failing with S3VectorMaxBucketsExceeded as soon as the limit is full. The insert needs to happen first under the lock, then the limit should be enforced only if a new row was actually created.

Possible fix
     await this.db.withTransaction(async (tnx) => {
       await tnx.lockResource('global', VECTOR_BUCKET_COUNT_LOCK)
 
-      const bucketCount = await tnx.countBuckets()
-      if (bucketCount >= this.config.maxBucketCount) {
-        throw ERRORS.S3VectorMaxBucketsExceeded(this.config.maxBucketCount)
-      }
-
       try {
         await tnx.createVectorBucket(bucketName)
       } catch (e) {
         if (e instanceof ConflictException) {
           return
         }
         throw e
       }
+
+      const bucketCount = await tnx.countBuckets()
+      if (bucketCount > this.config.maxBucketCount) {
+        throw ERRORS.S3VectorMaxBucketsExceeded(this.config.maxBucketCount)
+      }
     })

As per coding guidelines, "Highlight only issues that could cause runtime errors, data loss, or severe maintainability issues."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/protocols/vector/vector-store.ts` around lines 45 - 60, The
transaction currently checks bucketCount via tnx.countBuckets() before
attempting tnx.createVectorBucket(bucketName), causing retries to throw
ERRORS.S3VectorMaxBucketsExceeded even when the bucket already exists; change
the logic inside withTransaction/tnx so you first attempt
tnx.createVectorBucket(bucketName) and handle ConflictException (return) as a
fast-path, then after a successful create check tnx.countBuckets() and throw
ERRORS.S3VectorMaxBucketsExceeded only if the create actually added a new row
and the count exceeds this.config.maxBucketCount; keep the
lockResource('global', VECTOR_BUCKET_COUNT_LOCK) and preserve the
ConflictException handling around createVectorBucket to maintain idempotency.

65-67: ⚠️ Potential issue | 🟠 Major

Don't let a busy bucket monopolize the global count lock.

deleteBucket() takes VECTOR_BUCKET_COUNT_LOCK at Line 66 before waiting on the per-bucket lock, while createVectorIndex() keeps that bucket lock until the transaction ends after reserve(), remote createVectorIndex(), and confirm(). One slow index creation on bucket A can therefore block every unrelated createBucket()/deleteBucket() behind the global lock. The lock scheme needs to avoid waiting on a bucket while already holding the global count lock.

As per coding guidelines, "Highlight only issues that could cause runtime errors, data loss, or severe maintainability issues."

Also applies to: 137-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/protocols/vector/vector-store.ts` around lines 65 - 67,
deleteBucket() currently acquires VECTOR_BUCKET_COUNT_LOCK before waiting on the
per-bucket lock which can cause the global lock to be held while waiting for
slow bucket operations (and createVectorIndex() holds the bucket lock across
reserve()/createVectorIndex()/confirm()), so reorder locking to avoid holding
the global count lock while waiting on a bucket: change deleteBucket() (and the
similar path in createVectorIndex()/reserve()/confirm()) to acquire the
per-bucket lock (lockResource('bucket', bucketName)) first, then perform any
bucket-specific checks/operations, and only acquire VECTOR_BUCKET_COUNT_LOCK
when you actually need to update the global count (release the bucket lock if
necessary before obtaining the global lock and re-acquire to finalize if you
must modify both); alternatively use a two-phase approach: lock bucket → check
state → release bucket → lock global → re-lock bucket to perform count mutation,
ensuring no code path holds both locks while blocking on I/O or long operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/storage/protocols/vector/vector-store.ts`:
- Around line 45-60: The transaction currently checks bucketCount via
tnx.countBuckets() before attempting tnx.createVectorBucket(bucketName), causing
retries to throw ERRORS.S3VectorMaxBucketsExceeded even when the bucket already
exists; change the logic inside withTransaction/tnx so you first attempt
tnx.createVectorBucket(bucketName) and handle ConflictException (return) as a
fast-path, then after a successful create check tnx.countBuckets() and throw
ERRORS.S3VectorMaxBucketsExceeded only if the create actually added a new row
and the count exceeds this.config.maxBucketCount; keep the
lockResource('global', VECTOR_BUCKET_COUNT_LOCK) and preserve the
ConflictException handling around createVectorBucket to maintain idempotency.
- Around line 65-67: deleteBucket() currently acquires VECTOR_BUCKET_COUNT_LOCK
before waiting on the per-bucket lock which can cause the global lock to be held
while waiting for slow bucket operations (and createVectorIndex() holds the
bucket lock across reserve()/createVectorIndex()/confirm()), so reorder locking
to avoid holding the global count lock while waiting on a bucket: change
deleteBucket() (and the similar path in createVectorIndex()/reserve()/confirm())
to acquire the per-bucket lock (lockResource('bucket', bucketName)) first, then
perform any bucket-specific checks/operations, and only acquire
VECTOR_BUCKET_COUNT_LOCK when you actually need to update the global count
(release the bucket lock if necessary before obtaining the global lock and
re-acquire to finalize if you must modify both); alternatively use a two-phase
approach: lock bucket → check state → release bucket → lock global → re-lock
bucket to perform count mutation, ensuring no code path holds both locks while
blocking on I/O or long operations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 1f4a4229-f05c-4c59-98ae-d901c079e529

📥 Commits

Reviewing files that changed from the base of the PR and between 3342270 and 5171214.

📒 Files selected for processing (4)
  • package.json
  • src/storage/protocols/vector/knex.ts
  • src/storage/protocols/vector/vector-store.ts
  • src/test/vector-store-manager.test.ts

@ferhatelmas ferhatelmas force-pushed the ferhat/oriole-vector branch from 5171214 to 2f497ca Compare March 6, 2026 16:20
Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
@ferhatelmas ferhatelmas force-pushed the ferhat/oriole-vector branch from 2f497ca to f2d4a8f Compare March 6, 2026 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant