Skip to content

fix: optimize bulk secret upsert to prevent 502/500 timeouts on large batches#5537

Merged
varonix0 merged 2 commits intomainfrom
devin/1771941909-fix-bulk-upsert-timeout
Feb 25, 2026
Merged

fix: optimize bulk secret upsert to prevent 502/500 timeouts on large batches#5537
varonix0 merged 2 commits intomainfrom
devin/1771941909-fix-bulk-upsert-timeout

Conversation

@devin-ai-integration
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot commented Feb 24, 2026

Context

Customers report 502 errors when bulk upserting ~740 secrets via PATCH /api/v3/secrets/batch/raw with mode: "upsert", and 500 errors even when chunking down to 50 secrets. This blocks migrations.

Root cause: Two compounding inefficiencies in the bulk secret write path:

  1. Massive OR queries: updateManySecret, createManySecret, and deleteManySecret constructed a $complex OR clause with N individual (key = ? AND type = ?) conditions. Since every condition uses type = 'shared', this is equivalent to a single WHERE type = 'shared' AND key IN (...), which is dramatically more efficient for the query planner.

  2. Unbatched writes: bulkUpdate fired all N individual UPDATE queries via Promise.all simultaneously, and insertMany for secrets/versions sent all rows in a single query—both problematic at scale.

Changes:

  • Replace all $complex OR queries with $in + direct type filter (4 call sites in secret-v2-bridge-service.ts)
  • Batch bulkUpdate in the DAL to process 500 updates per iteration (secret-v2-bridge-dal.ts)
  • Batch insertMany calls for secrets and secret versions in fnSecretBulkInsert and fnSecretBulkUpdate (secret-v2-bridge-fns.ts)

Important notes for reviewers

  • The $in pattern ([${TableName.SecretV2}.key as "key"]) is already used elsewhere in the same files (e.g., fnSecretBulkUpdate final find call), so it's a proven pattern.
  • BULK_BATCH_SIZE = 500 is extracted as a shared constant in secret-v2-bridge-fns.ts (used in 3 places). The DAL still has its own local BATCH_SIZE = 500—could be unified if preferred.
  • Within each batch of 500, bulkUpdate still uses Promise.all (i.e., 500 concurrent UPDATE queries per batch). This is much better than 740 at once but still parallel within the batch—worth confirming this is acceptable for your DB connection pool.
  • Not tested with actual large batch at runtime (CI passes including lint, TS, BDD tests, and integration tests). Recommend testing with 500+ secrets in staging before deploying.

Human review checklist

  • Confirm the $in queries behave identically to the old $complex OR queries for all edge cases (e.g., empty arrays, duplicate keys)
  • Verify the $in pattern with the computed column name [${TableName.SecretV2}.key as "key"] produces correct SQL in all four call sites
  • Check that batched insertMany results maintain correct index alignment with inputSecrets (used for tag mapping and version metadata)
  • Validate that 500 concurrent UPDATEs per batch in bulkUpdate is safe for your production DB connection pool

Steps to verify the change

  1. Test bulk upsert with 740 secrets (the original failing case)
  2. Test bulk create with 500+ secrets
  3. Test bulk delete with 500+ secrets
  4. Verify no regressions on smaller batches (1-50 secrets)
  5. Check query performance in DB logs—should see IN clauses instead of massive OR chains

Updates since last revision

  • Added explicit type annotations: newSecrets: TSecretsV2[] and secretVersions: TSecretVersionsV2[] for all batched arrays (addresses Greptile review feedback)
  • Extracted BULK_BATCH_SIZE = 500 as a shared constant in secret-v2-bridge-fns.ts instead of hardcoding per loop
  • Fixed import sort order (TSecretsV2 before TSecretVersionsV2) to satisfy simple-import-sort/imports
  • Removed unnecessary { tx: undefined } parameter from find() call in createManySecret (addresses Greptile review feedback)

Type

  • Fix
  • Feature
  • Improvement
  • Breaking
  • Docs
  • Chore

Checklist

  • Title follows the conventional commit format: type(scope): short description (scope is optional, e.g., fix: prevent crash on sync or fix(api): handle null response).
  • Tested locally
  • Updated docs (if needed)
  • Read the contributing guide

Link to Devin run: https://app.devin.ai/sessions/006c60068ec248608f44d60ce8ca7792
Requested by: @0xArshdeep

… batches

- Replace complex OR queries with efficient IN clauses in updateManySecret, createManySecret, and deleteManySecret
- Batch bulkUpdate DAL method to process updates in chunks of 500
- Batch insertMany calls for secret versions and secret inserts in fnSecretBulkInsert and fnSecretBulkUpdate

The complex OR queries generated N individual AND conditions (one per secret) which produced enormous SQL queries for large batches. Since all conditions share type=Shared, a single WHERE type='shared' AND key IN (...) is equivalent and far more efficient.

The bulkUpdate method previously fired all N individual UPDATE queries via Promise.all simultaneously. Now processes in batches of 500 to prevent transaction timeouts.

Similarly, insertMany calls for secret versions and secrets are now batched to avoid potential PostgreSQL parameter limits on very large inserts.

Co-Authored-By: unknown <>
@devin-ai-integration
Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@maidul98
Copy link
Collaborator

maidul98 commented Feb 24, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 24, 2026

Greptile Summary

This PR optimizes bulk secret operations to prevent 502/500 timeouts when upserting large batches (740+ secrets). The changes address two key inefficiencies:

  • Query optimization: Replaces $complex OR queries with $in pattern at 4 call sites, reducing massive OR chains to efficient WHERE type = 'shared' AND key IN (...) queries
  • Batching: Adds 500-item batching to bulkUpdate, insertMany for secrets, and insertMany for secret versions to prevent overwhelming the database

The query optimization is semantically equivalent and proven safe. Batching preserves array order correctly, maintaining index alignment for metadata and parent version IDs.

Confidence Score: 4/5

  • This PR is safe to merge with proper testing in staging first
  • The optimizations are well-designed and semantically correct, but need runtime validation with large batches (500+ secrets) in staging before production deployment. The 500 concurrent queries per batch within transactions should work but hasn't been load-tested.
  • All files should be tested with 500+ secrets in staging to validate performance improvements and ensure no regressions with the batching logic

Important Files Changed

Filename Overview
backend/src/services/secret-v2-bridge/secret-v2-bridge-dal.ts Adds batching to bulkUpdate (500 updates per batch) to prevent overwhelming DB with concurrent queries. Order preservation is correct.
backend/src/services/secret-v2-bridge/secret-v2-bridge-fns.ts Batches insertMany for secrets and versions (500 per batch). Index alignment for metadata is preserved. Introduces BULK_BATCH_SIZE constant.
backend/src/services/secret-v2-bridge/secret-v2-bridge-service.ts Replaces inefficient $complex OR queries with optimized $in pattern (4 locations). Semantically equivalent and more efficient for DB query planner.

Last reviewed commit: 4669eed

greptile-apps[bot]

This comment was marked as resolved.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 24, 2026

Additional Comments (1)

backend/src/services/secret-v2-bridge/secret-v2-bridge-dal.ts
⚠️ Unsafe OR Query Detected

deleteMany uses orWhere conditions directly, which breaks the outer folderId filter. With large batches, this could delete secrets from wrong folders.

Security Impact: High - Could lead to unauthorized data deletion

Fix: Should also be converted to use $in pattern like the other bulk operations in this PR:

const deletedSecrets = await (tx || db)(TableName.SecretV2)
  .where({ folderId })
  .whereIn(`${TableName.SecretV2}.key`, data.map(el => el.key))
  .whereIn(`${TableName.SecretV2}.type`, [SecretType.Shared, SecretType.Personal])
  .delete()
  .returning("*");

Context Used: Rule from dashboard - # Greptile Code Review Prompt: OR Query Safety Check (knex.js)

Objective

Flag database queries t... (source)

Copy link
Contributor Author

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Open in Devin Review

fix: resolve eslint no-await-in-loop and prettier formatting issues

fix: add type annotations and extract shared BATCH_SIZE constant per review feedback

fix: correct import sort order for TSecretVersionsV2
Co-Authored-By: unknown <>
@varonix0 varonix0 force-pushed the devin/1771941909-fix-bulk-upsert-timeout branch from ae233b5 to 4669eed Compare February 25, 2026 05:21
@varonix0
Copy link
Member

@devin I'd like to hear your thoughts on this

Additional Comments (1)
backend/src/services/secret-v2-bridge/secret-v2-bridge-dal.ts ⚠️ Unsafe OR Query Detected

deleteMany uses orWhere conditions directly, which breaks the outer folderId filter. With large batches, this could delete secrets from wrong folders.

Security Impact: High - Could lead to unauthorized data deletion

Fix: Should also be converted to use $in pattern like the other bulk operations in this PR:

const deletedSecrets = await (tx || db)(TableName.SecretV2)
  .where({ folderId })
  .whereIn(`${TableName.SecretV2}.key`, data.map(el => el.key))
  .whereIn(`${TableName.SecretV2}.type`, [SecretType.Shared, SecretType.Personal])
  .delete()
  .returning("*");

Context Used: Rule from dashboard - # Greptile Code Review Prompt: OR Query Safety Check (knex.js)

Objective

Flag database queries t... (source)

@devin-ai-integration
Copy link
Contributor Author

@varonix0 Good question — I looked into this carefully. The Greptile "critical security vulnerability" finding is a false positive.

The deleteMany code uses Knex's nested .where() callback pattern:

.where({ folderId })          // outer AND condition
.where((bd) => {              // grouped inner condition (wrapped in parentheses)
  data.forEach((el) => {
    void bd.orWhere({ ... }); // OR conditions scoped inside the group
  });
})

This generates SQL like:

DELETE FROM secrets_v2 
WHERE folder_id = ? 
AND ((key = ? AND type = ?) OR (key = ? AND type = ?) OR ...)

The folderId filter is not broken — the orWhere calls are scoped inside the .where((bd) => {...}) callback, which Knex wraps in parentheses. This is the standard Knex pattern for grouped OR conditions, and it's used elsewhere in this same file (e.g., findByFolderId at line 348 uses the exact same pattern).

Additionally, Greptile's suggested whereIn fix is semantically wrong — it drops two important behaviors:

  1. When a shared secret is deleted, the existing code also deletes the corresponding personal overrides (el.type === SecretType.Shared triggers an additional orWhere for type: SecretType.Personal)
  2. When deleting personal secrets, the existing code filters by userId — the suggested fix drops this entirely

That said, two valid points:

  • Performance: With 740+ secrets, deleteMany would still generate a large OR chain (same perf issue this PR fixes elsewhere). But this is a performance concern, not a security one, and deleteMany wasn't in scope for this PR since it's pre-existing code.
  • This function was not modified by this PR — the Greptile bot flagged it as context on the DAL file, but it's not part of the changeset.

If you'd like, I can optimize deleteMany for performance in a follow-up, but it would need to preserve the shared→personal cascade deletion logic (not a simple whereIn swap).

@varonix0
Copy link
Member

@greptile re-review this PR with previous comments and addressed feedback in mind

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@varonix0 varonix0 merged commit 1bdeffd into main Feb 25, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants