Skip to content

Validate every DB clause in COPY against ACL db= permissions#3801

Merged
enjoy-binbin merged 1 commit into
valkey-io:unstablefrom
enjoy-binbin:fix_copy
Jun 2, 2026
Merged

Validate every DB clause in COPY against ACL db= permissions#3801
enjoy-binbin merged 1 commit into
valkey-io:unstablefrom
enjoy-binbin:fix_copy

Conversation

@enjoy-binbin

@enjoy-binbin enjoy-binbin commented May 21, 2026

Copy link
Copy Markdown
Member

copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

DB ACL was added in #2309.

Please also pick #3888 if you pick this one.

copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin enjoy-binbin requested a review from dvkashapov May 21, 2026 12:56
@enjoy-binbin enjoy-binbin added the release-notes This issue should get a line item in the release notes label May 21, 2026
@coderabbitai

coderabbitai Bot commented May 21, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ae560a81-e210-4f94-a9f1-decba3f176f5

📥 Commits

Reviewing files that changed from the base of the PR and between d778b37 and 60c80da.

📒 Files selected for processing (2)
  • src/db.c
  • tests/unit/acl-v2.tcl

📝 Walkthrough

Walkthrough

The copyDbIdArgs() function is rewritten to parse COPY command destination database clauses in an order-independent manner, supporting multiple DB tokens and varied argument positions. A two-pass parser validates syntax, counts DB occurrences, allocates an array, and collects all destination DB ids. Test coverage is expanded to validate the new parsing behavior across database permissions, REPLACE semantics, and DRYRUN assertions.

Changes

COPY Command Order-Independent Database Argument Parsing

Layer / File(s) Summary
COPY argument parser rewrite
src/db.c
copyDbIdArgs() changed from single fixed-position DB parsing to a two-pass, order-independent parser that validates all arguments after COPY for DB/REPLACE tokens, counts every DB occurrence, allocates an array, and returns all collected destination DB ids.
Database permission and DRYRUN tests
tests/unit/acl-v2.tcl
Test coverage extended to validate COPY operations targeting DB 1 with successful COPY calls covering REPLACE and argument-order variations, expanded permission-denied assertions for DB 2 with varied DB/REPLACE combinations, and DRYRUN variants checking both allowed and disallowed database access.

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: validating every DB clause in COPY commands against ACL db= permissions, which is the primary focus of the changeset.
Description check ✅ Passed The description clearly explains the bug being fixed (copyDbIdArgs assumed DB token was at argv[3]), provides concrete examples of bypassed restrictions, and describes the solution approach.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented May 21, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.69%. Comparing base (aa7488f) to head (60c80da).
⚠️ Report is 1 commits behind head on unstable.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #3801   +/-   ##
=========================================
  Coverage     76.69%   76.69%           
=========================================
  Files           162      162           
  Lines         80710    80721   +11     
=========================================
+ Hits          61901    61911   +10     
- Misses        18809    18810    +1     
Files with missing lines Coverage Δ
src/db.c 94.85% <100.00%> (+0.04%) ⬆️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dvkashapov dvkashapov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, thank you for fixing this.
One question to discuss here: currently for any COPY command in ACL LOG reason would be COPY command name itself and not dbid because argv index of dbid in COPY is variable, here's my suggestion but this fragile and not the best practice, are we OK with that? Or reason as COPY command is fine?

// acl.c, inside the `if (keyidxptr)` in `ACLSelectorCheckCmd()`

else if (cmd->proc == copyCommand) {
    /* Walk argv to find the relevant DB argument. */
    int nth = 0;
    for (int k = 3; k < argc - 1; k++) {
        if (!strcasecmp(objectGetVal(argv[k]), "db")) {
            if (nth++ == i) {
                *keyidxptr = k + 1;   /* argv[k+1] is the db id string */
                break;
            }
            k++;
        }
    }
}

Comment thread src/db.c
@enjoy-binbin

Copy link
Copy Markdown
Member Author

@dvkashapov thanks for the review. Yes i did notice it before, previously, i personally felt that showing COPY is OK, however, if we do indeed have a low-cost way to fix it, then fixing it is definitely a good choice.

@enjoy-binbin

enjoy-binbin commented May 22, 2026

Copy link
Copy Markdown
Member Author

BTW, currently we need to maintain this whole block, is a bit of trouble, i think another way is extend get_dbid_args so that we can also return the positions, in this way, we will only need to maintain get_dbid_args

if (!ACLSelectorCanAccessDb(selector, dbids[i])) {
    if (keyidxptr) {
        if (cmd->proc == selectCommand)
            *keyidxptr = 1;
        else if (cmd->proc == moveCommand)
            *keyidxptr = 2;
        else if (cmd->proc == swapdbCommand)
            *keyidxptr = (i == 0) ? 1 : 2;
        else if (cmd->proc == migrateCommand)
            *keyidxptr = 4;
        else
            *keyidxptr = 0;
    }
    zfree(dbids);
    return ACL_DENIED_DB;
}

I now, probably more inclined toward get_dbid_args, @dvkashapov WDYT? Furthermore, we need to decide whether to classify this as a bug fix or an optimization; it shouldn't really count as a breaking change (especially since 9.1 was just released), but if necessary, we can always address it in a separate PR.

@dvkashapov

Copy link
Copy Markdown
Member

i think another way is extend get_dbid_args so that we can also return the positions, in this way, we will only need to maintain get_dbid_args

This makes sense, I will raise PR for that. Overall I think this will be a bug fix because intended behaviour was to log dbid when possible.

@enjoy-binbin

Copy link
Copy Markdown
Member Author

This makes sense, I will raise PR for that. Overall I think this will be a bug fix because intended behaviour was to log dbid when possible.

I already made a patch in local branch and play around it. After we merge this one, i can send it.

@enjoy-binbin enjoy-binbin requested a review from zuiderkwast May 22, 2026 09:53
@dvkashapov

Copy link
Copy Markdown
Member

I already made a patch in local branch and play around it. After we merge this one, i can send it.

Ok, good! You're very fast xD

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

The two-pass style is a little compex. I think we can do it with only one pass.

There is actually only 0 or 1 "db" arguments that are meaningful. We can traverse argv and allocate the result array (size 1) when we see "db" the first time and later overwrite the value if we see another "db" argument. Or we can iterate argv backwards to find the last occurrence first.

@zuiderkwast

Copy link
Copy Markdown
Contributor

i think another way is extend get_dbid_args so that we can also return the positions, in this way, we will only need to maintain get_dbid_args

What if we introduce named databases later? In this case, maybe there is not always a database number in argv, but we need some lookup and then we can only return the array of database ids. Or we can return the positions too but they just point to named databases in that case.

@enjoy-binbin

Copy link
Copy Markdown
Member Author

There is actually only 0 or 1 "db" arguments that are meaningful. We can traverse argv and allocate the result array (size 1) when we see "db" the first time and later overwrite the value if we see another "db" argument. Or we can iterate argv backwards to find the last occurrence first.

That is a good call out. So you are saying that we actually no need to validate every DB, we just need to validate the last DBID? That is indeed a valid approach.

I feel that allowing duplicate args often causes us some troubles at certain moments. However, I suspect no one would actually doing this. So checking every DB and reporting the first one or just simply check only the last one. Either approach would be acceptable to me, though personally, I might lean slightly toward the former.

@enjoy-binbin

Copy link
Copy Markdown
Member Author

What if we introduce named databases later? In this case, maybe there is not always a database number in argv, but we need some lookup and then we can only return the array of database ids. Or we can return the positions too but they just point to named databases in that case.

So you are saying something like: db=mydb1,mydb2, and then we will also have COPY src dst DBNAME mydb2. I actually haven't thought about it. I suspect they will eventually point to database id, so it might work as well. We can always change the implementation later.

@zuiderkwast

zuiderkwast commented May 22, 2026

Copy link
Copy Markdown
Contributor

I feel that allowing duplicate args often causes us some troubles at certain moments. However, I suspect no one would actually doing this. So checking every DB and reporting the first one or just simply check only the last one. Either approach would be acceptable to me, though personally, I might lean slightly toward the former.

Yes. The best would be to reject multiple "db" arguments, but I guess that is too late now. It would be a breaking change.

@zuiderkwast zuiderkwast moved this to Needs Review in Valkey 9.1 May 28, 2026
@enjoy-binbin enjoy-binbin merged commit 1206103 into valkey-io:unstable Jun 2, 2026
99 checks passed
@github-project-automation github-project-automation Bot moved this from Needs Review to To be backported in Valkey 9.1 Jun 2, 2026
@enjoy-binbin enjoy-binbin deleted the fix_copy branch June 2, 2026 02:15
eifrah-aws pushed a commit to eifrah-aws/valkey that referenced this pull request Jun 2, 2026
…io#3801)

copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

DB ACL was added in valkey-io#2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 3, 2026
copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

DB ACL was added in #2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
bandalgomsu pushed a commit to bandalgomsu/valkey that referenced this pull request Jun 4, 2026
…io#3801)

copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

DB ACL was added in valkey-io#2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Jun 8, 2026
Rework commandDbIdArgs so each helper returns the argv index of every dbid
argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr
directly, replacing the cmd->proc-based offset table. The caller now reads
the dbid value via getLongLongFromObject(argv[positions[i]], ...). New
db=-checked commands only need to implement their own *DbIdArgs helper.

The fix is for COPY: the old chain fell through to 0, so the "object"
field in ACL LOG was always showed "copy", it now reports the first
denied dbid.

Based on #3801, also see it for more details, DB ACL was added in #2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 9, 2026
Rework commandDbIdArgs so each helper returns the argv index of every dbid
argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr
directly, replacing the cmd->proc-based offset table. The caller now reads
the dbid value via getLongLongFromObject(argv[positions[i]], ...). New
db=-checked commands only need to implement their own *DbIdArgs helper.

The fix is for COPY: the old chain fell through to 0, so the "object"
field in ACL LOG was always showed "copy", it now reports the first
denied dbid.

Based on #3801, also see it for more details, DB ACL was added in #2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 10, 2026
copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.

DB ACL was added in #2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 10, 2026
Rework commandDbIdArgs so each helper returns the argv index of every dbid
argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr
directly, replacing the cmd->proc-based offset table. The caller now reads
the dbid value via getLongLongFromObject(argv[positions[i]], ...). New
db=-checked commands only need to implement their own *DbIdArgs helper.

The fix is for COPY: the old chain fell through to 0, so the "object"
field in ACL LOG was always showed "copy", it now reports the first
denied dbid.

Based on #3801, also see it for more details, DB ACL was added in #2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
ranshid added a commit that referenced this pull request Jun 11, 2026
# Backport sweep for 9.1

Automated cherry-picks from PRs marked "To be backported".

## Applied

| Source PR | Title | Detail |
|---|---|---|
| #3743 | Fix buffered_reply assert in HFE commands with module keyspace
notifications | cherry-picked in a prior sweep |
| #3766 | Fix flaky block_keyspace_notification test for HGETDEL notify
race | cherry-picked in a prior sweep |
| #3800 | Fix heap-use-after-free in ACL LOAD when client free is
deferred | cherry-picked in a prior sweep |
| #3723 | Fix double-finish and RESP reply violation in cluster slot
migration | cherry-picked in a prior sweep |
| #3872 | Redacting customer information when hide_user_data_from_log is
true in rdb.c, networking.c, debug.c and t_hash | cherry-picked in a
prior sweep |
| #3846 | Fix use-after-free in VM_RegisterClusterMessageReceiver |
cherry-picked in a prior sweep |
| #3806 | Add ALL_DBS flag to CLUSTER FLUSHSLOT for database-level ACL |
cherry-picked in a prior sweep |
| #3847 | Harden SENTINEL commands and config rewrite against
control-character injection | |
| #3801 | Validate every DB clause in COPY against ACL db= permissions |
|
| #3851 | Replace AUTOMATION_PAT with Valkeyrie Bot GitHub App token | |
| #3848 | Fix cluster AUX-field control-character and delimiter
injection | |
| #3544 | Revert "IO-Threads redesign cleanup work (#3544)" |
cherry-picked in a prior sweep |
| #3888 | Report exact dbid for COPY in ACL LOG when db= access is
denied | conflicts resolved by Claude Code |
| #3942 | Fix shard_id format specifier in UPDATE message log |  |
| #3941 | Avoid random() % 0 undefined behaviour when
cluster-node-timeout < 30 | |

---
*Generated by valkey-ci-agent using Claude Code.*

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: chx9 <lovelypiska@outlook.com>
Signed-off-by: zackcam <zackcam@amazon.com>
Signed-off-by: Eran Ifrah <eifrah@amazon.com>
Signed-off-by: Jules Lasarte <lasartej@amazon.com>
Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>
Signed-off-by: akash kumar <akumdev@amazon.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: lovelypiska <lovelypiska@outlook.com>
Co-authored-by: zackcam <zackcam@amazon.com>
Co-authored-by: eifrah-aws <eifrah@amazon.com>
Co-authored-by: jjuleslasarte <jules.lasarte@gmail.com>
Co-authored-by: Jules Lasarte <lasartej@amazon.com>
Co-authored-by: Akash Kumar <45854686+akashkgit@users.noreply.github.com>
@sarthakaggarwal97 sarthakaggarwal97 moved this from To be backported to Done in Valkey 9.1 Jun 15, 2026
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request Jun 24, 2026
…ey-io#3888) (#345)

Rework commandDbIdArgs so each helper returns the argv index of every dbid
argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr
directly, replacing the cmd->proc-based offset table. The caller now reads
the dbid value via getLongLongFromObject(argv[positions[i]], ...). New
db=-checked commands only need to implement their own *DbIdArgs helper.

The fix is for COPY: the old chain fell through to 0, so the "object"
field in ACL LOG was always showed "copy", it now reports the first
denied dbid.

Based on valkey-io#3801, also see it for more details, DB ACL was added in valkey-io#2309.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants