Validate every DB clause in COPY against ACL db= permissions#3801
Conversation
copyDbIdArgs assumed the DB token was always at argv[3], so 'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could bypass db= restrictions. Mirror copyCommand's parser to walk all argv tokens, collect every DB destination, and let the ACL checker validate each one. Signed-off-by: Binbin <binloveplay1314@qq.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe ChangesCOPY Command Order-Independent Database Argument Parsing
🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3801 +/- ##
=========================================
Coverage 76.69% 76.69%
=========================================
Files 162 162
Lines 80710 80721 +11
=========================================
+ Hits 61901 61911 +10
- Misses 18809 18810 +1
🚀 New features to boost your workflow:
|
dvkashapov
left a comment
There was a problem hiding this comment.
Overall LGTM, thank you for fixing this.
One question to discuss here: currently for any COPY command in ACL LOG reason would be COPY command name itself and not dbid because argv index of dbid in COPY is variable, here's my suggestion but this fragile and not the best practice, are we OK with that? Or reason as COPY command is fine?
// acl.c, inside the `if (keyidxptr)` in `ACLSelectorCheckCmd()`
else if (cmd->proc == copyCommand) {
/* Walk argv to find the relevant DB argument. */
int nth = 0;
for (int k = 3; k < argc - 1; k++) {
if (!strcasecmp(objectGetVal(argv[k]), "db")) {
if (nth++ == i) {
*keyidxptr = k + 1; /* argv[k+1] is the db id string */
break;
}
k++;
}
}
}|
@dvkashapov thanks for the review. Yes i did notice it before, previously, i personally felt that showing |
|
BTW, currently we need to maintain this whole block, is a bit of trouble, i think another way is extend get_dbid_args so that we can also return the I now, probably more inclined toward |
This makes sense, I will raise PR for that. Overall I think this will be a bug fix because intended behaviour was to log dbid when possible. |
I already made a patch in local branch and play around it. After we merge this one, i can send it. |
Ok, good! You're very fast xD |
There was a problem hiding this comment.
Good catch!
The two-pass style is a little compex. I think we can do it with only one pass.
There is actually only 0 or 1 "db" arguments that are meaningful. We can traverse argv and allocate the result array (size 1) when we see "db" the first time and later overwrite the value if we see another "db" argument. Or we can iterate argv backwards to find the last occurrence first.
What if we introduce named databases later? In this case, maybe there is not always a database number in argv, but we need some lookup and then we can only return the array of database ids. Or we can return the positions too but they just point to named databases in that case. |
That is a good call out. So you are saying that we actually no need to validate every DB, we just need to validate the last DBID? That is indeed a valid approach. I feel that allowing duplicate args often causes us some troubles at certain moments. However, I suspect no one would actually doing this. So checking every DB and reporting the first one or just simply check only the last one. Either approach would be acceptable to me, though personally, I might lean slightly toward the former. |
So you are saying something like: db=mydb1,mydb2, and then we will also have COPY src dst DBNAME mydb2. I actually haven't thought about it. I suspect they will eventually point to database id, so it might work as well. We can always change the implementation later. |
Yes. The best would be to reject multiple "db" arguments, but I guess that is too late now. It would be a breaking change. |
…io#3801) copyDbIdArgs assumed the DB token was always at argv[3], so 'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could bypass db= restrictions. Mirror copyCommand's parser to walk all argv tokens, collect every DB destination, and let the ACL checker validate each one. DB ACL was added in valkey-io#2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
copyDbIdArgs assumed the DB token was always at argv[3], so 'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could bypass db= restrictions. Mirror copyCommand's parser to walk all argv tokens, collect every DB destination, and let the ACL checker validate each one. DB ACL was added in #2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
…io#3801) copyDbIdArgs assumed the DB token was always at argv[3], so 'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could bypass db= restrictions. Mirror copyCommand's parser to walk all argv tokens, collect every DB destination, and let the ACL checker validate each one. DB ACL was added in valkey-io#2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
Rework commandDbIdArgs so each helper returns the argv index of every dbid argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr directly, replacing the cmd->proc-based offset table. The caller now reads the dbid value via getLongLongFromObject(argv[positions[i]], ...). New db=-checked commands only need to implement their own *DbIdArgs helper. The fix is for COPY: the old chain fell through to 0, so the "object" field in ACL LOG was always showed "copy", it now reports the first denied dbid. Based on #3801, also see it for more details, DB ACL was added in #2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
Rework commandDbIdArgs so each helper returns the argv index of every dbid argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr directly, replacing the cmd->proc-based offset table. The caller now reads the dbid value via getLongLongFromObject(argv[positions[i]], ...). New db=-checked commands only need to implement their own *DbIdArgs helper. The fix is for COPY: the old chain fell through to 0, so the "object" field in ACL LOG was always showed "copy", it now reports the first denied dbid. Based on #3801, also see it for more details, DB ACL was added in #2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
copyDbIdArgs assumed the DB token was always at argv[3], so 'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could bypass db= restrictions. Mirror copyCommand's parser to walk all argv tokens, collect every DB destination, and let the ACL checker validate each one. DB ACL was added in #2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
Rework commandDbIdArgs so each helper returns the argv index of every dbid argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr directly, replacing the cmd->proc-based offset table. The caller now reads the dbid value via getLongLongFromObject(argv[positions[i]], ...). New db=-checked commands only need to implement their own *DbIdArgs helper. The fix is for COPY: the old chain fell through to 0, so the "object" field in ACL LOG was always showed "copy", it now reports the first denied dbid. Based on #3801, also see it for more details, DB ACL was added in #2309. Signed-off-by: Binbin <binloveplay1314@qq.com>
# Backport sweep for 9.1 Automated cherry-picks from PRs marked "To be backported". ## Applied | Source PR | Title | Detail | |---|---|---| | #3743 | Fix buffered_reply assert in HFE commands with module keyspace notifications | cherry-picked in a prior sweep | | #3766 | Fix flaky block_keyspace_notification test for HGETDEL notify race | cherry-picked in a prior sweep | | #3800 | Fix heap-use-after-free in ACL LOAD when client free is deferred | cherry-picked in a prior sweep | | #3723 | Fix double-finish and RESP reply violation in cluster slot migration | cherry-picked in a prior sweep | | #3872 | Redacting customer information when hide_user_data_from_log is true in rdb.c, networking.c, debug.c and t_hash | cherry-picked in a prior sweep | | #3846 | Fix use-after-free in VM_RegisterClusterMessageReceiver | cherry-picked in a prior sweep | | #3806 | Add ALL_DBS flag to CLUSTER FLUSHSLOT for database-level ACL | cherry-picked in a prior sweep | | #3847 | Harden SENTINEL commands and config rewrite against control-character injection | | | #3801 | Validate every DB clause in COPY against ACL db= permissions | | | #3851 | Replace AUTOMATION_PAT with Valkeyrie Bot GitHub App token | | | #3848 | Fix cluster AUX-field control-character and delimiter injection | | | #3544 | Revert "IO-Threads redesign cleanup work (#3544)" | cherry-picked in a prior sweep | | #3888 | Report exact dbid for COPY in ACL LOG when db= access is denied | conflicts resolved by Claude Code | | #3942 | Fix shard_id format specifier in UPDATE message log | | | #3941 | Avoid random() % 0 undefined behaviour when cluster-node-timeout < 30 | | --- *Generated by valkey-ci-agent using Claude Code.* --------- Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Signed-off-by: chx9 <lovelypiska@outlook.com> Signed-off-by: zackcam <zackcam@amazon.com> Signed-off-by: Eran Ifrah <eifrah@amazon.com> Signed-off-by: Jules Lasarte <lasartej@amazon.com> Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com> Signed-off-by: akash kumar <akumdev@amazon.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: lovelypiska <lovelypiska@outlook.com> Co-authored-by: zackcam <zackcam@amazon.com> Co-authored-by: eifrah-aws <eifrah@amazon.com> Co-authored-by: jjuleslasarte <jules.lasarte@gmail.com> Co-authored-by: Jules Lasarte <lasartej@amazon.com> Co-authored-by: Akash Kumar <45854686+akashkgit@users.noreply.github.com>
…ey-io#3888) (#345) Rework commandDbIdArgs so each helper returns the argv index of every dbid argument it owns. ACLSelectorCheckCmd uses those positions to set keyidxptr directly, replacing the cmd->proc-based offset table. The caller now reads the dbid value via getLongLongFromObject(argv[positions[i]], ...). New db=-checked commands only need to implement their own *DbIdArgs helper. The fix is for COPY: the old chain fell through to 0, so the "object" field in ACL LOG was always showed "copy", it now reports the first denied dbid. Based on valkey-io#3801, also see it for more details, DB ACL was added in valkey-io#2309. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Binbin <binloveplay1314@qq.com>
copyDbIdArgs assumed the DB token was always at argv[3], so
'COPY src dst REPLACE DB n' or 'COPY src dst DB a DB b' could
bypass db= restrictions. Mirror copyCommand's parser to walk
all argv tokens, collect every DB destination, and let the ACL
checker validate each one.
DB ACL was added in #2309.
Please also pick #3888 if you pick this one.