Skip to content

Fix invariant checks for Permissioned Domains#6134

Merged
bthomee merged 15 commits intoXRPLF:developfrom
oleks-rip:pd_inv
Feb 10, 2026
Merged

Fix invariant checks for Permissioned Domains#6134
bthomee merged 15 commits intoXRPLF:developfrom
oleks-rip:pd_inv

Conversation

@oleks-rip
Copy link
Copy Markdown
Collaborator

High Level Overview of Change

Fixed typo in invariant checks, found by Immunefi

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (non-breaking change that only restructures code)
  • Performance (increase or change in throughput and/or latency)
  • Tests (you added tests for code that already exists, or your new feature included in this PR)
  • Documentation update
  • Chore (no impact to binary, e.g. .gitignore, formatting, dropping support for older tooling)
  • Release

@oleks-rip oleks-rip requested a review from Tapanito December 10, 2025 16:40
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.9%. Comparing base (db2734c) to head (eeb990f).
⚠️ Report is 191 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           develop   #6134     +/-   ##
=========================================
- Coverage     79.9%   79.9%   -0.0%     
=========================================
  Files          840     840             
  Lines        65483   65514     +31     
  Branches      7251    7256      +5     
=========================================
+ Hits         52332   52343     +11     
- Misses       13151   13171     +20     
Files with missing lines Coverage Δ
src/xrpld/app/tx/detail/InvariantCheck.cpp 92.9% <100.0%> (+0.2%) ⬆️
src/xrpld/app/tx/detail/InvariantCheck.h 100.0% <ø> (ø)

... and 11 files with indirect coverage changes

Impacted file tree graph

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mvadari
Copy link
Copy Markdown
Collaborator

mvadari commented Dec 11, 2025

Does this need an amendment?

@oleks-rip
Copy link
Copy Markdown
Collaborator Author

oleks-rip commented Dec 11, 2025

Does this need an amendment?

No it didn't change anything. I can't imagine situation where before is different from after (except where before is null)

@oleks-rip oleks-rip added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Dec 11, 2025
Copy link
Copy Markdown
Collaborator

@ximinez ximinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This invariant has a few problems in general:

  1. sleStatus_ is an optional. It should be a vector, in case more than one permissioned domain is affected by a transaction. If that is supposed to be impossible, then that can be one of the first things that finalize checks.
    if (selStatus_.size() > 1) { JLOG...; return false; }
    
  2. The conditions in finalize are only checked for successful PermissionedDomainSet transactions. If no other transaction should be able to modify a permisioned domain object, then it should check that. That probably means adding a check for PermissionedDomainDelete, too. If it's not one of those two types, or not successful, and selStatus_ is not empty, finalize should fail. If it is one of those two types, and selStatus_.size() != 1, finalized should also fail. Maybe add an _isDeleted flag to SleStatus and check that it's false for Set, and true for Delete.
  3. Because you're checking the integrity of the object, and not checking any of the changes between before and after, you probably should ignore before entirely. If it's messed up, there's nothing you can do about it, and if after is correct, then you've fixed whatever problem might have existed.

All that said, that's not what this PR is about, though maybe it should be. The changes above would obviously require an amendment.

As for what the PR is about, I can't convince myself that it's impossible for this change to result in a different ledger. The main reason is that if, somehow, a bad object had ever been written to the ledger, this check will now catch it, whereas it wouldn't have caught it before the change. That will result in the transaction succeeding on some nodes, and failing on others.

@ximinez ximinez removed the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Dec 12, 2025
@oleks-rip
Copy link
Copy Markdown
Collaborator Author

oleks-rip commented Dec 12, 2025

@ximinez

  1. sleStatus_ is an optional. It should be a vector, in case more than one permissioned domain is affected

modified and added check for size() > 1 (which is actually impossible as there is no such transaction)

  1. Because you're checking the integrity of the object, and not checking any of the changes between before and after

removed check for "before"

  1. The conditions in finalize are only checked for successful PermissionedDomainSet transactions.
    2.1 If no other transaction should be able to modify a permisioned domain object, then it should check that

I don't think it is possible. Invariant checks are all called for every tx, and I don't see a way to figure out that this particular PermissiondDomain sle object is modified by this particular tx. We just assumed that if there is "after" then it was PermissionedDomainSet

2.2 That probably means adding a check for PermissionedDomainDelete

Don't think that we need to add check for PermissionedDomainDelete. Do we actually care if the delete incorrect object ?

I can't convince myself that it's impossible for this change to result in a different ledger. The main reason is that if, somehow, a bad object had ever been written to the ledger, this check will now catch it, whereas it wouldn't have caught it before the change. That will result in the transaction succeeding on some nodes, and failing on others.

This is not true. Currently (without this PR) invariants check only "after" object for both cases ("before" and "after"), and only for PermissionedDomainSet transaction. Which mean "after" always will be present (no crash) and verified. And "before" check is not so important (as you mention early, and I agree). Also this fix is related to verification logic, not to the creation/modification logic.
So I don't see a way to create invalid object (remember that "after" was check always). Neither I see the behavior change after this fix, so not need for amendment.

Correct me if I'm wrong somewhere.

@oleks-rip oleks-rip force-pushed the pd_inv branch 3 times, most recently from e586813 to 34ea901 Compare December 12, 2025 20:15
@oleks-rip oleks-rip requested a review from ximinez December 12, 2025 20:30
@ximinez
Copy link
Copy Markdown
Collaborator

ximinez commented Dec 13, 2025

  1. sleStatus_ is an optional. It should be a vector, in case more than one permissioned domain is affected

modified and added check for size() > 1 (which is actually impossible as there is no such transaction)

EVERY failure case in an Invariant should be impossible. The intention is less "double-check that the transaction did the right thing", and more "if something goes extremely wrong, or an attacker finds an exploit, what are some reasonable checks we can do to stop it?"

That's why invariant unit tests have to modify the open ledger directly. The things they're testing for are, as far as we know, impossible to do with a transaction.

  1. Because you're checking the integrity of the object, and not checking any of the changes between before and after

removed check for "before"

👍

  1. The conditions in finalize are only checked for successful PermissionedDomainSet transactions.
    2.1 If no other transaction should be able to modify a permisioned domain object, then it should check that

I don't think it is possible. Invariant checks are all called for every tx, and I don't see a way to figure out that this particular PermissiondDomain sle object is modified by this particular tx. We just assumed that if there is "after" then it was PermissionedDomainSet

2.2 That probably means adding a check for PermissionedDomainDelete

Don't think that we need to add check for PermissionedDomainDelete. Do we actually care if the delete incorrect object ?

We don't care if the object is malformed if it's deleted, but what if it isn't deleted? What if it's deleted incorrectly?

Basic outline of how to check all of the above conditions in finalize() :

auto check = [...I think you can leave the lambda unchanged...]
if (view.rules.enabled(fixPermissionedDomainInvariant))
{
    if (result != tesSUCCESS && sleStatus_.size() == 0)
        // Nothing was changed by a failed tx. All good.
        return true;

    if (sleStatus_.size() > 1)
    {
        JLOG(j.fatal()) << "Invariant failed: transaction affected more than 1 "
                           "entry."; 
        return false;
    }

    switch (tx.getTxnType())
    {
        case ttPERMISSIONED_DOMAIN_SET:
        {
            if (sleStatus_.size() == 0)
            {
                 [Assuming that PDSet _must_ change an object, log and return false, otherwise return true]
                 JLOG(j.fatal()) << "PermissionedDomainSet should only affect one domain object";
                 return false;
            }
            auto const& sleStatus = sleStatus_[0];
            if(sleStatus.isDelete) [Requires adding isDelete to the struct and seting it in `visitEntry()`]
            {
                 JLOG(j.fatal()) << "Domain object deleted by PermissionedDomainSet";
                 return false;
            }
            return check(sleStatus, j);
        }
        case ttPERMISSIONED_DOMAIN_DELETE:
        {
            if(sleStatus_.size() == 0)
            {
                 [Assuming that PDDelete _must_ delete an object, log and return false, otherwise return true]
                 JLOG(j.fatal()) << "No domain objects affected by PermissionedDomainDelete";
                 return false;
            }
            if(!sleStatus_[0].isDelete)	[Requires adding isDelete to the struct and seting it in `visitEntry()`]
            {
                 JLOG(j.fatal()) << "Domain object modified, but not deleted by PermissionedDomainDelete";
                 return false;
            }
            return true;
        }
        default:
        {
            if (sleStatus_.size() != 0)
            {
                JLOG(j.fatal()) << sleStatus_.size() << " domain object(s) affected by an unauthorized transaction.";
                return false;
            }
            return true;
        }
    }
}
else
{
    // The pre-amendment behavior
    if (tx.getTxnType() != ttPERMISSIONED_DOMAIN_SET || result != tesSUCCESS)
        return true;

    return !sleStatus_.empty() ? check(sleStatus_[0], j) : true;
}

I can't convince myself that it's impossible for this change to result in a different ledger. The main reason is that if, somehow, a bad object had ever been written to the ledger, this check will now catch it, whereas it wouldn't have caught it before the change. That will result in the transaction succeeding on some nodes, and failing on others.

This is not true. Currently (without this PR) invariants check only "after" object for both cases ("before" and "after"), and only for PermissionedDomainSet transaction. Which mean "after" always will be present (no crash) and verified. And "before" check is not so important (as you mention early, and I agree).

Ironically, I realized that removing the before check does not require an amendment for exactly this reason.

Also this fix is related to verification logic, not to the creation/modification logic. So I don't see a way to create invalid object (remember that "after" was check always). Neither I see the behavior change after this fix, so not need for amendment.

"I don't see a way to create invalid object" is exactly what Invariants are for. You may not see a way to create an invalid object, or two objects, or to delete an object that shouldn't be deleted, but that doesn't mean that there isn't one.

Back to what I said earlier, the intention of Invariants is to catch those impossible situations that end up happening despite all our best efforts. Maybe you made a mistake. Maybe there's some subtle exploit that nobody can predict. Or maybe an inexperienced developer makes a change five years from now without understanding all the implications and consequences.

The questions to ask yourself are

  • "How could I have screwed up?"
  • "How can someone else screw this up?"
  • "If someone does screw this up, what are the things that are most important to prevent?"

For example, let's assume your implementation of the transactors is perfect. Great. Down the road, there's a new feature that has nothing to do with those transactions. Product, design, or the developer decides it would be cool to automatically create a Permissioned Domain automatically as part of the new HeyThisIsCool transaction. The engineer screws up. They implement it without referring to the current transactors, or using any of the helpers that you've written. The credentials are unsorted, or have duplicates, or there are too many of them, or whatever.

Oh, and they don't look at the Invariants either. Maybe they figure "Hey, if the invariants don't complain, then it must be ok."

As currently written, the Invariants won't complain.

And to add icing to the cake, the code review doesn't catch it. And testing maybe doesn't even know it exists, or they don't thoroughly exercise all the possible edge cases.

"Oh, that can't happen..."

Despite a very good developer implementing the feature, and a boatload of time spent on the code reviews, including onr by me, that issue was discovered and reported by external contributors (including a former Rippler, but we can't count on that). I don't remember the exact timeline, but I think it was before QA got their hands on it, Even so, I don't think they would have caught it, because they're not in the habit of checking the entire object tree for stray objects. (Maybe they should?)

@ximinez
Copy link
Copy Markdown
Collaborator

ximinez commented Dec 22, 2025

image

Please do not rewrite history on active PRs in this repo. It makes life harder for reviewers. Instead, add your changes in new commits, and when updating from the base branch, do a merge instead of a rebase.

(If there are special circumstances that require it, please document what you did and why.)

Copy link
Copy Markdown
Collaborator

@ximinez ximinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so much better and safer! Thank you for making the changes. I left a couple of suggestions, but they're all pretty minor, and easy to fix.

Comment thread src/xrpld/app/tx/detail/InvariantCheck.cpp Outdated
Comment thread src/xrpld/app/tx/detail/InvariantCheck.cpp Outdated
Comment thread src/xrpld/app/tx/detail/InvariantCheck.cpp Outdated
@vvysokikh1 vvysokikh1 added the Blocked on requested changes Reviewers have requested changes which must be addressed or responded to label Jan 2, 2026
@oleks-rip oleks-rip requested a review from ximinez January 5, 2026 22:02
@Tapanito Tapanito self-requested a review January 14, 2026 13:28
Copy link
Copy Markdown
Collaborator

@ximinez ximinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the absurdly long delay getting back to this. Thanks for making the changes.

@oleks-rip oleks-rip added Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. and removed Blocked on requested changes Reviewers have requested changes which must be addressed or responded to labels Jan 29, 2026
Comment thread src/xrpld/app/tx/detail/InvariantCheck.cpp
@bthomee bthomee merged commit e11f619 into XRPLF:develop Feb 10, 2026
26 checks passed
@ximinez ximinez added this to the 3.1.3 milestone Apr 15, 2026
@ximinez ximinez modified the milestones: 3.1.3, 3.1.3 (develop) Apr 15, 2026
mathbunnyru pushed a commit to Kassaking7/rippled that referenced this pull request Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants