Skip to content

[move] implement resource groups#6040

Merged
davidiw merged 11 commits into
aptos-labs:mainfrom
davidiw:resource_groups
Jan 21, 2023
Merged

[move] implement resource groups#6040
davidiw merged 11 commits into
aptos-labs:mainfrom
davidiw:resource_groups

Conversation

@davidiw

@davidiw davidiw commented Jan 2, 2023

Copy link
Copy Markdown
Contributor
  • A resource group allows multiple resources to occupy the same slot in storage
  • Hides resource groups in the Move Resolver layer so that storage need not know explicitly about them
  • Added a new wrapper around the move resolver, so that the VM could be injected for re-use of the VM's metadata cache that includes resource groups
  • Tests that verify the lifetime of resource groups

The following remains incomplete:

  • Adding a feature for resource groups to be enabled
  • Handling compatibility on upgrades of upgrade
  • Potentially caching of resources within a resource group -- though this is not a blocker

@davidiw davidiw requested review from lightmark and msmouse January 2, 2023 08:54
@davidiw davidiw changed the title [move] implement a basic resource groups [move] implement resource groups Jan 2, 2023
@davidiw davidiw added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Jan 2, 2023
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

Comment thread aptos-move/framework/src/extended_checks.rs Outdated
Comment thread aptos-move/framework/src/extended_checks.rs Outdated
Comment thread aptos-move/move-examples/resource_groups/secondary/sources/secondary.move Outdated
Comment thread aptos-move/aptos-vm/src/data_cache.rs Outdated
Comment thread aptos-move/aptos-vm/src/move_vm_ext/session.rs Outdated
Comment thread aptos-move/aptos-vm/src/move_vm_ext/session.rs Outdated
Comment thread aptos-move/aptos-vm/src/move_vm_ext/session.rs Outdated
Comment thread aptos-move/aptos-vm/src/move_vm_ext/session.rs Outdated
@movekevin

Copy link
Copy Markdown
Contributor

Are there more unit tests we can add to test the changeset conversions and resource group operation grouping in session? This would allow testing more edge cases than what the e2e test can cover.

@movekevin movekevin left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there should be some limit on how big a resource group can get. By arbitrarily grouping resources into a resource group, especially with cross-module grouping, a resource group can get very large.

Comment thread aptos-move/aptos-vm/src/data_cache.rs Outdated
@CapCap

CapCap commented Jan 4, 2023

Copy link
Copy Markdown
Contributor

I wonder if there should be some limit on how big a resource group can get. By arbitrarily grouping resources into a resource group, especially with cross-module grouping, a resource group can get very large.

assuming rockdsb allows reading arbitrary byte offsets of values, we could potentially change the key to be something like (struct_tag: StructTag, start_offset_bytes: u64, num_bytes: u64), and store the keys separate from values, so that we can deserialize them separately and pick/choose (if desired) which values to deserialize

@zekun000 zekun000 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what @msmouse wanted is to always call this and decide which access path to use?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't

Comment thread aptos-move/aptos-vm/src/data_cache.rs Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this slow? It can damage API performance whenever non-existent resources are requested? -- it's the best if we can cache the metadata and ensure it's cached.

Comment thread aptos-move/aptos-vm/src/data_cache.rs Outdated
Comment thread aptos-move/aptos-vm/src/data_cache.rs Outdated

@msmouse msmouse left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trusting you to fix the cache.

Thank you boss for making this reality.

* A resource group allows multiple resources to occupy the same slot in storage
* Hides resource groups in the Move Resolver layer so that storage need not know explicitly about them
* Added a new wrapper around the move resolver, so that the VM could be injected for re-use of the VM's metadata cache that includes resource groups
* Tests that verify the lifetime of resource groups
* verify at compilation time that scope is correct
* verify during module initialization that scope is correct
* individual resource reads just move to the move resolver, which is
  seamless
* all account resources require expansion -- this can result in adding
  too more than the limit -- but resources in a group are somewhat
  limited in size
Oh the ways to analyze resource groups... this checks for compatible
upgrades:

Perform upgrade checks on resource grups only -- this doesn't validate that the contents are correct.
* Acquire all relevant pieces of metadata
* Verify that there are no duplicate attributes, though this should be handled upstream, so this becomes more an invariant, duplicate check.
* Ensure that each member has a membership and it does not change
* Ensure that each group has a scope and that it does not become more restrictive
This adds a clean path to better support around ResourceGroups on the
storage side without having to retrofit the containers.
From a security persepective, it means one does not have to reason about
the possibility of data existing in a slot prior to resource groups.
The original approach separated upgrade compatibility from resource
group intrinsics. This resulted in evaluating modules twice.
Interestingly enough, this didn't reduce the code much.
namely this reproduces the checks done in the compiler
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

The existing interface here is a bit of a mess:
* We have an adapter that pretends we can have different types of VMs
* We only really have Move and everything in this codebase knows it
* There are typed parameters everywhere, so it isn't possible to
  seamlessly use trait objects, so everything needs to be explicit
* sessions expect references, references cannot be easily moved after
  kicking off a session, so the SessionExt cannot house the resolver /
  storage
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

Copy link
Copy Markdown
Contributor

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> dac888e04385b98fbb5950890550d4477db337c0

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> dac888e04385b98fbb5950890550d4477db337c0 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 7467 TPS, 5198 ms latency, 6700 ms p99 latency,no expired txns
2. Upgrading first Validator to new version: dac888e04385b98fbb5950890550d4477db337c0
compatibility::simple-validator-upgrade::single-validator-upgrade : 4516 TPS, 9124 ms latency, 12400 ms p99 latency,no expired txns
3. Upgrading rest of first batch to new version: dac888e04385b98fbb5950890550d4477db337c0
compatibility::simple-validator-upgrade::half-validator-upgrade : 4384 TPS, 9089 ms latency, 11600 ms p99 latency,no expired txns
4. upgrading second batch to new version: dac888e04385b98fbb5950890550d4477db337c0
compatibility::simple-validator-upgrade::rest-validator-upgrade : 6065 TPS, 6497 ms latency, 11300 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> dac888e04385b98fbb5950890550d4477db337c0 passed
Test Ok

@github-actions

Copy link
Copy Markdown
Contributor

✅ Forge suite land_blocking success on dac888e04385b98fbb5950890550d4477db337c0

performance benchmark with full nodes : 5931 TPS, 6691 ms latency, 11400 ms p99 latency,no expired txns
Test Ok

@davidiw davidiw merged commit f279d68 into aptos-labs:main Jan 21, 2023
@davidiw davidiw deleted the resource_groups branch January 21, 2023 10:42
Comment thread api/src/context.rs
.collect::<Result<_>>()?;
.collect::<Result<Vec<(StructTag, Vec<u8>)>>>()?;

// We should be able to do an unwrap here, otherwise the above db read would fail.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seem state_view_at_version just never fail.

struct_tag: &StructTag,
) -> Result<Option<Vec<u8>>, VMError>;

fn get_any_resource(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

api only? not for vm, right?

// The use of this implies that we could theoretically call unwrap with no consequences,
// but using unwrap means the code panics if someone can come up with an attack.
let common_error = PartialVMError::new(StatusCode::UNKNOWN_INVARIANT_VIOLATION_ERROR)
.finish(Location::Undefined);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Location is undefined? I thought it would be in StructTag::module

Comment on lines +221 to +224
let data = source_data.insert(struct_tag, data);
if data.is_some() {
return Err(common_error);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be in one line.

match self {
ResourceGroupScope::Global => "global",
ResourceGroupScope::Address => "address",
ResourceGroupScope::Module => "module_",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why underscore?

}
})?;

// Every struct contains the "dummy_field".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants