-
Notifications
You must be signed in to change notification settings - Fork 123
Fixes duplicate contractmetav0 custom sections when users add meta on build #2095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
leighmcculloch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change isn't quite what I meant in #1721, but it's better!
The issue was reporting that:
the cli is writing a whole new section per entry, but it should be joining the new entries together and writing them in one new entry
The issue was that when appending N new meta after build, the file would contain N new custom sections, instead of 1 new custom section. The intent was to still append a new custom section without modifying any of the existing custom sections, but to add all the meta entries into 1 new custom section.
The change in this PR is a bit different because it is fully decoding the file and rewriting the entire file. I had avoided rewriting the file in the past out of a fear we could introduce a problem with the contract by fiddling with the existing sections. But we were likely going to end up needing to rewrite files at some point anyway someday, and this change looks good and straight forward.
Rewriting has the outcome that the resulting file should be a tiny bit smaller than if we kept with appending a 2nd section.
I like that we've shifting from wasm-gen to wasm-encoder, because the former has not had updates for years and the latter is maintained by the Bytecode Alliance.
Few issues inline.
| } | ||
| other => { | ||
| // Reconstruct raw section bytes and add them to the new module | ||
| if let Some((id, range)) = other.as_section() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs for .as_section() say that:
Not all payloads refer to entire sections, such as the
VersionandCodeSectionEntryvariants. These variants will returnNonefrom this function.
Does that mean when rewriting the wasm file in this matter, where we are only copying over entire sections, that we're discarding some information, or that information is being replaced by the wasm-encoder? This might be fine, I'm just wondering if we're going to get bitten by that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! From what I understand, it looks like CodeSectionEntry sections are kind of "subsections", and are nested under a top level section as a stream. Since we are included the full content of a section into a RawSection that information should be included.
And then I think that for the Version section that is something that is automatically generated by encoder on finish().
This info is mostly from chatgpt, so i'm taking it with a grain of salt, and will keep doing some more research to make sure its correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to confirm that the Version is added to a new module automatically by wasm_encoder, however it seems to be added in the new fn: https://github.com/bytecodealliance/wasm-tools/blob/c2fc211e3259292ea91afdabacd93b1e1c7470fb/crates/wasm-encoder/src/core.rs#L137
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To verify the CodeSectionEntry piece, from the doc comment:
/// An entry of the code section, a function, was parsed from a WebAssembly
/// module.
///
/// This entry indicates that a function was successfully received from the
/// code section, and the payload here is the window into the original input
/// where the function resides. Note that the function itself has not been
/// parsed, it's only been outlined. You'll need to process the
/// FunctionBody provided to test whether it parses and/or is valid.
I also printed out the "other" payloads, and this seems to match my chatgpt-aided understanding
===========> other: CodeSectionStart { count: 3, range: 135..390, size: 254 }
===========> other: CodeSectionEntry("...")
===========> other: CodeSectionEntry("...")
===========> other: CodeSectionEntry("...")
===========> other: CustomSection(CustomSectionReader { name: "contractspecv0", data_offset: 407, data: "...", range: 392..467 })
===========> other: CustomSection(CustomSectionReader { name: "contractenvmetav0", data_offset: 487, data: "...", range: 469..499 })
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@graydon do you have thoughts about how safe you think it is for the stellar-cli to reconstruct teh entire wasm file using wasm-encoder?
Today the stellar-cli only appends new custom sections, and so doesn't rewrite or touch anything already existing in a file. This would change that behaviour to rewriting the entire file and rebuilding it using the wasm-encoder crate.
For this specific change we could avoid rewriting the whole wasm file by still just appending the meta into a single new contractmetav0 custom section as I described in #2095 (review), but we have other open issues that'd require rewriting the file anyway so if we avoid doing it here we probably aren't avoiding doing it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried downgrading to rust version 1.75.0 but was seeing other errors when trying to build the stellar cli because it doesn't support a rust version that old, nor do a lot of its dependencies.
error: failed to parse lock file at: /Users/ebethme/Desktop/projects/stellar/stellar-cli/Cargo.lock
Caused by:
lock file version `4` was found, but this version of Cargo does not understand this lock file, perhaps Cargo needs to be updated?
❯ cargo build
error: package `soroban-test v23.0.1 (/Users/ebethme/Desktop/projects/stellar/stellar-cli/cmd/crates/soroban-test)` cannot be built because it requires rustc 1.89.0 or newer, while the currently active rustc version is 1.75.0
I suppose I could try to run the underlying cargo rustc command, or downgrade the stellar cli version... but I'm not sure how to then compare the wasm it generates with wasm with the new meta logic. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@leighmcculloch I don't feel especially strongly. I mean, I guess I am worried/skeptical but not about any specific failure mode. I would say:
- wasm itself (really any object file format) will have a bunch of not-obvious-from-the-APIs invariants and relationships that have to be upheld for the resulting file to be correct as per the spec.
- if you happen to generate something incorrect, it might or might not be caught by tools like the validator/parser. being caught and rejected is the good case! it's also possible we'll just wind up with some incorrect wasms on chain forever that "happen to work" for a while, until in some other utterly inscrutable context (eg. after a bugfix to wasmparser) they don't.
- the same is also true of the soroban env host/guest interface. a bunch of rules and assumptions, some baked in to different levels of the tooling and SDK, probably a few of them might break if you do this, and probably not all such breakage will be immediately caught, might linger on chain.
- the same is also true of the internal invariants and assumptions of the rust language. you might literally break some part of rust's semantics by doing this. and that's even less likely to be caught by anyone.
- the same is also true of the internal invariants and assumptions of a given contract (eg. accounting/authorization/storage/etc. logic), that nothing other than their own unit tests will notice you breaking. it's fully possible to miscompile (or mis-edit) user code and make it parse, validate, run, and just do something the user didn't want when it runs.
given the number of layers at which something can go wrong, I would recommend against perturbing much of the stack if you can avoid it. but if you can't avoid it, I mean, there's nothing inherently wrong with doing this, in the sense that "everything will definitely break". everything might be ok!
the soroban env in any case has no idea which PL it's running much less which toolchain produced or post-processed it, nor the accounting semantics of the guest code or whatever.
but there is a ton of stuff that might break, so I'd aim to change as little as possible. everything is delicate and only as good as its tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I share the same concerns. @elizabethengelman @fnando I think we should to revert some of this change, so that it addresses just the original issue #1721, by returning to only appending a custom section, but just one instead of multiple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we end up needing to do this again for another feature where we absolutely need to rewrite the file, we can use this PR as a template for how to do it, because I think for how to do it this PR did a good job of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for that synopsis @graydon - that is super helpful info.
I agree that reverting some of this change, and just appending to the custom section sounds like the way to go. I'll plan to either clean this PR up, or just close it and open a new one shortly
this still creates a new custom section with all the meta, when i want to replace the existing one
don't include the length at the front of the vec
Co-authored-by: Leigh <351529+leighmcculloch@users.noreply.github.com>
307f4d8 to
9661b38
Compare
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
@leighmcculloch @graydon I'm closing this in favor of a PR with the more simple approach here: #2229 |
What
Closes #1721
Updates the logic to add new contract meta data to the contract wasm on build so that there is only one
contractmetav0custom section.Here's a diff of the before (on the left) vs after with this branch (on the right): https://www.diffchecker.com/fshP64p8/
These are the commands that were run to get the outputs for the diff:
Output from uploading and deploying the resulting wasm:
Why
This will make it so the wasm isn't larger than it needs to be with duplicate
contractmetav0sections. This hopefully will also make it easier for the data team to fetch contract data.Known limitations
This does reencode the whole wasm file, instead of just appending a new section. Given the tooling, this was the only way I could figure out how to remove the default
contractmetav0and include just the new, updated section with the user's meta.