feat(test-benchmark): implement opcode count verification#1869
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## forks/amsterdam #1869 +/- ##
==================================================
Coverage ? 86.33%
==================================================
Files ? 538
Lines ? 34557
Branches ? 3222
==================================================
Hits ? 29835
Misses ? 4148
Partials ? 574
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
spencer-tb
left a comment
There was a problem hiding this comment.
Awesome work on implementing this, nice solution to a tricky problem!
Added some questions below. I think we should rebase once #1790 is merged, as there will be some conflicts. :D
4aef34b to
81ee8d7
Compare
| # Handle both Storage objects and plain dicts | ||
| storage_dict = ( | ||
| storage.root if isinstance(storage, Storage) else storage | ||
| ) | ||
| logger.debug( | ||
| f"Deploying storage contract for EOA {eoa} with {len(storage.root)} storage slots" | ||
| f"Deploying storage contract for EOA {eoa} with {len(storage_dict)} storage slots" |
There was a problem hiding this comment.
Solving failing benchmark scenario due to typing issue
There was a problem hiding this comment.
I wonder if instead of manual coercion / handling here, we can make use of pydantic more directly. It wasn't immediately obvious (and I couldn't find) what test was giving issues here but it would be really nice if this worked:
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def fund_eoa(
self,
...
) -> EOA:
...What this does is it will validate all the inputs as the models that they are defined, with the same type coercion that pydantic gives to model instantiation. I wouldn't spend a ton of time on it but it would be nice to use pydantic for all of it's bells and whistles if we can.
Do you know what test was giving issues here?
There was a problem hiding this comment.
I really like this idea, although this would basically touch every single test created, so perhaps for a follow up PR.
What I would fix in this PR though is to make it explicit that this can receive a Dict in that argument.
There was a problem hiding this comment.
This breaks the test_ext_account_query_warm benchmark, its behavior is different under fill and execute remote. Now it passes during fill but fails when using execute remote, so now our CI cannot catch the bug!
fselmo
left a comment
There was a problem hiding this comment.
From an implementation standpoint, this looks good on my end. I tested the tolerance and the counts, CALL vs STATICCALL, looks like it works as intended. The review looks good from an outsider's perspective, I just left some things to think about and some things for me to understand a bit better as I'm not so used to this flow yet.
I will take a look again tomorrow with fresh eyes. Really nice work 👌🏼
| # Handle both Storage objects and plain dicts | ||
| storage_dict = ( | ||
| storage.root if isinstance(storage, Storage) else storage | ||
| ) | ||
| logger.debug( | ||
| f"Deploying storage contract for EOA {eoa} with {len(storage.root)} storage slots" | ||
| f"Deploying storage contract for EOA {eoa} with {len(storage_dict)} storage slots" |
There was a problem hiding this comment.
I wonder if instead of manual coercion / handling here, we can make use of pydantic more directly. It wasn't immediately obvious (and I couldn't find) what test was giving issues here but it would be really nice if this worked:
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def fund_eoa(
self,
...
) -> EOA:
...What this does is it will validate all the inputs as the models that they are defined, with the same type coercion that pydantic gives to model instantiation. I wouldn't spend a ton of time on it but it would be nice to use pydantic for all of it's bells and whistles if we can.
Do you know what test was giving issues here?
adaeee9 to
caa15e9
Compare
|
Rebased to latest |
Co-authored-by: felipe <fselmo2@gmail.com>
|
Created a PR on top of this to use pydantic: LouisTsai-Csie#1 Other than that it looks good to me! @LouisTsai-Csie |
Opcode count verification
danceratopz
left a comment
There was a problem hiding this comment.
This is looking good. A few very minor nits below!
danceratopz
left a comment
There was a problem hiding this comment.
Thanks @LouisTsai-Csie! One more minor nit!
danceratopz
left a comment
There was a problem hiding this comment.
Thanks @LouisTsai-Csie! LGTM!
🗒️ Description
There are several enhancements in this PR:
Implement opcode-count verification for
--fixed-opcode-countmode using the opcount field provided when generating tests withevmonewith 5% deviation.Re-label the worst-case scenarios for the repricing marker based on the Grafana dashboard data and this script that sort the cases based on the MGas result.
tests/benchmark/compute/instruction/are udpated.Based on our discussion about the verification mechanism during sync up call, specifically whether to remove the
target_opcodefield. I would suggest not removing it for now, since this feature depends on it, which is very helpful for me to show coverage to others, without manually update the Notion page.🔗 Related Issues or PRs
Issue #1835
✅ Checklist
toxchecks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:uvx tox -e statictype(scope):.mkdocs servelocally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.@ported_frommarker.Cute Animal Picture