Update EIP-7907: Reduce code limit, increase cost per word, fix EXTCODESIZE issue #9910
|
✅ All reviewers have approved. |
| ## Rationale | ||
|
|
||
| The gas cost of 2 per word was chosen in-line with [EIP-3860](./eip-3860.md). This accounts for: | ||
| The gas cost of 4 per word was chosen in line with the per-word cost implied by [EIP-2929](./eip-2929.md)'s `COLD_ACCOUNT_ACCESS_COST`. The value is derived from the current gas cost per word of code, `ceil(2600 / (24576//32)) = 4`, where `2600` is the current cold account load cost and `24576` is the maximum allowed code size at that price. In general, this accounts for: |
There was a problem hiding this comment.
I agree that we should reuse 2600 in the excess gas calculation. One thought is whether the excess gas should include the cost of reading account from MPT.
The original 2600 gas accounts for:
- cost for reading the account from MPT
- cost for reading the code via codehash (from account)
- jump analysis, memory, etc
When reading a large contract, since the first 2600 gas already covers the cost of reading the account, should the excess gas include the cost of reading the account or not? If it should not, then 2600 per extra 24KB (or 3072 per 24KB using 4 gas per word) may be too conservative. (But I am fine if such a conservative design is intended.)
There was a problem hiding this comment.
I think it is hard to say conclusively. Thus I err on the side of caution until we have empirical data to guide us. At 3072 gas per additional 24KB, it should be pretty close to parity when you consider the overhead of preparing calls to other contracts in something like the diamond standard.
There was a problem hiding this comment.
i think the marginal per-word cost of loading data should be lower than the initial 2600 gas per 24kb (which works out to about 3.38 gas per word). i am under the impression that a more comprehensive gas tuning across all opcodes is being prepared. but it still feels weird to price the marginal cost per word over the cost per-word of the initial 24kb, since it's a sequential load and no additional db access needs to be done. if 2 gas per word is considered too cheap, maybe 3 gas per word is better?
There was a problem hiding this comment.
Let us be cautious, ship it, and we can revisit the constants in the future.
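For reference, the per-word figures traded in this thread can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the 24KB base is `0x6000` bytes and that a word is 32 bytes, and the helper name `cost_per_extra_24kb` is made up for illustration.

```python
import math

COLD_ACCOUNT_ACCESS_COST = 2600  # EIP-2929 cold account access cost
BASE_CODE_SIZE = 0x6000          # 24KB, the EIP-170 limit (assumed base)
WORD_SIZE = 32

# Number of 32-byte words in the first 24KB of code.
words_in_base = BASE_CODE_SIZE // WORD_SIZE

# Per-word price implied by charging 2600 gas for the first 24KB.
implied_per_word = COLD_ACCOUNT_ACCESS_COST / words_in_base

# Rounded up to an integer gas price, as in the rationale text.
rounded_per_word = math.ceil(implied_per_word)


def cost_per_extra_24kb(gas_per_word: int) -> int:
    """Gas charged for one additional 24KB of code at a given per-word price."""
    return gas_per_word * words_in_base


if __name__ == "__main__":
    print(f"implied per-word cost: {implied_per_word:.3f}")
    for gas_per_word in (2, 3, 4):
        print(gas_per_word, cost_per_extra_24kb(gas_per_word))
```

Under these assumptions, 4 gas per word makes each extra 24KB cost more than the initial 2600, while 3 gas per word would make it cost less, which is the trade-off being discussed.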
| 2. Change the gas schedule for opcodes which load code. Specifically, the opcodes `CALL`, `STATICCALL`, `DELEGATECALL`, `CALLCODE` and `EXTCODECOPY` are modified so that `largeContractCost = ceil32(excess_contract_size) * GAS_INIT_CODE_WORD_COST // 32` gas is added to the access cost if the code is cold, where `excess_contract_size = max(0, contract_size - 0x6000)`, and `GAS_INIT_CODE_WORD_COST = 2`. (Cf. initcode metering: [EELS](https://github.com/ethereum/execution-specs/blob/1a587803e3e698407d204888b02342393f8b4fe5/src/ethereum/cancun/vm/gas.py#L269)). This introduces a new warm state for contract code - warm if the code has been loaded, cold if not. | ||
| 1. Update the [EIP-170](./eip-170.md) contract code size limit of 24KB (`0x6000` bytes) to 48KB (`0xc000` bytes). | ||
| 2. Change the gas schedule for opcodes which load code. Specifically, the opcodes `CALL`, `STATICCALL`, `DELEGATECALL`, `CALLCODE` and `EXTCODECOPY` are modified so that `largeContractCost = ceil32(excess_contract_size) * GAS_CODE_LOAD_WORD_COST // 32` gas is added to the access cost if the code is cold, where `excess_contract_size = max(0, contract_size - 0x6000)`, and `GAS_CODE_LOAD_WORD_COST = 4`. (Cf. initcode metering: [EELS](https://github.com/ethereum/execution-specs/blob/1a587803e3e698407d204888b02342393f8b4fe5/src/ethereum/cancun/vm/gas.py#L269)). This introduces a new warm state for contract code - warm if the code has been loaded, cold if not. | ||
| 3. The cost for `EXTCODESIZE` is updated to acknowledge the potential for two database reads: once for the code hash and once for the code size associated with the code hash. |
There was a problem hiding this comment.
How would EXTCODESIZE charge exactly? I may miss the info somewhere.
There was a problem hiding this comment.
The `EXTCODESIZE` change that was discussed in Discord is addressed here: https://github.com/lightclient/EIPs/pull/11/files
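For readers following the diff, the cold-code surcharge from item 2 can be sketched as below. This is a minimal illustration of the formula as written in the diff, not a client implementation; it assumes `ceil32` rounds a byte count up to a multiple of 32 (as the linked EELS helper does), and the function name `large_contract_cost` is hypothetical.

```python
BASE_CODE_SIZE = 0x6000        # 24KB threshold from the diff
GAS_CODE_LOAD_WORD_COST = 4    # value proposed in this PR


def ceil32(value: int) -> int:
    """Round a byte count up to the next multiple of 32 (assumed semantics)."""
    return (value + 31) // 32 * 32


def large_contract_cost(contract_size: int, code_is_cold: bool) -> int:
    """Extra gas added to the access cost when cold code is loaded."""
    if not code_is_cold:
        return 0  # warm code pays no surcharge
    excess_contract_size = max(0, contract_size - BASE_CODE_SIZE)
    return ceil32(excess_contract_size) * GAS_CODE_LOAD_WORD_COST // 32
```

With these values a contract at or below 24KB pays no surcharge, and a contract at the proposed 48KB limit pays 3072 extra gas on a cold load.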
|
Hey there, I'm Derek from Offchain Labs. Speaking on behalf of Arbitrum teams and developers, we'd very much like to see the original 256 KiB max contract size code rather than the proposed reduction to 48 KiB here in this PR. Devnet-02 will have the original specification and I'd be curious to know what numbers/metrics we're looking for to make the call for reducing the max contract size from 256 KiB to 48 KiB (or other value). Thanks! |
jochem-brouwer
left a comment
There was a problem hiding this comment.
Great PR. It seems upon a first read that we now have 3 categories of "warm";
- Cold/warm account
- Cold/warm extcodesize
- Cold/warm code
Is this correct? When do these get warm? What about current account warming (coinbase, precompiles, sender account, target account, 7702-delegated account (if targeted by tx))? I think this should be specified to self-contain the EIP.
| 1. Update the [EIP-170](./eip-170.md) contract code size limit of 24KB (`0x6000` bytes) to 48KB (`0xc000` bytes). | ||
| 2. Change the gas schedule for opcodes which load code. Specifically, the opcodes `CALL`, `STATICCALL`, `DELEGATECALL`, `CALLCODE` and `EXTCODECOPY` are modified so that `largeContractCost = ceil32(excess_contract_size) * GAS_CODE_LOAD_WORD_COST // 32` gas is added to the access cost if the code is cold, where `excess_contract_size = max(0, contract_size - 0x6000)`, and `GAS_CODE_LOAD_WORD_COST = 4`. (Cf. initcode metering: [EELS](https://github.com/ethereum/execution-specs/blob/1a587803e3e698407d204888b02342393f8b4fe5/src/ethereum/cancun/vm/gas.py#L269)). This introduces a new warm state for contract code - warm if the code has been loaded, cold if not. | ||
| 3. The cost for `EXTCODESIZE` is updated to acknowledge the potential for two database reads: once for the code hash and once for the code size associated with the code hash. | ||
| with the hash. In addition to the current pricing scheme defined under [EIP-2929](./eip-2929.md), the instruction will also be subject to regular storage pricing, e.g. `COLD_SLOAD_COST` and `WARM_SLOAD_COST`. |
There was a problem hiding this comment.
Does querying EXTCODESIZE make the account "code-warm", or is this another category? (warm account, warm code, warm extcodesize (?))
There was a problem hiding this comment.
Yes it would make the account warm and code size for the account warm, but leave the code cold.
There was a problem hiding this comment.
After the Discord discussion, I have made a PR that changes this slightly: https://github.com/lightclient/EIPs/pull/11/files
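To make the three "warm" categories from this exchange concrete, here is a rough model of the behaviour described in the reply above (before the follow-up PR adjusts it). The class and field names are invented for illustration, and treating a code load as also warming the code size is an assumption, not something stated in the thread.

```python
from dataclasses import dataclass, field

# Opcodes listed in the diff as loading contract code.
CODE_LOADING_OPCODES = {"CALL", "STATICCALL", "DELEGATECALL", "CALLCODE", "EXTCODECOPY"}


@dataclass
class AccessState:
    """Hypothetical per-transaction tracking of the three warm categories."""
    warm_accounts: set = field(default_factory=set)
    warm_code_sizes: set = field(default_factory=set)
    warm_code: set = field(default_factory=set)

    def touch(self, opcode: str, address: str) -> None:
        # Any of these opcodes warms the account itself (EIP-2929 behaviour).
        self.warm_accounts.add(address)
        if opcode == "EXTCODESIZE":
            # Per the reply: account and code size become warm, code stays cold.
            self.warm_code_sizes.add(address)
        elif opcode in CODE_LOADING_OPCODES:
            # Assumption: loading the code also makes its size known.
            self.warm_code_sizes.add(address)
            self.warm_code.add(address)
```

For example, after `touch("EXTCODESIZE", addr)` the address is in `warm_accounts` and `warm_code_sizes` but not in `warm_code`; a later `touch("CALL", addr)` adds it to `warm_code`.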
|
Can further specifications be added to the EIP:
|
|
Hi @lightclient, would you mind summarizing the concerns / blockers that caused the decision to reduce the contract size limit increase to 48KB bytecode / 96KB initcode? As an outsider I have a hard time following from ACDE + chat logs what the exact blockers are and whether there is still a pathway to 256KB after opcode repricing / gas limit increase. Thanks! |
Echo this point by @jochem-brouwer. Current tests expect the code to be warm for the |
i think that |
i think the implementation is cleaner if there is no interaction between the revert journal and the warm code list (i.e., reverts do not clean the warm code list). but if it's better to keep consistency with 2929, that's fine - maybe ACDE should vote here or something. |
|
On another topic, not journaling code warming looks okay to do. |
It feels to me like we should charge this, otherwise it won't scale I think. What about charging it at the start of the transaction execution, and failing with OOG before executing any code if the starting gas is not enough to cover this large-contract extra cost? |
True, we should probably introduce the fee. |
Created a PR for this: #9955 |
|
After discussion in Discord, all clients are fine with dropping the codesize warm/cold flag. I have made a PR that makes the change to the EIP: https://github.com/lightclient/EIPs/pull/11/files Otherwise, this PR looks good |
|
The commit 9a54f0d (as a parent of 8c1b1cb) contains errors. |
| | Warm account and code | No change to existing gas schedule. `WARM_STORAGE_READ_COST=100` | Contract created with `CREATE`/`CREATE2`, or `CALL`, `STATICCALL`, `DELEGATECALL`, `CALLCODE` or `EXTCODECOPY` made on the contract, previously in the txn (opcodes that load contract code) | | ||
|
|
||
| `COLD_ACCOUNT_ACCESS_COST` and `WARM_STORAGE_READ_COST` are defined in [EIP-2929](./eip-2929.md#parameters). | ||
| | Cold account and code | Add `COLD_SLOAD_COST=2100`, `EXCESS_CODE_COST`, and `COLD_ACCOUNT_ACCESS_COST=2600` | Contract not in access list nor accessed prior in the txn | |
There was a problem hiding this comment.
I've two questions regarding this new COLD_SLOAD_COST=2100:
- Is there a TL;DR on the rationale behind this?
- This change increases the "breaking surface" compared with the original proposal (pre-PR), no? (i.e., already existing code-access opcodes targeting <=24KiB would have a different gas cost).
There was a problem hiding this comment.
- It is about reading codesize before reading the full code, clients (impl detail) need to do an additional db read for it.
- It does increase the original price for opcodes that cold load code. Those are `CALL`, `STATICCALL`, `DELEGATECALL`, `CALLCODE`, `EXTCODECOPY` and `EXTCODESIZE`. Warm loads stay the same.
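The size of the change for existing contracts can be sketched from the two table rows visible in the diff. This only models the "cold account and code" and "warm account and code" rows; it assumes `EXCESS_CODE_COST` follows the per-word formula from the specification text, and the function names are hypothetical.

```python
COLD_ACCOUNT_ACCESS_COST = 2600  # EIP-2929
COLD_SLOAD_COST = 2100           # EIP-2929, newly added to cold code loads here
WARM_STORAGE_READ_COST = 100     # EIP-2929
BASE_CODE_SIZE = 0x6000
GAS_CODE_LOAD_WORD_COST = 4


def excess_code_cost(contract_size: int) -> int:
    """Assumed definition of EXCESS_CODE_COST: 4 gas per word above 24KB."""
    excess = max(0, contract_size - BASE_CODE_SIZE)
    return (excess + 31) // 32 * GAS_CODE_LOAD_WORD_COST


def access_cost_before(cold: bool) -> int:
    """EIP-2929 access cost, independent of code size."""
    return COLD_ACCOUNT_ACCESS_COST if cold else WARM_STORAGE_READ_COST


def access_cost_proposed(contract_size: int, cold: bool) -> int:
    """Access cost under the two table rows shown in this diff."""
    if not cold:
        return WARM_STORAGE_READ_COST
    return COLD_SLOAD_COST + excess_code_cost(contract_size) + COLD_ACCOUNT_ACCESS_COST
```

Under these assumptions a cold call to an existing contract of 24KB or less goes from 2600 to 4700 gas, which is the "breaking surface" raised in the question, while warm access stays at 100.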
There was a problem hiding this comment.
> - It is about reading codesize before reading the full code, clients need to do an additional db read for it.
Asking a bit more since I want to understand better: why do EL clients need to read the code size first before pulling the bytecode from the DB? In practice, is this done by any EL? Or maybe there's a new "mental model" now of baking in this cost, because if we keep increasing the max size in the future, then maybe EL clients might require the code size to decide on how to pull the bytecode?
> - It does increase the original price for opcodes that load code. Those are CALL, STATICCALL, DELEGATECALL, CALLCODE, EXTCODECOPY and EXTCODESIZE.
Do you know if anybody is planning to do some impact analysis on this? I'm mostly wondering whether many contracts "bake in" gas assumptions about CALL-like opcodes in particular (maybe that isn't normal; I'm mostly asking since I'm not well calibrated on whether this is a fair concern).
There was a problem hiding this comment.
- DDoS risk. Mostly, this:
  > then maybe EL clients might require the code size to decide on how to pull the bytecode?
  We need to check the size to calculate how much gas we need to spend before loading big bytecode.
- This is a long-standing question, not just related to this EIP but to any gas change we make. It is hard to analyze, and I didn't find any good analysis. But if we want to increase the block size or make changes to the EVM, changes in gas are generally inevitable.
There was a problem hiding this comment.
I agree with @jsign that adding COLD_SLOAD_COST = 2100 to existing opcodes (e.g., CALL) constitutes a breaking change. Additionally, I believe this cost overcharges contract size lookups: COLD_SLOAD_COST = 2100 reflects the cost of reading from the storage trie, which requires multiple database lookups, whereas a contract size lookup typically involves just a single database access.
There was a problem hiding this comment.
I view COLD_ACCOUNT_ACCESS_COST = 2600 as composed of 2100 + 500, where 2100 covers the cost of accessing the account from the state trie, and 500 accounts for loading the contract code (i.e., a single DB lookup via the codehash => codebytes mapping).
For contracts larger than 24KB, the 2600 cost implicitly includes the code size lookup. One possible solution is to split large contracts into two parts:
- The first 24KB, stored as codehash => codebytes[0:24KB] + codesize.to_bytes();
- The remaining bytes, stored as codehash => codebytes[24KB:].
When reading a contract, we charge the initial 2600 and load the first 24KB. If the loaded data is ≤ 24KB, we know the contract size <= 24KB and no excess gas is needed. If it's > 24KB, the appended codesize indicates the total size, and we can then charge excess gas accordingly.
There was a problem hiding this comment.
Not relevant for this EIP but I am starting to wonder why cold account cost is 500 more than cold slot cost. The tries are both "secure" tries (all keys are 32 bytes -> max depth thus 64). The account RLP itself needs some decoding and a few extra bytes of loading, but charging 500 for that is a bit excessive 🤔 (in context of EIP 2929)
Relevant: @qizhou I get the idea but you have an identical key (codehash) pointing to different data 😃
There was a problem hiding this comment.
> Not relevant for this EIP but I am starting to wonder why cold account cost is 500 more than cold slot cost. The tries are both "secure" tries (all keys are 32 bytes -> max depth thus 64). The account RLP itself needs some decoding and a few extra bytes of loading, but charging 500 for that is a bit excessive 🤔 (in context of EIP 2929)
> Relevant: @qizhou I get the idea but you have an identical key (`codehash`) pointing to different data 😃
You can differentiate mappings by prefixing the same key differently — a standard practice in key-value databases.
There was a problem hiding this comment.
> Not relevant for this EIP but I am starting to wonder why cold account cost is 500 more than cold slot cost. The tries are both "secure" tries (all keys are 32 bytes -> max depth thus 64). The account RLP itself needs some decoding and a few extra bytes of loading, but charging 500 for that is a bit excessive 🤔 (in context of EIP 2929)
This likely explains why 500 is used to represent the cost of accessing codebytes via the codehash key.
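The split-storage idea from this thread, together with the key-prefix suggestion, might look roughly like the sketch below. Everything here is illustrative: the prefixes, the 4-byte size suffix, the use of a plain `dict` as the database, and SHA-256 as a stand-in for the real code hash are all assumptions, not anything a client is known to do.

```python
import hashlib

BASE_CODE_SIZE = 0x6000   # first chunk holds at most 24KB of code
HEAD_PREFIX = b"code-head:"  # hypothetical key prefix for the first chunk
TAIL_PREFIX = b"code-tail:"  # hypothetical key prefix for the remainder
SIZE_FIELD_BYTES = 4         # hypothetical width of the appended code size


def store_code(db: dict, code: bytes) -> bytes:
    """Store code in one or two entries and return its (stand-in) code hash."""
    codehash = hashlib.sha256(code).digest()  # stand-in for keccak256
    head, tail = code[:BASE_CODE_SIZE], code[BASE_CODE_SIZE:]
    if tail:
        # Large contract: append the total size so the reader can price the tail.
        db[HEAD_PREFIX + codehash] = head + len(code).to_bytes(SIZE_FIELD_BYTES, "big")
        db[TAIL_PREFIX + codehash] = tail
    else:
        db[HEAD_PREFIX + codehash] = head
    return codehash


def load_code(db: dict, codehash: bytes) -> tuple:
    """Return (code, number_of_db_reads)."""
    first = db[HEAD_PREFIX + codehash]
    if len(first) <= BASE_CODE_SIZE:
        return first, 1  # small contract: one read, no excess gas needed
    head = first[:BASE_CODE_SIZE]
    total_size = int.from_bytes(first[BASE_CODE_SIZE:], "big")
    # A client would charge excess gas for (total_size - BASE_CODE_SIZE) here,
    # before performing the second read.
    tail = db[TAIL_PREFIX + codehash]
    return head + tail, 2
```

The point of the layout is that the first read, already paid for by the 2600 gas, reveals whether a second read is needed and how much to charge for it.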
| if n % 32 == 0: | ||
| return n // 32 | ||
| else: | ||
| return n // 32 + 32 |
There was a problem hiding this comment.
| return n // 32 + 32 | |
| return n // 32 + 1 |
There was a problem hiding this comment.
I am confused, the pricing formula also has division by 32 in it. I think something is not right here 🤔
There was a problem hiding this comment.
EIP 3860 calculates cost as INITCODE_WORD_COST * ceil(len(initcode) / 32)
There was a problem hiding this comment.
I think clearest would be (n + 31) // 32
There was a problem hiding this comment.
Can define it as a "macro" ceil32 in the eip
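As a sketch of the variants discussed in this thread, the corrected branching form and the compact `(n + 31) // 32` form give the same word count. The function names are illustrative. Note that a helper named `ceil32` that rounds up to a multiple of 32 bytes (as in EELS) is a different function from one that returns a word count, which may be the source of the confusion about the extra division by 32.

```python
def word_count_branching(n: int) -> int:
    """Number of 32-byte words needed for n bytes (corrected branching form)."""
    if n % 32 == 0:
        return n // 32
    else:
        return n // 32 + 1


def word_count(n: int) -> int:
    """Same result, written as a single ceiling division."""
    return (n + 31) // 32


def ceil32(n: int) -> int:
    """Round n up to the next multiple of 32; still a byte count, not words."""
    return word_count(n) * 32


# The two word-count forms agree, and ceil32(n) // 32 recovers the word count.
assert all(word_count_branching(n) == word_count(n) for n in range(4097))
assert all(ceil32(n) // 32 == word_count(n) for n in range(4097))
```

If the EIP defines the helper as a word count, the pricing formula should multiply by the per-word cost without dividing by 32 again; if it defines it as rounding to a multiple of 32, the `// 32` in the formula is needed.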
|
I have a practical and economic point about something which I just realized. We currently charge In terms of an economic and efficient layout of contracts, I think in the end we will only see <=24 KiB contracts, since we do not want to force callers of the contract to pay this extra cost. We only want to pay for what we use (this feels a bit like code chunking which is obviously done in small chunks like ~32 bytes, but this feels like code chunking now using 24 KiB chunks). There is one exception where I might use this fee: if I am sure that all code of my contract is used, it is thus more expensive to deploy a second contract which I have to invoke anyways (2600 gas). However, the big contract is more expensive (4 gas/word) than invoking the new contract ( So from an economic point of view I think we will mostly see 24 KiB contracts, because this means you pay for what you use. It feels "unfair" to be forced to pay for this "big code" even if you only read part of it 🤔 |
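The comparison in this comment can be made concrete with a small calculation. This is a simplified sketch: it compares only the per-word surcharge against a flat 2600 gas cold access to a second contract, and ignores every other cost of making a call. The names are illustrative.

```python
GAS_CODE_LOAD_WORD_COST = 4      # per-word surcharge proposed in this PR
COLD_ACCOUNT_ACCESS_COST = 2600  # cold access to a separate contract (EIP-2929)
BASE_CODE_SIZE = 0x6000


def excess_cost(extra_bytes: int) -> int:
    """Surcharge for carrying extra_bytes of code above the 24KB base."""
    return (extra_bytes + 31) // 32 * GAS_CODE_LOAD_WORD_COST


# Largest amount of extra code whose surcharge does not exceed one cold access.
break_even_bytes = COLD_ACCOUNT_ACCESS_COST // GAS_CODE_LOAD_WORD_COST * 32

if __name__ == "__main__":
    print("break-even extra bytes:", break_even_bytes)
    print("surcharge for a full extra 24KB:", excess_cost(BASE_CODE_SIZE))
```

On these numbers, keeping up to about 20 KiB of extra code in one contract costs no more than a cold access to a second contract, while a full extra 24 KiB costs 3072 gas, more than the 2600 of a separate cold access.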
9a54f0d to 110719a
eth-bot
left a comment
There was a problem hiding this comment.
All Reviewers Have Approved; Performing Automatic Merge...
|
Sorry didn't mean for ethbot to merge this. Seems like rebasing brought in the fact that I was added as author so it went ahead and merged. |
Updating with a few things we discussed at interop:
- `ceil(2600 / (24576//32)) = 4`
- `EXTCODESIZE`