uint256: optimize WriteToArray and add PutUint256 #190
holiman merged 3 commits into holiman:master
Make WriteToArray32 and WriteToArray20 use PutUint64, as Bytes32 and
Bytes20 already do. This removes all branches and raises the amount of
data moved per load/store to 8 bytes.
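The idea can be sketched as follows. This is a minimal illustration, not the library's code: `limbs`, `writeToArray32`, and `writeToArray20` are hypothetical names, and the sketch assumes the 256-bit value is held as four 64-bit words with the least significant word first.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// limbs is a hypothetical stand-in for a 256-bit integer stored as four
// 64-bit words, least significant word first (an assumption of this sketch).
type limbs [4]uint64

// writeToArray32 writes z into dest as a 32-byte big-endian value.
// There are no loops or branches on the value: four fixed 8-byte stores.
func (z *limbs) writeToArray32(dest *[32]byte) {
	binary.BigEndian.PutUint64(dest[0:8], z[3])
	binary.BigEndian.PutUint64(dest[8:16], z[2])
	binary.BigEndian.PutUint64(dest[16:24], z[1])
	binary.BigEndian.PutUint64(dest[24:32], z[0])
}

// writeToArray20 writes the low 160 bits of z into dest, big-endian:
// one 4-byte store followed by two 8-byte stores.
func (z *limbs) writeToArray20(dest *[20]byte) {
	binary.BigEndian.PutUint32(dest[0:4], uint32(z[2]))
	binary.BigEndian.PutUint64(dest[4:12], z[1])
	binary.BigEndian.PutUint64(dest[12:20], z[0])
}

func main() {
	z := limbs{0x1122334455667788, 0, 0, 0xff00000000000000}
	var out [32]byte
	z.writeToArray32(&out)
	fmt.Printf("%x\n", out)
}
```

Because the destination is a pointer to a fixed-size array, every slice expression has constant bounds, so the compiler can prove them in range at compile time.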
goos: linux
goarch: amd64
pkg: github.com/holiman/uint256
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
                 │    old.txt     │               new.txt               │
                 │     sec/op     │    sec/op     vs base               │
WriteToArray20-8   31.8000n ± 15%   0.7819n ± 17%  -97.54% (p=0.000 n=10)
WriteToArray32-8    57.640n ± 22%    1.050n ± 18%  -98.18% (p=0.000 n=10)
geomean             42.81n          0.9059n        -97.88%
By changing Memory.Set32 in go-ethereum to use the new WriteToArray32, we can
avoid one memory copy, which makes OpMstore faster.
The patch
diff --git a/core/vm/memory.go b/core/vm/memory.go
index 1ddd8d1ea..58d6f7383 100644
--- a/core/vm/memory.go
+++ b/core/vm/memory.go
@@ -73,8 +73,7 @@ func (m *Memory) Set32(offset uint64, val *uint256.Int) {
 		panic("invalid memory: store empty")
 	}
 	// Fill in relevant bits
-	b32 := val.Bytes32()
-	copy(m.store[offset:], b32[:])
+	val.WriteToArray32((*[32]byte)(m.store[offset:]))
 }
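The key step in the patch is the slice-to-array-pointer conversion `(*[32]byte)(m.store[offset:])`, available since Go 1.17, which lets the writer fill the backing memory directly instead of going through a temporary array. A minimal sketch of the two approaches, where `writeWord`, `setViaCopy`, and `setInPlace` are hypothetical helpers and not go-ethereum code:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// writeWord is a hypothetical writer that fills a 32-byte array in place
// with v encoded as a big-endian value.
func writeWord(dest *[32]byte, v uint64) {
	*dest = [32]byte{}
	binary.BigEndian.PutUint64(dest[24:32], v)
}

// setViaCopy mirrors the old shape: build a temporary array, then copy it.
func setViaCopy(store []byte, offset uint64, v uint64) {
	var tmp [32]byte
	writeWord(&tmp, v)
	copy(store[offset:], tmp[:])
}

// setInPlace mirrors the new shape: convert the slice to an array pointer
// and write straight into the backing memory. The conversion panics if
// fewer than 32 bytes remain after offset.
func setInPlace(store []byte, offset uint64, v uint64) {
	writeWord((*[32]byte)(store[offset:]), v)
}

func main() {
	a := make([]byte, 64)
	b := make([]byte, 64)
	setViaCopy(a, 32, 0xdeadbeef)
	setInPlace(b, 32, 0xdeadbeef)
	fmt.Println(string(a) == string(b))
}
```

One behavioral difference to keep in mind: `copy` silently writes fewer bytes when the destination is short, whereas the array-pointer conversion panics, so the caller needs to have validated the length beforehand.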
Benchmark result
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/vm
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
           │   old.txt    │              new.txt               │
           │    sec/op    │   sec/op     vs base               │
OpMstore-8   19.40n ± 27%   14.18n ± 17%  -26.91% (p=0.002 n=10)
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##            master      #190   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            5         5
  Lines         1666      1675    +9
=========================================
+ Hits          1666      1675    +9
Nice work! WDYT?
// Note: The dest slice must be at least 32 bytes large, otherwise this
// method will panic. The method WriteToSlice, which is slower, should be used
// if the destination slice is smaller or of unknown size.
func (z *Int) PutUint256(dest []byte) {
I think we should add
_ = dest[31]
so that there is only one bounds check, at the beginning. Otherwise there will be a bounds check before each PutUint64.
Sure. I couldn't see any tangible benefit in a benchmark, but it doesn't hurt.
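For reference, the suggested bounds-check hint might look like the sketch below. `limbs` and `putUint256` are hypothetical names used only for illustration, and the word order (least significant word first) is an assumption of the sketch.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// limbs is a hypothetical 256-bit integer: four 64-bit words,
// least significant word first.
type limbs [4]uint64

// putUint256 writes z into the first 32 bytes of dest, big-endian.
// It panics if dest is shorter than 32 bytes.
func (z *limbs) putUint256(dest []byte) {
	_ = dest[31] // early bounds check: proves len(dest) >= 32 for the stores below
	binary.BigEndian.PutUint64(dest[0:8], z[3])
	binary.BigEndian.PutUint64(dest[8:16], z[2])
	binary.BigEndian.PutUint64(dest[16:24], z[1])
	binary.BigEndian.PutUint64(dest[24:32], z[0])
}

func main() {
	z := limbs{1, 2, 3, 4}
	buf := make([]byte, 40)
	z.putUint256(buf)
	fmt.Printf("%x\n", buf[:32])
}
```

Whether the compiler actually drops the later checks can be inspected by building with `-gcflags=-d=ssa/check_bce/debug=1`, which reports the bounds checks that remain. As noted in the thread, the effect may not be measurable in a benchmark.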
I just left one comment. Overall, that function looks good to me.
It is interesting how much faster the new/optimized methods are compared with writing to a slice:
…ory (#30868) (#650)
commit ethereum/go-ethereum@5c58612.
Updates geth to use the uint256 v1.3.2, and use faster memory-writer to speed up MSTORE.
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/vm
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
           │   old.txt   │              new.txt              │
           │   sec/op    │   sec/op     vs base              │
OpMstore-8   18.18n ± 8%   12.58n ± 8%  -30.76% (p=0.000 n=10)
Link: holiman/uint256#190
Co-authored-by: Martin HS <martin@swende.se>