Add checksum and length to CS WAL record#672
Conversation
| {"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114B96164A30A118001620000000001141F6753D22BACA2180B1EADD722434EB28444D91D0114354594CBFC1A7BCA1AD0050ED6AA010023EADA3900010162011A3631363236333634333133303344363436333632363133313330011A3631363236333634333133313344363436333632363133313331011A3631363236333634333133323344363436333632363133313332011A3631363236333634333133333344363436333632363133313333011A3631363236333634333133343344363436333632363133313334011A3631363236333634333133353344363436333632363133313335011A3631363236333634333133363344363436333632363133313336011A3631363236333634333133373344363436333632363133313337011A3631363236333634333133383344363436333632363133313338011A3631363236333634333133393344363436333632363133313339011A3631363236333634333233303344363436333632363133323330011A3631363236333634333233313344363436333632363133323331011A3631363236333634333233323344363436333632363133323332011A3631363236333634333233333344363436333632363133323333011A3631363236333634333233343344363436333632363133323334011A36313632363336","proof":{"aunts":["49F4B71E3D7C457415069E2EA916DB12F67AA8D0","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]} | ||
| {"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":1,"bytes":"34333233353344363436333632363133323335011A3631363236333634333233363344363436333632363133323336011A3631363236333634333233373344363436333632363133323337011A3631363236333634333233383344363436333632363133323338011A3631363236333634333233393344363436333632363133323339011A3631363236333634333333303344363436333632363133333330011A3631363236333634333333313344363436333632363133333331011A3631363236333634333333323344363436333632363133333332011A3631363236333634333333333344363436333632363133333333011A3631363236333634333333343344363436333632363133333334011A3631363236333634333333353344363436333632363133333335011A3631363236333634333333363344363436333632363133333336011A3631363236333634333333373344363436333632363133333337011A3631363236333634333333383344363436333632363133333338011A3631363236333634333333393344363436333632363133333339011A3631363236333634333433303344363436333632363133343330011A3631363236333634333433313344363436333632363133343331011A3631363236333634333433323344363436333632363133343332011A363136323633363433343333334436","proof":{"aunts":["5AD2A9A1A49A1FD6EF83F05FA4588F800B29DEF1","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]} | ||
| {"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":2,"bytes":"3436333632363133343333011A3631363236333634333433343344363436333632363133343334011A3631363236333634333433353344363436333632363133343335011A3631363236333634333433363344363436333632363133343336011A3631363236333634333433373344363436333632363133343337011A3631363236333634333433383344363436333632363133343338011A3631363236333634333433393344363436333632363133343339011A3631363236333634333533303344363436333632363133353330011A3631363236333634333533313344363436333632363133353331011A3631363236333634333533323344363436333632363133353332011A3631363236333634333533333344363436333632363133353333011A3631363236333634333533343344363436333632363133353334011A3631363236333634333533353344363436333632363133353335011A3631363236333634333533363344363436333632363133353336011A3631363236333634333533373344363436333632363133353337011A3631363236333634333533383344363436333632363133353338011A3631363236333634333533393344363436333632363133353339011A3631363236333634333633303344363436333632363133363330011A3631363236333634333633313344363436333632363133","proof":{"aunts":["8B5786C3D871EE37B0F4B2DECAC39E157340DFBE","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]} | ||
| {"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":3,"bytes":"363331011A3631363236333634333633323344363436333632363133363332011A3631363236333634333633333344363436333632363133363333011A3631363236333634333633343344363436333632363133363334011A3631363236333634333633353344363436333632363133363335011A3631363236333634333633363344363436333632363133363336011A3631363236333634333633373344363436333632363133363337011A3631363236333634333633383344363436333632363133363338011A3631363236333634333633393344363436333632363133363339011A3631363236333634333733303344363436333632363133373330011A3631363236333634333733313344363436333632363133373331011A3631363236333634333733323344363436333632363133373332011A3631363236333634333733333344363436333632363133373333011A3631363236333634333733343344363436333632363133373334011A3631363236333634333733353344363436333632363133373335011A3631363236333634333733363344363436333632363133373336011A3631363236333634333733373344363436333632363133373337011A3631363236333634333733383344363436333632363133373338011A3631363236333634333733393344363436333632363133373339011A363136","proof":{"aunts":["56097661A1B2707588100586B3B1C2C8A51057D1","6DE889147DF528EEB5F7422E95DC45900CAFB619","247C721D5CEB90BB1FE389BA74C43DF0955E1647"]}}}],"peer_key":""}]} |
There was a problem hiding this comment.
Still, don't understand why TXs are bigger in old version cause I did not change random.sh params!
There was a problem hiding this comment.
I've tried running the build.sh script both locally on my Mac and inside VirtualBox (Linux)
There was a problem hiding this comment.
Well technically I think you did, at https://github.com/tendermint/tendermint/pull/672/files/e874c021dc21d536302e23f3201aa71bf1584fff#diff-d7715a8a80c6b17d47e8a27ed32d059bL72

And also it makes sense that you have an 8X reduction in size
package main
import (
"log"
"github.com/tendermint/go-wire"
"github.com/tendermint/tendermint/consensus"
)
var data = map[string]string{
"old": `{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114B96164A30A118001620000000001141F6753D22BACA2180B1EADD722434EB28444D91D0114354594CBFC1A7BCA1AD0050ED6AA010023EADA3900010162011A3631363236333634333133303344363436333632363133313330011A3631363236333634333133313344363436333632363133313331011A3631363236333634333133323344363436333632363133313332011A3631363236333634333133333344363436333632363133313333011A3631363236333634333133343344363436333632363133313334011A3631363236333634333133353344363436333632363133313335011A3631363236333634333133363344363436333632363133313336011A3631363236333634333133373344363436333632363133313337011A3631363236333634333133383344363436333632363133313338011A3631363236333634333133393344363436333632363133313339011A3631363236333634333233303344363436333632363133323330011A3631363236333634333233313344363436333632363133323331011A3631363236333634333233323344363436333632363133323332011A3631363236333634333233333344363436333632363133323333011A3631363236333634333233343344363436333632363133323334011A36313632363336","proof":{"aunts":["49F4B71E3D7C457415069E2EA916DB12F67AA8D0","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}`,
"new": `{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114E67C5AEDA6288000000000000001147297262C6CD96190E46846C9A0DE1227E76077CF00010001000000","proof":{"aunts":["C81B94933420221A7AC004A90242D8B1D3E5070D"]}}}`,
}
func main() {
for name, datum := range data {
var err error
part := wire.ReadJSON(&consensus.BlockPartMessage{}, []byte(datum), &err).(*consensus.BlockPartMessage)
if err != nil {
log.Printf("%s: decoding: %v", name, err)
continue
}
log.Printf("%s: len(part.Bytes): %d\n", name, len(part.Part.Bytes))
}
}giving
go run main.go
2017/09/25 17:53:41 old: len(part.Bytes): 512
2017/09/25 17:53:41 new: len(part.Bytes): 64There was a problem hiding this comment.
This setting only changes part_size for small_block2 case. I was talking more about all cases, not only this one.
Codecov Report
@@ Coverage Diff @@
## develop #672 +/- ##
==========================================
Coverage ? 58.19%
==========================================
Files ? 93
Lines ? 9037
Branches ? 0
==========================================
Hits ? 5259
Misses ? 3315
Partials ? 463 |
consensus/replay.go
Outdated
| } | ||
| // check checksum | ||
| innerMsgBytes := wire.JSONBytes(msg.Msg) | ||
| crc32c := crc32.MakeTable(crc32.Castagnoli) |
There was a problem hiding this comment.
Minor nit: if this function is in a hot path, we can declare as a global var crc32c = crc32.MakeTable(crc32.Castagnoli)
consensus/test_data/build.sh
Outdated
| sleep 5 | ||
| killall tendermint | ||
| kill -9 $PID | ||
| { echo ""; echo "[consensus]"; echo "block_part_size = 64"; } >> ~/.tendermint/config.toml |
There was a problem hiding this comment.
echo -e "\n[consensus]\nblock_part_size = 64" >> ~/.tendermint/config.toml
consensus/wal.go
Outdated
| } | ||
| // Write the wal message | ||
| var wmsgBytes = wire.JSONBytes(TimedWALMessage{time.Now(), wmsg}) | ||
| innerMsgBytes := wire.JSONBytes(wmsg) |
There was a problem hiding this comment.
I see similar code duplicated from consensus/replay.go, perhaps we can extract it to a common helper
var crc32CastagnoliTable = crc32.MakeTable(crc32.Castagnoli)
func crc32Castagnoli(msg interface{}) (msgBytes []byte, crc uint32) {
msgBytes = wire.JSONBytes(msg)
crc = crc32.Checksum(msgBytes, crc32CastagnoliTable)
return msgBytes, crc
}There was a problem hiding this comment.
for 2 lines. don't think it's worth doing
odeke-em
left a comment
There was a problem hiding this comment.
Thank you @melekes for the feature. IMO this is a great change that's going to help with fault tolerance and data integrity/verification. I've added some suggestions to it. I'd give it a concrete LGTM but am on OSX and this change trips out on mine since that's an already existent issue with build.sh
consensus/wal.go
Outdated
| } | ||
| // Write the wal message | ||
| var wmsgBytes = wire.JSONBytes(TimedWALMessage{time.Now(), wmsg}) | ||
| innerMsgBytes := wire.JSONBytes(wmsg) |
There was a problem hiding this comment.
I think we probably shouldn't use JSON here. It's kind of convenient for glancing at the wal, but for the sake of speed we should use a binary format. We can have a tool to map the binary lines to json for human readability.
Also, in that setting, I think we want the message size immediately after the checksum, ie. as Kyle put it "One simple approach to writing recoverable WALs is to write a fixed-size checksum, fixed-size length header, then variable-length record."
There was a problem hiding this comment.
By here you mean the whole WAL? or just this intermediate encoding?
There was a problem hiding this comment.
yeh both. certainly dont really see any utility to the outer encoding being json if the inner encoding is binary, so might as well do it all binary. Do you have a different opinion?
There was a problem hiding this comment.
No, it is just that we will lose readability. I guess if we will provide a wal2json-viewer script, which outputs WAL as json, it should be fine.
There was a problem hiding this comment.
doh, all tmlibs/autofile/group code is about strings (separated with "\n") including searching for "#ENDHEIGHT". I am rewriting it right now
There was a problem hiding this comment.
"You can open a Group to keep restrictions on an AutoFile, like
the maximum size of each chunk, and/or the total amount of bytes
stored in the group."
interesting. so now it is possible to have one part of an msg in file X and the rest of it in file Y.
There was a problem hiding this comment.
That doesn't sound good. Are you saying AutoFile won't work with binary format?
There was a problem hiding this comment.
No, I will work, but autofile/group won't. There is no Write([]byte) method to start with. And search functionality will be broken once we start using binary format.
consensus/test_data/build.sh
Outdated
| # http://unix.stackexchange.com/questions/11305/grep-show-all-the-file-up-to-the-match | ||
| sed '/ENDHEIGHT: 1/Q' ~/.tendermint/data/cs.wal/wal > consensus/test_data/empty_block.cswal | ||
| # /q would print up to and including the match, then quit. | ||
| # /q doesn't include the match. |
consensus/test_data/build.sh
Outdated
| # /q would print up to and including the match, then quit. | ||
| # /q doesn't include the match. | ||
| # http://unix.stackexchange.com/questions/11305/grep-show-all-the-file-up-to-the-match | ||
| sed -e "/ENDHEIGHT: 1/Q" ~/.tendermint/data/cs.wal/wal > consensus/test_data/empty_block.cswal |
There was a problem hiding this comment.
it does not work on MacOS without -e
consensus/test_data/build.sh
Outdated
| sleep 5 | ||
| killall tendermint | ||
| kill -9 $PID | ||
| echo -e "\n[consensus]\nblock_part_size = 64" >> ~/.tendermint/config.toml |
There was a problem hiding this comment.
block_part_size is now in the genesis file, not the config, so we have to edit that!
|
I am gonna work on this someday, thanks |
7d424a9 to
71caca6
Compare
3cac368 to
12025c2
Compare
updated test_data/build.sh script
change echo format in build.sh script
12025c2 to
3115c23
Compare
|
TODO:
|
ebuchman
left a comment
There was a problem hiding this comment.
Looking good. A little unsure about the separator but not immediately sure how else to do things.
Also, is scripts/wal2json/wal2json accidentally checked in ?
consensus/replay_test.go
Outdated
|
|
||
| // we write those lines up to (not including) one with the signature | ||
| walFile := writeWAL(strings.Join(split[:nLines], "\n") + "\n") | ||
| bytes := bytes.Join(split[:nLines], walSeparator) |
There was a problem hiding this comment.
lets try not to reuse import names
consensus/test_data/build.sh
Outdated
| #!/usr/bin/env bash | ||
|
|
||
| # XXX: removes tendermint dir | ||
| # TODO: does not work on OSX |
There was a problem hiding this comment.
i think I figured it out - you need to install gnu-sed. eg. brew install gnu-sed --with-default-names ... that should make your sed point to usr/local/bin/sed which is actually gnu-sed.
We should document this better, but eg. https://www.topbug.net/blog/2013/04/14/install-and-use-gnu-command-line-tools-in-mac-os-x/
There was a problem hiding this comment.
Actually, the script does not contain sed anymore, so we can remove this line safely. But good to know anyway!
consensus/test_data/build.sh
Outdated
|
|
||
|
|
||
| # small block 2 (part size = 512) | ||
| # small block 2 (part size = 64) |
| // types and functions for savings consensus messages | ||
|
|
||
| var ( | ||
| walSeparator = []byte{55, 127, 6, 130} // 0x377f0682 - magic number |
There was a problem hiding this comment.
what happens when this sequence shows up in a block ?
There was a problem hiding this comment.
If we're using our old group#Search method or gr.ReadLine, we would definitely get bad results. Imagine for some reason we failed to write a separator "\n" or it's lost due to FS errors, then if we use gr.ReadLine we'll get 2 messages, but only parse one.
{"time":"2017-10-24T08:13:04.564Z","msg":{"height":6,"round":0,"step":"RoundStepNewHeight"}}{"time":"2017-10-24T08:13:04.564Z","msg":{"height":7,"round":0,"step":"RoundStepNewHeight"}}\n
new WAL#SearchForHeight method sequentially reads the WAL file and panics if no separator is found.
what happens when this sequence shows up in a block ?
answering your original question, nothing bad will happen because as I explained above we're no longer splitting the log (using strings.Split or bytes.Split; bytes.Split is only used in tests), but rather read it byte by byte (4bytes CRC -> 4 bytes length -> (length) data -> separator -> 4bytes CRC -> 4 bytes length ...)
| _, err := enc.wr.Write(msg) | ||
|
|
||
| if err == nil { | ||
| // TODO [Anton Kaliaev 23 Oct 2017]: remove separator |
There was a problem hiding this comment.
does this mean we're hoping to do all this without a separator ?
We only need the separator for tests, where we use it to split WAL into pieces (from these pieces we are building temporary WAL). But nobody prevents us from just reading N pieces from WAL directly without splitting it.
yes, sorry. |
48dc80f to
c74a359
Compare
| return nil, err | ||
| } | ||
| if err != nil { | ||
| return nil, fmt.Errorf("not enough bytes for data: %v (want: %d, read: %v)", err, length, n) |
There was a problem hiding this comment.
if err == nil, do we always have n == length ?
And how is this reported err different from EOF here?
There was a problem hiding this comment.
I had an impression that Read always returns err if n < length, but that seems not to be the case When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call.. We probably need to write more tests and revisit implementation of groupReader#Read.
scripts/cutWALUntil/main.go
Outdated
|
|
||
| func main() { | ||
| if len(os.Args) < 4 { | ||
| fmt.Println("3 arguments required: <path-to-wal> height-to-stop <output-wal>") |
| if !found { | ||
| return errors.New(cmn.Fmt("Cannot replay height %d. WAL does not contain #ENDHEIGHT for %d.", csHeight, csHeight-1)) | ||
| } | ||
| defer gr.Close() |
There was a problem hiding this comment.
I think it should be above if !found.
There was a problem hiding this comment.
It seems in all cases where !found, we also have gr == nil. Otherwise, we panic when we try to close gr.Close().
* cmd: Remove snake case deprecation warning (#626) * cmd: Remove snake case deprecation warning Signed-off-by: Thane Thomson <connect@thanethomson.com> * cmd: Add snake_case aliases for commands that do not have Signed-off-by: Thane Thomson <connect@thanethomson.com> --------- Signed-off-by: Thane Thomson <connect@thanethomson.com> (cherry picked from commit c1214b2) # Conflicts: # cmd/cometbft/commands/reindex_event.go # cmd/cometbft/commands/root.go * Resolve conflicts Signed-off-by: Thane Thomson <connect@thanethomson.com> * cmd: Restore short description for reindex-event to conform to v0.34 convention Signed-off-by: Thane Thomson <connect@thanethomson.com> --------- Signed-off-by: Thane Thomson <connect@thanethomson.com> Co-authored-by: Thane Thomson <connect@thanethomson.com>
See #573