Skip to content

Add checksum and length to CS WAL record#672

Merged
ebuchman merged 9 commits intodevelopfrom
573-wal-issues
Oct 26, 2017
Merged

Add checksum and length to CS WAL record#672
ebuchman merged 9 commits intodevelopfrom
573-wal-issues

Conversation

@melekes
Copy link
Contributor

@melekes melekes commented Sep 21, 2017

See #573

{"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114B96164A30A118001620000000001141F6753D22BACA2180B1EADD722434EB28444D91D0114354594CBFC1A7BCA1AD0050ED6AA010023EADA3900010162011A3631363236333634333133303344363436333632363133313330011A3631363236333634333133313344363436333632363133313331011A3631363236333634333133323344363436333632363133313332011A3631363236333634333133333344363436333632363133313333011A3631363236333634333133343344363436333632363133313334011A3631363236333634333133353344363436333632363133313335011A3631363236333634333133363344363436333632363133313336011A3631363236333634333133373344363436333632363133313337011A3631363236333634333133383344363436333632363133313338011A3631363236333634333133393344363436333632363133313339011A3631363236333634333233303344363436333632363133323330011A3631363236333634333233313344363436333632363133323331011A3631363236333634333233323344363436333632363133323332011A3631363236333634333233333344363436333632363133323333011A3631363236333634333233343344363436333632363133323334011A36313632363336","proof":{"aunts":["49F4B71E3D7C457415069E2EA916DB12F67AA8D0","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]}
{"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":1,"bytes":"34333233353344363436333632363133323335011A3631363236333634333233363344363436333632363133323336011A3631363236333634333233373344363436333632363133323337011A3631363236333634333233383344363436333632363133323338011A3631363236333634333233393344363436333632363133323339011A3631363236333634333333303344363436333632363133333330011A3631363236333634333333313344363436333632363133333331011A3631363236333634333333323344363436333632363133333332011A3631363236333634333333333344363436333632363133333333011A3631363236333634333333343344363436333632363133333334011A3631363236333634333333353344363436333632363133333335011A3631363236333634333333363344363436333632363133333336011A3631363236333634333333373344363436333632363133333337011A3631363236333634333333383344363436333632363133333338011A3631363236333634333333393344363436333632363133333339011A3631363236333634333433303344363436333632363133343330011A3631363236333634333433313344363436333632363133343331011A3631363236333634333433323344363436333632363133343332011A363136323633363433343333334436","proof":{"aunts":["5AD2A9A1A49A1FD6EF83F05FA4588F800B29DEF1","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]}
{"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":2,"bytes":"3436333632363133343333011A3631363236333634333433343344363436333632363133343334011A3631363236333634333433353344363436333632363133343335011A3631363236333634333433363344363436333632363133343336011A3631363236333634333433373344363436333632363133343337011A3631363236333634333433383344363436333632363133343338011A3631363236333634333433393344363436333632363133343339011A3631363236333634333533303344363436333632363133353330011A3631363236333634333533313344363436333632363133353331011A3631363236333634333533323344363436333632363133353332011A3631363236333634333533333344363436333632363133353333011A3631363236333634333533343344363436333632363133353334011A3631363236333634333533353344363436333632363133353335011A3631363236333634333533363344363436333632363133353336011A3631363236333634333533373344363436333632363133353337011A3631363236333634333533383344363436333632363133353338011A3631363236333634333533393344363436333632363133353339011A3631363236333634333633303344363436333632363133363330011A3631363236333634333633313344363436333632363133","proof":{"aunts":["8B5786C3D871EE37B0F4B2DECAC39E157340DFBE","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}],"peer_key":""}]}
{"time":"2017-04-27T22:23:56.312Z","msg":[2,{"msg":[19,{"Height":1,"Round":0,"Part":{"index":3,"bytes":"363331011A3631363236333634333633323344363436333632363133363332011A3631363236333634333633333344363436333632363133363333011A3631363236333634333633343344363436333632363133363334011A3631363236333634333633353344363436333632363133363335011A3631363236333634333633363344363436333632363133363336011A3631363236333634333633373344363436333632363133363337011A3631363236333634333633383344363436333632363133363338011A3631363236333634333633393344363436333632363133363339011A3631363236333634333733303344363436333632363133373330011A3631363236333634333733313344363436333632363133373331011A3631363236333634333733323344363436333632363133373332011A3631363236333634333733333344363436333632363133373333011A3631363236333634333733343344363436333632363133373334011A3631363236333634333733353344363436333632363133373335011A3631363236333634333733363344363436333632363133373336011A3631363236333634333733373344363436333632363133373337011A3631363236333634333733383344363436333632363133373338011A3631363236333634333733393344363436333632363133373339011A363136","proof":{"aunts":["56097661A1B2707588100586B3B1C2C8A51057D1","6DE889147DF528EEB5F7422E95DC45900CAFB619","247C721D5CEB90BB1FE389BA74C43DF0955E1647"]}}}],"peer_key":""}]}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, don't understand why TXs are bigger in old version cause I did not change random.sh params!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried running the build.sh script both locally on my Mac and inside VirtualBox (Linux)

Copy link
Contributor

@odeke-em odeke-em Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well technically I think you did, at https://github.com/tendermint/tendermint/pull/672/files/e874c021dc21d536302e23f3201aa71bf1584fff#diff-d7715a8a80c6b17d47e8a27ed32d059bL72
screen shot 2017-09-25 at 5 54 47 pm

And also it makes sense that you have an 8X reduction in size

package main

import (
	"log"

	"github.com/tendermint/go-wire"
	"github.com/tendermint/tendermint/consensus"
)

var data = map[string]string{
	"old": `{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114B96164A30A118001620000000001141F6753D22BACA2180B1EADD722434EB28444D91D0114354594CBFC1A7BCA1AD0050ED6AA010023EADA3900010162011A3631363236333634333133303344363436333632363133313330011A3631363236333634333133313344363436333632363133313331011A3631363236333634333133323344363436333632363133313332011A3631363236333634333133333344363436333632363133313333011A3631363236333634333133343344363436333632363133313334011A3631363236333634333133353344363436333632363133313335011A3631363236333634333133363344363436333632363133313336011A3631363236333634333133373344363436333632363133313337011A3631363236333634333133383344363436333632363133313338011A3631363236333634333133393344363436333632363133313339011A3631363236333634333233303344363436333632363133323330011A3631363236333634333233313344363436333632363133323331011A3631363236333634333233323344363436333632363133323332011A3631363236333634333233333344363436333632363133323333011A3631363236333634333233343344363436333632363133323334011A36313632363336","proof":{"aunts":["49F4B71E3D7C457415069E2EA916DB12F67AA8D0","D35A72BEDAAAAC17045D7BFAAFA94C2EC0B0A4C2","705BC647374F3495EE73C3F44C21E9BDB4731738"]}}}`,
	"new": `{"Height":1,"Round":0,"Part":{"index":0,"bytes":"0101010F74656E6465726D696E745F74657374010114E67C5AEDA6288000000000000001147297262C6CD96190E46846C9A0DE1227E76077CF00010001000000","proof":{"aunts":["C81B94933420221A7AC004A90242D8B1D3E5070D"]}}}`,
}

func main() {
	for name, datum := range data {
		var err error
		part := wire.ReadJSON(&consensus.BlockPartMessage{}, []byte(datum), &err).(*consensus.BlockPartMessage)
		if err != nil {
			log.Printf("%s: decoding: %v", name, err)
			continue
		}
		log.Printf("%s: len(part.Bytes): %d\n", name, len(part.Part.Bytes))
	}
}

giving

 go run main.go 
2017/09/25 17:53:41 old: len(part.Bytes): 512
2017/09/25 17:53:41 new: len(part.Bytes): 64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting only changes part_size for small_block2 case. I was talking more about all cases, not only this one.

@codecov-io
Copy link

codecov-io commented Sep 21, 2017

Codecov Report

❗ No coverage uploaded for pull request base (develop@ceedd4d). Click here to learn what that means.
The diff coverage is 58.11%.

@@            Coverage Diff             @@
##             develop     #672   +/-   ##
==========================================
  Coverage           ?   58.19%           
==========================================
  Files              ?       93           
  Lines              ?     9037           
  Branches           ?        0           
==========================================
  Hits               ?     5259           
  Misses             ?     3315           
  Partials           ?      463

@melekes melekes changed the title WIP: WAL issues Add checksum and length to CS WAL record Sep 22, 2017
@melekes melekes mentioned this pull request Sep 23, 2017
}
// check checksum
innerMsgBytes := wire.JSONBytes(msg.Msg)
crc32c := crc32.MakeTable(crc32.Castagnoli)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: if this function is in a hot path, we can declare as a global var crc32c = crc32.MakeTable(crc32.Castagnoli)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice idea

sleep 5
killall tendermint
kill -9 $PID
{ echo ""; echo "[consensus]"; echo "block_part_size = 64"; } >> ~/.tendermint/config.toml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo -e "\n[consensus]\nblock_part_size = 64" >> ~/.tendermint/config.toml

consensus/wal.go Outdated
}
// Write the wal message
var wmsgBytes = wire.JSONBytes(TimedWALMessage{time.Now(), wmsg})
innerMsgBytes := wire.JSONBytes(wmsg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see similar code duplicated from consensus/replay.go, perhaps we can extract it to a common helper

var crc32CastagnoliTable = crc32.MakeTable(crc32.Castagnoli)
func crc32Castagnoli(msg interface{}) (msgBytes []byte, crc uint32) {
   msgBytes = wire.JSONBytes(msg)
   crc = crc32.Checksum(msgBytes, crc32CastagnoliTable)
   return msgBytes, crc
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for 2 lines. don't think it's worth doing

Copy link
Contributor

@odeke-em odeke-em left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @melekes for the feature. IMO this is a great change that's going to help with fault tolerance and data integrity/verification. I've added some suggestions to it. I'd give it a concrete LGTM but am on OSX and this change trips out on mine since that's an already existent issue with build.sh

consensus/wal.go Outdated
}
// Write the wal message
var wmsgBytes = wire.JSONBytes(TimedWALMessage{time.Now(), wmsg})
innerMsgBytes := wire.JSONBytes(wmsg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably shouldn't use JSON here. It's kind of convenient for glancing at the wal, but for the sake of speed we should use a binary format. We can have a tool to map the binary lines to json for human readability.

Also, in that setting, I think we want the message size immediately after the checksum, ie. as Kyle put it "One simple approach to writing recoverable WALs is to write a fixed-size checksum, fixed-size length header, then variable-length record."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By here you mean the whole WAL? or just this intermediate encoding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh both. certainly dont really see any utility to the outer encoding being json if the inner encoding is binary, so might as well do it all binary. Do you have a different opinion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is just that we will lose readability. I guess if we will provide a wal2json-viewer script, which outputs WAL as json, it should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh, all tmlibs/autofile/group code is about strings (separated with "\n") including searching for "#ENDHEIGHT". I am rewriting it right now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"You can open a Group to keep restrictions on an AutoFile, like
the maximum size of each chunk, and/or the total amount of bytes
stored in the group."

interesting. so now it is possible to have one part of an msg in file X and the rest of it in file Y.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't sound good. Are you saying AutoFile won't work with binary format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I will work, but autofile/group won't. There is no Write([]byte) method to start with. And search functionality will be broken once we start using binary format.

# http://unix.stackexchange.com/questions/11305/grep-show-all-the-file-up-to-the-match
sed '/ENDHEIGHT: 1/Q' ~/.tendermint/data/cs.wal/wal > consensus/test_data/empty_block.cswal
# /q would print up to and including the match, then quit.
# /q doesn't include the match.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh ?

# /q would print up to and including the match, then quit.
# /q doesn't include the match.
# http://unix.stackexchange.com/questions/11305/grep-show-all-the-file-up-to-the-match
sed -e "/ENDHEIGHT: 1/Q" ~/.tendermint/data/cs.wal/wal > consensus/test_data/empty_block.cswal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-e ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not work on MacOS without -e

sleep 5
killall tendermint
kill -9 $PID
echo -e "\n[consensus]\nblock_part_size = 64" >> ~/.tendermint/config.toml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block_part_size is now in the genesis file, not the config, so we have to edit that!

Copy link
Contributor

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See last review

@melekes
Copy link
Contributor Author

melekes commented Oct 5, 2017

I am gonna work on this someday, thanks

@melekes melekes self-assigned this Oct 9, 2017
@melekes melekes changed the title Add checksum and length to CS WAL record WIP: Add checksum and length to CS WAL record Oct 17, 2017
@melekes melekes force-pushed the 573-wal-issues branch 2 times, most recently from 3cac368 to 12025c2 Compare October 23, 2017 18:02
updated test_data/build.sh script
change echo format in build.sh script
@melekes
Copy link
Contributor Author

melekes commented Oct 23, 2017

TODO:

  • write docs for cutWALUntil and wal2json

Copy link
Contributor

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. A little unsure about the separator but not immediately sure how else to do things.

Also, is scripts/wal2json/wal2json accidentally checked in ?


// we write those lines up to (not including) one with the signature
walFile := writeWAL(strings.Join(split[:nLines], "\n") + "\n")
bytes := bytes.Join(split[:nLines], walSeparator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets try not to reuse import names

#!/usr/bin/env bash

# XXX: removes tendermint dir
# TODO: does not work on OSX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think I figured it out - you need to install gnu-sed. eg. brew install gnu-sed --with-default-names ... that should make your sed point to usr/local/bin/sed which is actually gnu-sed.

We should document this better, but eg. https://www.topbug.net/blog/2013/04/14/install-and-use-gnu-command-line-tools-in-mac-os-x/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the script does not contain sed anymore, so we can remove this line safely. But good to know anyway!



# small block 2 (part size = 512)
# small block 2 (part size = 64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

512

// types and functions for savings consensus messages

var (
walSeparator = []byte{55, 127, 6, 130} // 0x377f0682 - magic number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when this sequence shows up in a block ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're using our old group#Search method or gr.ReadLine, we would definitely get bad results. Imagine for some reason we failed to write a separator "\n" or it's lost due to FS errors, then if we use gr.ReadLine we'll get 2 messages, but only parse one.

{"time":"2017-10-24T08:13:04.564Z","msg":{"height":6,"round":0,"step":"RoundStepNewHeight"}}{"time":"2017-10-24T08:13:04.564Z","msg":{"height":7,"round":0,"step":"RoundStepNewHeight"}}\n

new WAL#SearchForHeight method sequentially reads the WAL file and panics if no separator is found.

what happens when this sequence shows up in a block ?

answering your original question, nothing bad will happen because as I explained above we're no longer splitting the log (using strings.Split or bytes.Split; bytes.Split is only used in tests), but rather read it byte by byte (4bytes CRC -> 4 bytes length -> (length) data -> separator -> 4bytes CRC -> 4 bytes length ...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep thanks

_, err := enc.wr.Write(msg)

if err == nil {
// TODO [Anton Kaliaev 23 Oct 2017]: remove separator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean we're hoping to do all this without a separator ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@melekes
Copy link
Contributor Author

melekes commented Oct 24, 2017

Looking good. A little unsure about the separator but not immediately sure how else to do things.

We only need the separator for tests, where we use it to split WAL into pieces (from these pieces we are building temporary WAL). But nobody prevents us from just reading N pieces from WAL directly without splitting it.

Also, is scripts/wal2json/wal2json accidentally checked in ?

yes, sorry.

@ebuchman ebuchman changed the title WIP: Add checksum and length to CS WAL record Add checksum and length to CS WAL record Oct 25, 2017
return nil, err
}
if err != nil {
return nil, fmt.Errorf("not enough bytes for data: %v (want: %d, read: %v)", err, length, n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err == nil, do we always have n == length ?

And how is this reported err different from EOF here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an impression that Read always returns err if n < length, but that seems not to be the case When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call.. We probably need to write more tests and revisit implementation of groupReader#Read.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


func main() {
if len(os.Args) < 4 {
fmt.Println("3 arguments required: <path-to-wal> height-to-stop <output-wal>")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

< >

@ebuchman ebuchman merged commit 6af28ea into develop Oct 26, 2017
@ebuchman ebuchman deleted the 573-wal-issues branch October 26, 2017 04:31
if !found {
return errors.New(cmn.Fmt("Cannot replay height %d. WAL does not contain #ENDHEIGHT for %d.", csHeight, csHeight-1))
}
defer gr.Close()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be above if !found.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems in all cases where !found, we also have gr == nil. Otherwise, we panic when we try to close gr.Close().

faddat referenced this pull request in notional-labs/tendermint May 3, 2023
* cmd: Remove snake case deprecation warning (#626)

* cmd: Remove snake case deprecation warning

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* cmd: Add snake_case aliases for commands that do not have

Signed-off-by: Thane Thomson <connect@thanethomson.com>

---------

Signed-off-by: Thane Thomson <connect@thanethomson.com>
(cherry picked from commit c1214b2)

# Conflicts:
#	cmd/cometbft/commands/reindex_event.go
#	cmd/cometbft/commands/root.go

* Resolve conflicts

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* cmd: Restore short description for reindex-event to conform to v0.34 convention

Signed-off-by: Thane Thomson <connect@thanethomson.com>

---------

Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants