Skip to content

Store: batching writes to the block and state stores#1152

Closed
werty144 wants to merge 16 commits intomainfrom
anton/issue1040_db_batching
Closed

Store: batching writes to the block and state stores#1152
werty144 wants to merge 16 commits intomainfrom
anton/issue1040_db_batching

Conversation

@werty144
Copy link
Contributor

@werty144 werty144 commented Jul 21, 2023

Closes #1040


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments

@werty144 werty144 self-assigned this Jul 21, 2023
@werty144 werty144 marked this pull request as ready for review July 24, 2023 09:40
@werty144 werty144 requested a review from a team as a code owner July 24, 2023 09:40
@werty144 werty144 requested a review from a team July 24, 2023 09:40
Copy link
Contributor

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Anton!

@thanethomson should we use main as the base branch for this PR? Unclear if there's a feature branch with storage improvements or not.

@werty144 Shall we also add a table with performance results so we can understand the impact of this batching mechanism?

state/store.go Outdated
}

err = store.db.Set(calcValidatorsKey(height), bz)
//err = store.db.Set(calcValidatorsKey(height), bz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete this comment ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

state/store.go Outdated
}

err = store.db.Set(calcConsensusParamsKey(nextHeight), bz)
//err = store.db.Set(calcConsensusParamsKey(nextHeight), bz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe delete this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@werty144
Copy link
Contributor Author

I've also performed some measurements using state/state_test.go/BenchmarkStateSave function.
Here are the results: https://docs.google.com/spreadsheets/d/1aZDcb4F6_ZCAqFm8E-JpBT5OwOZZPi-0gDzgsR5tmjg/edit?usp=sharing

  • Batching gives a minor improvement for CLevel, Rocks, Badger and Bolt DBs.
  • Batching is actually worse with GoLevelDB.
  • Interestingly BoltDB is ~2 times worse than others

@sergio-mena
Copy link
Collaborator

I quickly went through this and looks pretty good to me. I would say we can merge this to main with very little risk, unless the storage squad has decided on a different policy regarding feature branches.

Comment on lines +398 to +403
defer func(batch dbm.Batch) {
err := batch.Close()
if err != nil {
panic(err)
}
}(batch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would extract this as a function:

func closeBatchOrPanic(batch dbm.Batch) {
	if err := batch.Close(); err != nil {
		panic(err)
	}
}

Then you could simply:

Suggested change
defer func(batch dbm.Batch) {
err := batch.Close()
if err != nil {
panic(err)
}
}(batch)
defer closeBatchOrPanic(batch)

Makes the enclosing function a little easier to read. Similarly for this same type of pattern in other packages.

@thanethomson
Copy link
Contributor

I quickly went through this and looks pretty good to me. I would say we can merge this to main with very little risk, unless the storage squad has decided on a different policy regarding feature branches.

If the performance worsens with GoLevelDB (the most popular underlying DB for Comet atm, afaict) then I'm not sure we'd want to merge this.

I still find it a very useful exercise to have the data. Might be worthwhile to transcribe the sheet into a table that can be added into a comment on this PR for posterity.

@sergio-mena
Copy link
Collaborator

If the performance worsens with GoLevelDB (the most popular underlying DB for Comet atm, afaict) then I'm not sure we'd want to merge this.

I see, I have to believe the measurements as that's all we have, so yes, it's probably better to wait.

However, I'm surprised that simply batching DB transactions incurs an extra performance cost, as batching is done precisely to have to sync less often. Also, remember that the IG team had implemented batching in v0.35.x/v0.36.x although I don't know if they measured performance when they did it.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale For use by stalebot label Aug 15, 2023
@github-actions github-actions bot closed this Aug 20, 2023
@p0mvn
Copy link

p0mvn commented Aug 31, 2023

This might be related to why goleveldb is having issues: cosmos/iavl#619

TLDR; There is an optimal batch size. Having the batch be too large causes garbage collector overhead since GoLevelDB simply has a bytes slice where it stores data until commit.

@jmalicevic jmalicevic added wip Work in progress and removed stale For use by stalebot labels Sep 7, 2023
@jmalicevic
Copy link
Collaborator

Reopening this because we are still to decide whether to go with this after we benchmark it further.

@jmalicevic jmalicevic reopened this Sep 7, 2023
@jmalicevic jmalicevic added the P:storage-optimization Priority: Give operators greater control over storage and storage optimization label Sep 7, 2023
@melekes melekes mentioned this pull request Dec 4, 2023
3 tasks
@melekes
Copy link
Collaborator

melekes commented Dec 4, 2023

Superseded by #1735

@melekes melekes closed this Dec 4, 2023
@melekes melekes deleted the anton/issue1040_db_batching branch December 5, 2023 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P:storage-optimization Priority: Give operators greater control over storage and storage optimization storage wip Work in progress

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

Batch commits to the state and block store

8 participants