fix(storagenode): restore uncommitted logs #492
Merged
hungryjang approved these changes on Jul 27, 2023
ijsong added a commit that referenced this pull request on Aug 7, 2023
🤖 I have created a release *beep* *boop*

---

## [0.15.0](v0.14.1...v0.15.0) (2023-07-31)

### Features

* **admin:** add otelgrpc metric interceptor ([d9ca9aa](d9ca9aa))
* **admin:** add otelgrpc metric interceptor ([#509](#509)) ([db7a1a2](db7a1a2))
* **admin:** speed up fetching cluster metadata ([3e46f62](3e46f62))
* **admin:** speed up fetching cluster metadata ([#480](#480)) ([53a8f19](53a8f19))
* **all:** add common flags for telemetry ([fcacd1a](fcacd1a))
* **all:** add common flags for telemetry ([#494](#494)) ([63355e9](63355e9))
* **benchmark:** share a connection between appenders in a target ([7dc53e9](7dc53e9))
* **benchmark:** share a connection between appenders in a target ([#524](#524)) ([2cd9196](2cd9196))
* **client:** add Clear to the log stream appender manager ([9a89065](9a89065))
* **client:** add Clear to the log stream appender manager ([#514](#514)) ([e5b6a2e](e5b6a2e))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([db39713](db39713))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([#529](#529)) ([015bfa4](015bfa4))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([83b7496](83b7496))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([#530](#530)) ([6e69306](6e69306))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([b181132](b181132))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([#510](#510)) ([9d99520](9d99520))
* **telemetry:** customize bucket size of rpc.server.duration ([a0e5973](a0e5973))
* **telemetry:** customize bucket size of rpc.server.duration ([#511](#511)) ([e41fe1c](e41fe1c))

### Bug Fixes

* **benchmark:** make append duration's precision high ([e3a091d](e3a091d))
* **benchmark:** make append duration's precision high ([#522](#522)) ([815af53](815af53))
* **benchmark:** support graceful stop ([8616d55](8616d55))
* **benchmark:** support graceful stop ([#527](#527)) ([fc4ed81](fc4ed81))
* **metarepos:** add TestMRIgnoreDirtyReport ([fe2a550](fe2a550))
* **metarepos:** allow set commitTick ([bdca20a](bdca20a))
* **metarepos:** ignore invalid report ([e8620de](e8620de))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([04d1052](04d1052))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([#504](#504)) ([5a7a3b0](5a7a3b0))
* **storagenode:** restore uncommitted logs ([267cccc](267cccc)), closes [#490](#490)
* **storagenode:** restore uncommitted logs ([#492](#492)) ([a9832ee](a9832ee)), closes [#490](#490)

### Performance Improvements

* **admin:** use singleflight to handle Admin's RPCs ([c231888](c231888))
* **admin:** use singleflight to handle Admin's RPCs ([#482](#482)) ([1a6a96d](1a6a96d))
* **metarepos:** add a pool for []*mrpb.Report ([fa8c89d](fa8c89d))
* **metarepos:** add a pool for []*mrpb.Report ([#534](#534)) ([16b2181](16b2181))
* **metarepos:** add a pool for *mrpb.RaftEntry ([be9f121](be9f121))
* **metarepos:** add a pool for *mrpb.RaftEntry ([#536](#536)) ([96ab5e2](96ab5e2))
* **metarepos:** add a pool for mrpb.Reports ([59a6a5a](59a6a5a))
* **metarepos:** add a pool for mrpb.Reports ([#533](#533)) ([b227c75](b227c75))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([a775628](a775628))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([#532](#532)) ([1702769](1702769))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([57d8039](57d8039))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([#537](#537)) ([8f6e097](8f6e097))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

What this PR does
Currently, if all replicas of a log stream are restarted simultaneously, the
replicas on the restarted storage nodes can't commit logs that were written
before the restart. The uncommitted logs are still in their storage, but the
replicas can't process the Commit RPC sent from the metadata repository.
This PR fixes the issue. While recovering the log stream context after a
storage node restarts, it restores the uncommitted logs.
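The idea can be illustrated with a minimal Go sketch. This is not the code in this PR: `logEntry`, `replicaContext`, `recoverContext`, and `commit` are hypothetical names invented for illustration, and the sketch assumes that entries carry a contiguous local sequence number (LLSN) and that the last committed LLSN is known at recovery time. It only shows why rebuilding the uncommitted range during recovery lets a later commit succeed.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// All names below are hypothetical and do not mirror the real storagenode types.

// logEntry is a log record found in local storage after a restart.
type logEntry struct {
	LLSN uint64 // local log sequence number, assumed contiguous within a log stream
	Data string
}

// replicaContext is the in-memory state a replica needs to handle a commit.
type replicaContext struct {
	lastCommittedLLSN uint64     // highest LLSN known to be committed
	uncommitted       []logEntry // entries written but not yet committed
}

// recoverContext rebuilds the context from storage. Entries above
// lastCommitted are restored as uncommitted instead of being ignored.
// Restoring stops at the first gap so the restored run stays contiguous.
func recoverContext(stored []logEntry, lastCommitted uint64) *replicaContext {
	sorted := append([]logEntry(nil), stored...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].LLSN < sorted[j].LLSN })

	rc := &replicaContext{lastCommittedLLSN: lastCommitted}
	next := lastCommitted + 1
	for _, e := range sorted {
		if e.LLSN < next {
			continue // already committed (or a duplicate)
		}
		if e.LLSN != next {
			break // gap: later entries cannot be committed in order
		}
		rc.uncommitted = append(rc.uncommitted, e)
		next++
	}
	return rc
}

var errNotEnoughEntries = errors.New("not enough uncommitted entries")

// commit marks the next n uncommitted entries as committed. Without the
// restore step in recoverContext, this would fail after a restart.
func (rc *replicaContext) commit(n int) error {
	if n < 0 || n > len(rc.uncommitted) {
		return errNotEnoughEntries
	}
	rc.lastCommittedLLSN += uint64(n)
	rc.uncommitted = rc.uncommitted[n:]
	return nil
}

func main() {
	stored := []logEntry{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	rc := recoverContext(stored, 2) // entries 3 and 4 were written but not committed
	fmt.Println("restored uncommitted entries:", len(rc.uncommitted))
	fmt.Println("commit error:", rc.commit(2))
	fmt.Println("last committed LLSN:", rc.lastCommittedLLSN)
}
```

In this sketch a replica that skipped the restore step would start with an empty `uncommitted` slice and reject the commit, which corresponds to the situation described above.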
Which issue(s) this PR resolves
Resolves #490