add RemoveCorruptedShardDataCommand#32281
Conversation
|
Pinging @elastic/es-distributed |
|
Such a tool would be very helpful when no sane shards are not available anymore as it is often better to lose a single segment than an entire shard! I left some comment on #31389 about the API. |
5828119 to
843f977
Compare
183b3da to
4116084
Compare
There was a problem hiding this comment.
I just had a chat with @vladimirdolzhenko and dropping the whole index is probably the safest option at this point when CheckIndex.Status.missing is true
There was a problem hiding this comment.
I added a link to LUCENE-6762 to the comment
4116084 to
2f116d8
Compare
9b6aece to
8e0891f
Compare
…elasticsearch-shard
8e0891f to
a8f1488
Compare
|
@lcawl could you please have a look into documentation part change ? |
|
I've added a "Command-line tools" section in the Index Modules page, rather than including that link under the settings. @debadair will also provide some feedback. |
|
@DaveCTurner I've addressed your comments, could you please have another look ? |
DaveCTurner
left a comment
There was a problem hiding this comment.
Minor stuff now, and I asked for another reviewer.
| ] | ||
| } | ||
|
|
||
| You must accept the possibility of data loss changing parameter `accept_data_loss` to `true`. |
There was a problem hiding this comment.
nits, sorry:
You must accept the possibility of data loss by changing the parameter
accept_data_losstotrue.
| for (int possibleLockId = fromNodeId; possibleLockId < toNodeId; possibleLockId++) { | ||
| final NodeEnvironment.NodeLock nodeLock; | ||
| try { | ||
| nodeLock = new NodeEnvironment.NodeLock(possibleLockId, logger, environment, p -> {}); |
There was a problem hiding this comment.
Could this just be try (Releasable nodeLock = new NodeEnvironment.NodeLock(possibleLockId, logger, environment, p -> {})) {... with the catch at the bottom of this method?
Also, if the directory doesn't exist, I think that FSDirectory.open() creates it, which is not what we want here, so the consumer passed to NodeLock's constructor needs to check for this and bail out sooner. See the protected FSDirectory(Path path, LockFactory lockFactory) constructor for instance. I think this warrants a test that we don't accidentally create a bunch of directories if we can't find the index we were looking for.
| private final Lock[] locks; | ||
| private final NodePath[] nodePaths; | ||
|
|
||
| public NodeLock(final int nodeId, final Logger logger, |
There was a problem hiding this comment.
This looks good to me but I would like another pair of eyes, as it's quite important not to break this. @ywelsch WDYT?
ywelsch
left a comment
There was a problem hiding this comment.
The NodeLock change looks ok to me. I would also like to revive the discussion that was initiated here about removing the max-local-storage "feature".
server/src/main/java/org/elasticsearch/index/shard/RemoveCorruptedShardDataCommand.java
Outdated
Show resolved
Hide resolved
DaveCTurner
left a comment
There was a problem hiding this comment.
LGTM, great work @vladimirdolzhenko
The docs are much improved (and much shorter) compared with the version that was reviewed
|
thank you @DaveCTurner for the valuable comments and review 💯 |
* master: (46 commits) Fixing assertions in integration test (elastic#33833) [CCR] Rename idle_shard_retry_delay to poll_timout in auto follow patterns (elastic#33821) HLRC: Delete ML calendar (elastic#33775) Move DocsStats into Engine (elastic#33835) [Docs] Clarify accessing Date methods in painless (elastic#33560) add elasticsearch-shard tool (elastic#32281) Cut over to unwrap segment reader (elastic#33843) SQL: Fix issue with options for QUERY() and MATCH(). (elastic#33828) Emphasize that filesystem-level backups don't work (elastic#33102) Use the global doc id to generate a random score (elastic#33599) Add minimal sanity checks to custom/scripted similarities. (elastic#33564) Profiler: Don’t profile NEXTDOC for ConstantScoreQuery. (elastic#33196) [CCR] Change FollowIndexAction.Request class to be more user friendly (elastic#33810) SQL: day and month name functions tests locale providers enforcement (elastic#33653) TESTS: Set SO_LINGER = 0 for MockNioTransport (elastic#32560) Test: Relax jarhell gradle test (elastic#33787) [CCR] Fail with a descriptive error if leader index does not exist (elastic#33797) Add ES version 6.4.2 (elastic#33831) MINOR: Remove Some Dead Code in Scripting (elastic#33800) Ensure realtime `_get` and `_termvectors` don't run on the network thread (elastic#33814) ...
Relates elastic#31389 (cherry picked from commit a3e8b83)
add
elasticsearch-shardcommand-line toolThe tool will refuse to run if there are no existing corruption markers (i.e., it will only work on known corrupted shards)
elasticsearch-shard remove-corrupted-segmentstool instead of dropped in dropindex.shard.check_on_startup: fix#32279index.shard.check_on_startup: fixsetting-dry-runoption could be there to provide an overviewRelates #31389