Extend reroute with an option to force assign stale primary shard copies by ywelsch · Pull Request #15708 · elastic/elasticsearch

ywelsch · 2015-12-30T10:16:30Z

Relates to #14739.

nik9000 · 2016-01-04T14:40:02Z

...a/org/elasticsearch/cluster/routing/allocation/command/AllocatePrimaryAllocationCommand.java

Should this use the ObjectParser? I think everything is going that way.

I essentially copy pasted this stuff from the AllocateAllocationCommand. You are right though, I should transform it to use ObjectParser.

bleskes · 2016-01-07T11:10:34Z

@ywelsch and I discussed this yesterday. Here is the gist of the discussion:

Instead of fetching the store data in the command, adapt unassigned info so that the node initializing the shard will expect local data and fail if it's not there.
Remove the ignore allocation decider param and always assign if the user asked for it (in the case of the primary assignment)
Try to fold the new AllocatePrimaryAllocationCommand in the current AllocateAllocationCommand.

Here's a suggest for how the rest layer will look like for that:

POST _cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "index",
        "shard": 0,
        "node": "node",
        "allow_primary": "never|existing_only|force_empty"
      }
    }
  ]

Note the allow_primary option. The default, never refuses to allocate a primary. existing_only allocates it but must reuse an existing copy. force_empty does what true does now - i.e., force the creation of an empty primary. Also the last two should imply ignoring the allocation deciders.

@ywelsch thoughts?

ywelsch · 2016-01-07T17:03:32Z

Before this PR, users only had the possibility to either allocate a replica or an empty primary. The flag "allow_primary" was misleading in that context. I would very much like to get rid of this flag altogether and introduce something new. What about introducing a new "mode" field:

"mode": "replica" // default, only allocates unassigned replica shard (fails if unassigned primary exists)
"mode": "primary" // only allocates if primary unassigned and if data available on node
"mode": "empty_primary" // only allocates if primary unassigned, creates empty shard

"mode": "replica" is equivalent to the old "allowPrimary" : "false"
"mode": "empty_primary" is equivalent to the old "allowPrimary" : "true"

Also, we should check for consistency with the cancel command that currently has an "allowPrimary" flag as well.

ywelsch · 2016-01-08T09:06:56Z

Also, relates to #14352

bleskes · 2016-01-08T09:18:59Z

I think mode is a bit ambiguous as a flag. I do understand the concerns that allow_primary will now do something different than the allow primary in the cancel command. As far as the values. I like "existing_only" better than just "primary" as it better communicates the restriction (we must use local data). Also I like "force_empty" better than "empty_primary" because it communicates the implications better and the severity better. It also better describes what we do today - always create an empty shard, even if there is local data. That last one is of course subject to change (but that's a bigger discussion)

ywelsch · 2016-01-08T12:27:39Z

As far as the values. I like "existing_only" better than just "primary" as it better communicates the restriction (we must use local data). Also I like "force_empty" better than "empty_primary" because it communicates the implications better and the severity better. It also better describes what we do today - always create an empty shard, even if there is local data.

$ grep -o "better" | wc -l
6

sorry ;-)

clintongormley · 2016-01-09T11:12:51Z

We have a stalled issue (#14352) about choosing a better name for allow_primary which properly indicates the fact that you can lose your data. The issue is stalled because it is difficult to come up with the right name.

Instead, what about changing the action name from allocate to:

allocate_replica
allocate_empty_primary
and for this PR: allocate_stale_primary

@s1monw is concerned about exposing these last two actions via the API and would prefer to provide a command line tool (perhaps as a plugin) to do this job instead. The idea is to make it harder for users to make a change that will delete data, and will hopefully force them to read about the consequences of their action.

While I understand this sentiment, I'm not convinced that a command line tool is the way to go here. ES is API driven so this would be quite a change from what we do today, and there are probably many users who don't have command line access to their hosted clusters, and who need to fix a problem quickly without having to wait for support to respond to their tickets.

Instead, I'd propose adding an accept_data_loss flag to these actions which must be set to true for the action to be taken. I think this would be enough to warn the user that they're doing something dangerous.

bleskes · 2016-01-09T11:42:31Z

+1. My only suggestion would be to use 'force_empty_primary' and 'force_stale_primary' as command names, to make it even clearer that (all) shard content will be lost.

On 9 jan. 2016 12:13 PM +0100, Clinton Gormleynotifications@github.com, wrote:

We have a stalled issue (#14352(#14352)) about choosing a better name forallow_primarywhich properly indicates the fact that you can lose your data. The issue is stalled because it is difficult to come up with the right name.

Instead, what about changing the action name fromallocateto:

allocate_replica
allocate_empty_primary
and for this PR:allocate_stale_primary

@s1monw(https://github.com/s1monw)is concerned about exposing these last two actions via the API and would prefer to provide a command line tool (perhaps as a plugin) to do this job instead. The idea is to make it harder for users to make a change that will delete data, and will hopefully force them to read about the consequences of their action.

While I understand this sentiment, I'm not convinced that a command line tool is the way to go here. ES is API driven so this would be quite a change from what we do today, and there are probably many users who don't have command line access to their hosted clusters, and who need to fix a problem quickly without having to wait for support to respond to their tickets.

Instead, I'd propose adding anaccept_data_lossflag to these actions which must be set to true for the action to be taken. I think this would be enough to warn the user that they're doing something dangerous.

—
Reply to this email directly orview it on GitHub(#15708 (comment)).

ywelsch · 2016-01-14T10:09:09Z

@bleskes @clintongormley I pushed a new set of changes reflecting the naming suggested by @clintongormley. This new changeset also contains documentation for the new commands. Two open points remain for me:

I am still not fully convinced about the command naming. I like @clintongormley's suggestion of command names allocate_replica, allocate_empty_primary and allocate_stale_primary. The accept_data_loss flag, however, feels unnatural to me in the rest api and even weirder in the Java API. I am not completely against a safeguard, but would prefer to have it in another form. As an inspiration, we might look at the way it is done for wildcard indices deletion (action.destructive_requires_name).
To keep the commands symmetric, we should also split the cancel command (which currently has an allowPrimary flag) into two commands cancel_replica and cancel_primary.

@bleskes What I don't like about force_empty_primary is that the action verb allocate is lost.

bleskes · 2016-01-18T13:09:41Z

.../org/elasticsearch/cluster/routing/allocation/command/AbstractAllocateAllocationCommand.java

Since this is a major rewrite anyway, can we try to move from custom factories to NamedWriteableRegistry ?

I see now that MoveAllocationCommand is not touched by the PR. I think moving to NamedWriteableRegistry is a good idea, but I'm fine with putting it out of scope for this PR

bleskes · 2016-01-18T13:50:29Z

.../elasticsearch/cluster/routing/allocation/command/AllocateStalePrimaryAllocationCommand.java

see text from other suggestion for empty primary allocation

bleskes · 2016-01-18T14:13:49Z

Thanks @ywelsch . I really like how it turned out. Left some minor comments here and there.

ywelsch · 2016-01-18T21:11:24Z

pushed another round of changes addressing the comments. Please have another look @bleskes. (The NamedWriteableRegistry refactoring will be for another time)

bleskes · 2016-01-19T10:37:20Z

core/src/test/java/org/elasticsearch/cluster/routing/PrimaryAllocationIT.java

I think this can be made simpler by just creating a one node cluster, create an index with no replicas, add a node to the cluster. Shutdown the first node and force assign. Makes sense?

bleskes · 2016-01-19T10:40:44Z

LGTM. Made one suggestion for simplifying a test (and asked a question). Feel free to adopt the suggestion (or not) and push. Thanks for all the iterations, @ywelsch .

Closes elastic#15708

ywelsch · 2016-01-19T11:25:54Z

Thanks for the thorough review @bleskes !

Extend reroute with an option to force assign stale primary shard copies

ywelsch added review v5.0.0-alpha1 labels Dec 30, 2015

ywelsch assigned bleskes Dec 30, 2015

ywelsch mentioned this pull request Dec 30, 2015

Allocate primary shard based on allocation IDs #14739

Closed

7 tasks

nik9000 reviewed Jan 4, 2016
View reviewed changes

bleskes added the discuss label Jan 8, 2016

ywelsch force-pushed the feature/reroute-stale-primary branch 2 times, most recently from 7cb4da8 to 17541c1 Compare January 14, 2016 09:44

bleskes reviewed Jan 18, 2016
View reviewed changes

bleskes reviewed Jan 19, 2016
View reviewed changes

Extend reroute with an option to force assign stale primary shard copies

d5b691b

Closes elastic#15708

ywelsch force-pushed the feature/reroute-stale-primary branch from fe93318 to d5b691b Compare January 19, 2016 11:07

ywelsch removed the discuss label Jan 19, 2016

ywelsch mentioned this pull request Jan 19, 2016

Index can get stuck closed if any of its shards has half or less of its lucene indexes running #3354

Closed

ywelsch pushed a commit that referenced this pull request Jan 19, 2016

Merge pull request #15708 from ywelsch/feature/reroute-stale-primary

0a69f1e

Extend reroute with an option to force assign stale primary shard copies

ywelsch merged commit 0a69f1e into elastic:master Jan 19, 2016

clintongormley added the >enhancement label Jan 20, 2016

clintongormley mentioned this pull request Jan 26, 2016

Rename the allow_primary param of the _reroute API #14352

Closed

lcawl added :Distributed/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Allocation labels Feb 13, 2018

clintongormley added :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Feb 14, 2018

Conversation

ywelsch commented Dec 30, 2015

Uh oh!

nik9000 Jan 4, 2016

Choose a reason for hiding this comment

Uh oh!

ywelsch Jan 4, 2016

Choose a reason for hiding this comment

Uh oh!

bleskes commented Jan 7, 2016

Uh oh!

ywelsch commented Jan 7, 2016

Uh oh!

ywelsch commented Jan 8, 2016

Uh oh!

bleskes commented Jan 8, 2016

Uh oh!

ywelsch commented Jan 8, 2016

Uh oh!

clintongormley commented Jan 9, 2016

Uh oh!

bleskes commented Jan 9, 2016

Uh oh!

ywelsch commented Jan 14, 2016

Uh oh!

bleskes Jan 18, 2016

Choose a reason for hiding this comment

Uh oh!

bleskes Jan 18, 2016

Choose a reason for hiding this comment

Uh oh!

bleskes Jan 18, 2016

Choose a reason for hiding this comment

Uh oh!

bleskes commented Jan 18, 2016

Uh oh!

ywelsch commented Jan 18, 2016

Uh oh!

bleskes Jan 19, 2016

Choose a reason for hiding this comment

Uh oh!

ywelsch Jan 19, 2016

Choose a reason for hiding this comment

Uh oh!

bleskes commented Jan 19, 2016

Uh oh!

ywelsch commented Jan 19, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants