Skip to content

Extend reroute with an option to force assign stale primary shard copies#15708

Merged
ywelsch merged 1 commit intoelastic:masterfrom
ywelsch:feature/reroute-stale-primary
Jan 19, 2016
Merged

Extend reroute with an option to force assign stale primary shard copies#15708
ywelsch merged 1 commit intoelastic:masterfrom
ywelsch:feature/reroute-stale-primary

Conversation

@ywelsch
Copy link
Copy Markdown
Contributor

@ywelsch ywelsch commented Dec 30, 2015

Relates to #14739.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use the ObjectParser? I think everything is going that way.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I essentially copy pasted this stuff from the AllocateAllocationCommand. You are right though, I should transform it to use ObjectParser.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Jan 7, 2016

@ywelsch and I discussed this yesterday. Here is the gist of the discussion:

  1. Instead of fetching the store data in the command, adapt unassigned info so that the node initializing the shard will expect local data and fail if it's not there.
  2. Remove the ignore allocation decider param and always assign if the user asked for it (in the case of the primary assignment)
  3. Try to fold the new AllocatePrimaryAllocationCommand in the current AllocateAllocationCommand.

Here's a suggest for how the rest layer will look like for that:

POST _cluster/reroute
{
  "commands": [
    {
      "allocate": {
        "index": "index",
        "shard": 0,
        "node": "node",
        "allow_primary": "never|existing_only|force_empty"
      }
    }
  ]

Note the allow_primary option. The default, never refuses to allocate a primary. existing_only allocates it but must reuse an existing copy. force_empty does what true does now - i.e., force the creation of an empty primary. Also the last two should imply ignoring the allocation deciders.

@ywelsch thoughts?

@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 7, 2016

Before this PR, users only had the possibility to either allocate a replica or an empty primary. The flag "allow_primary" was misleading in that context. I would very much like to get rid of this flag altogether and introduce something new. What about introducing a new "mode" field:

"mode": "replica" // default, only allocates unassigned replica shard (fails if unassigned primary exists)
"mode": "primary" // only allocates if primary unassigned and if data available on node
"mode": "empty_primary" // only allocates if primary unassigned, creates empty shard

"mode": "replica" is equivalent to the old "allowPrimary" : "false"
"mode": "empty_primary" is equivalent to the old "allowPrimary" : "true"

Also, we should check for consistency with the cancel command that currently has an "allowPrimary" flag as well.

@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 8, 2016

Also, relates to #14352

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Jan 8, 2016

I think mode is a bit ambiguous as a flag. I do understand the concerns that allow_primary will now do something different than the allow primary in the cancel command. As far as the values. I like "existing_only" better than just "primary" as it better communicates the restriction (we must use local data). Also I like "force_empty" better than "empty_primary" because it communicates the implications better and the severity better. It also better describes what we do today - always create an empty shard, even if there is local data. That last one is of course subject to change (but that's a bigger discussion)

@bleskes bleskes added the discuss label Jan 8, 2016
@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 8, 2016

As far as the values. I like "existing_only" better than just "primary" as it better communicates the restriction (we must use local data). Also I like "force_empty" better than "empty_primary" because it communicates the implications better and the severity better. It also better describes what we do today - always create an empty shard, even if there is local data.

$ grep -o "better" | wc -l
6

sorry ;-)

@clintongormley
Copy link
Copy Markdown
Contributor

We have a stalled issue (#14352) about choosing a better name for allow_primary which properly indicates the fact that you can lose your data. The issue is stalled because it is difficult to come up with the right name.

Instead, what about changing the action name from allocate to:

  • allocate_replica
  • allocate_empty_primary
  • and for this PR: allocate_stale_primary

@s1monw is concerned about exposing these last two actions via the API and would prefer to provide a command line tool (perhaps as a plugin) to do this job instead. The idea is to make it harder for users to make a change that will delete data, and will hopefully force them to read about the consequences of their action.

While I understand this sentiment, I'm not convinced that a command line tool is the way to go here. ES is API driven so this would be quite a change from what we do today, and there are probably many users who don't have command line access to their hosted clusters, and who need to fix a problem quickly without having to wait for support to respond to their tickets.

Instead, I'd propose adding an accept_data_loss flag to these actions which must be set to true for the action to be taken. I think this would be enough to warn the user that they're doing something dangerous.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Jan 9, 2016

+1. My only suggestion would be to use 'force_empty_primary' and 'force_stale_primary' as command names, to make it even clearer that (all) shard content will be lost.

On 9 jan. 2016 12:13 PM +0100, Clinton Gormleynotifications@github.com, wrote:

We have a stalled issue (#14352(#14352)) about choosing a better name forallow_primarywhich properly indicates the fact that you can lose your data. The issue is stalled because it is difficult to come up with the right name.

Instead, what about changing the action name fromallocateto:

allocate_replica
allocate_empty_primary
and for this PR:allocate_stale_primary

@s1monw(https://github.com/s1monw)is concerned about exposing these last two actions via the API and would prefer to provide a command line tool (perhaps as a plugin) to do this job instead. The idea is to make it harder for users to make a change that will delete data, and will hopefully force them to read about the consequences of their action.

While I understand this sentiment, I'm not convinced that a command line tool is the way to go here. ES is API driven so this would be quite a change from what we do today, and there are probably many users who don't have command line access to their hosted clusters, and who need to fix a problem quickly without having to wait for support to respond to their tickets.

Instead, I'd propose adding anaccept_data_lossflag to these actions which must be set to true for the action to be taken. I think this would be enough to warn the user that they're doing something dangerous.


Reply to this email directly orview it on GitHub(#15708 (comment)).

@ywelsch ywelsch force-pushed the feature/reroute-stale-primary branch 2 times, most recently from 7cb4da8 to 17541c1 Compare January 14, 2016 09:44
@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 14, 2016

@bleskes @clintongormley I pushed a new set of changes reflecting the naming suggested by @clintongormley. This new changeset also contains documentation for the new commands. Two open points remain for me:

  • I am still not fully convinced about the command naming. I like @clintongormley's suggestion of command names allocate_replica, allocate_empty_primary and allocate_stale_primary. The accept_data_loss flag, however, feels unnatural to me in the rest api and even weirder in the Java API. I am not completely against a safeguard, but would prefer to have it in another form. As an inspiration, we might look at the way it is done for wildcard indices deletion (action.destructive_requires_name).
  • To keep the commands symmetric, we should also split the cancel command (which currently has an allowPrimary flag) into two commands cancel_replica and cancel_primary.

@bleskes What I don't like about force_empty_primary is that the action verb allocate is lost.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a major rewrite anyway, can we try to move from custom factories to NamedWriteableRegistry ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now that MoveAllocationCommand is not touched by the PR. I think moving to NamedWriteableRegistry is a good idea, but I'm fine with putting it out of scope for this PR

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see text from other suggestion for empty primary allocation

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Jan 18, 2016

Thanks @ywelsch . I really like how it turned out. Left some minor comments here and there.

@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 18, 2016

pushed another round of changes addressing the comments. Please have another look @bleskes. (The NamedWriteableRegistry refactoring will be for another time)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be made simpler by just creating a one node cluster, create an index with no replicas, add a node to the cluster. Shutdown the first node and force assign. Makes sense?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Jan 19, 2016

LGTM. Made one suggestion for simplifying a test (and asked a question). Feel free to adopt the suggestion (or not) and push. Thanks for all the iterations, @ywelsch .

@ywelsch
Copy link
Copy Markdown
Contributor Author

ywelsch commented Jan 19, 2016

Thanks for the thorough review @bleskes !

ywelsch pushed a commit that referenced this pull request Jan 19, 2016
Extend reroute with an option to force assign stale primary shard copies
@ywelsch ywelsch merged commit 0a69f1e into elastic:master Jan 19, 2016
@lcawl lcawl added :Distributed/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Allocation labels Feb 13, 2018
@clintongormley clintongormley added :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement v5.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants