cluster stuck in loop "failed to create shard" #12011

@womwombat

Description

Hi,

Our process involves bulk loading one index (index_b) while a second (index_a) serves production traffic. At the end of the bulk load we swap the alias (index_alias) over to index_b and close index_a. This process runs three times a day. Here is a more detailed list of the actions we perform on the index:

  • (index used in production => index_a)
  • open index_b
  • disable refresh_interval
  • bulk load in index_b
  • send a refresh request
  • optimize
  • wait for each node to have merges: 0 for index_b
  • swap alias
  • close index_a
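
The steps above can be sketched as the request bodies sent to the Elasticsearch 1.x REST API. This is a reconstruction for illustration, not our actual tooling: the helper names are hypothetical, and sending the requests (e.g. over HTTP) is left out so the sketch stays self-contained.

```python
# Sketch of the bulk-load-and-swap cycle described above (Elasticsearch 1.x API).
# Helper names are hypothetical; only the request/response shapes are shown.

def refresh_settings_body(interval):
    """Body for PUT /index_b/_settings: "-1" disables refresh, e.g. "1s" restores it."""
    return {"index": {"refresh_interval": interval}}

def swap_alias_body(alias, old_index, new_index):
    """Body for POST /_aliases: atomically move the alias to the freshly loaded index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

def merges_done(per_node_merge_stats):
    """True once every node reports merges.current == 0 for the index
    (values parsed from GET /_nodes/stats/indices/merge; shape simplified here)."""
    return all(node["merges"]["current"] == 0 for node in per_node_merge_stats)

# Example: the alias swap performed at the end of each load.
body = swap_alias_body("index_alias", "index_a", "index_b")
```

Doing the remove and add in a single `_aliases` call keeps the swap atomic, so clients querying `index_alias` never see a window with no backing index.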

During the bulk loading of index_b (or index_a, since the process runs three times a day), two of the three nodes get stuck in a relocation loop for one or more shards (not always the same shards, and not always on the same node), with the following log:

[2015-07-02 08:57:40,322][WARN ][indices.cluster          ] [elasticsearch-nodes2.localdomain] [[index_a][1]] marking and sending shard failed due to [failed to create shard]
org.elasticsearch.index.shard.IndexShardCreationException: [index_a][1] failed to create shard
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:357)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:704)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:605)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:185)
   at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:480)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:188)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:158)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.store.LockObtainFailedException: Can't lock shard [index_a][1], timed out after 5000ms
   at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:576)
   at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:504)
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:310)
   ... 9 more
[2015-07-02 08:57:40,339][DEBUG][discovery.zen.publish    ] [elasticsearch-nodes2.localdomain] received cluster state version 73587
[2015-07-02 08:57:45,339][WARN ][indices.cluster          ] [elasticsearch-nodes2.localdomain] [[index_a][0]] marking and sending shard failed due to [failed to create shard]
org.elasticsearch.index.shard.IndexShardCreationException: [index_a][0] failed to create shard
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:357)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:704)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:605)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:185)
   at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:480)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:188)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:158)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.store.LockObtainFailedException: Can't lock shard [index_a][0], timed out after 5000ms
   at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:576)
   at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:504)
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:310)
   ... 9 more
[2015-07-02 08:57:50,340][WARN ][indices.cluster          ] [elasticsearch-nodes2.localdomain] [[index_a][1]] marking and sending shard failed due to [failed to create shard]
org.elasticsearch.index.shard.IndexShardCreationException: [index_a][1] failed to create shard
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:357)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:704)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:605)
   at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:185)
   at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:480)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:188)
   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:158)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.store.LockObtainFailedException: Can't lock shard [index_a][1], timed out after 5000ms
   at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:576)
   at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:504)
   at org.elasticsearch.index.IndexService.createShard(IndexService.java:310)
   ... 9 more
[2015-07-02 08:57:50,341][INFO ][cluster.routing.allocation.decider] [elasticsearch-nodes2.localdomain] updating [cluster.routing.allocation.enable] from [ALL] to [NONE]
[2015-07-02 08:57:50,345][DEBUG][discovery.zen.publish    ] [elasticsearch-nodes2.localdomain] received cluster state version 73588

The nodes will (until restarted) exchange cluster state versions indefinitely:

received cluster state version 76797
received cluster state version 76798
received cluster state version 76799
received cluster state version 76800
received cluster state version 76801
received cluster state version 76802 

At this stage the cluster is green, but the next time our process runs it triggers a yellow and sometimes a red cluster.
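
One quick way to watch for these green/yellow/red transitions is to poll `GET /_cluster/health`. The sketch below only shows how the status fields could be interpreted; the sample response is fabricated for illustration, not a capture from the affected cluster.

```python
def classify(health):
    """Interpret a GET /_cluster/health response (field names match the 1.x API).
    Returns the cluster color and the number of shards still in flux."""
    pending = (
        health["relocating_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )
    return health["status"], pending

# Illustrative response only (not from the cluster in this report):
sample = {
    "status": "yellow",
    "relocating_shards": 2,
    "initializing_shards": 1,
    "unassigned_shards": 3,
}
status, pending = classify(sample)  # ("yellow", 6)
```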

Things we've tried:

  • upgrading (1.5.2, 1.5.3, 1.6.0)
  • rolling restart of each Elasticsearch process
  • complete rolling replacement of each of the 3 nodes (including data on disk)
  • adding more nodes (from 3 to 5)

The only way we have found to fix this is to restart one of the nodes involved in the relocation:

[2015-07-02 08:57:59,708][INFO ][node                     ] [elasticsearch-nodes2.localdomain] stopping ...
[2015-07-02 08:57:59,725][DEBUG][discovery.zen.fd         ] [elasticsearch-nodes2.localdomain] [master] stopping fault detection against master [[elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}], reason [zen disco stop]
[2015-07-02 08:57:59,792][INFO ][node                     ] [elasticsearch-nodes2.localdomain] stopped
[2015-07-02 08:57:59,792][INFO ][node                     ] [elasticsearch-nodes2.localdomain] closing ...
[2015-07-02 08:57:59,799][DEBUG][com.amazonaws.http.IdleConnectionReaper] Reaper thread:
java.lang.InterruptedException: sleep interrupted
   at java.lang.Thread.sleep(Native Method)
   at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:112)
[2015-07-02 08:57:59,800][DEBUG][com.amazonaws.http.IdleConnectionReaper] Shutting down reaper thread.
[2015-07-02 08:58:09,806][WARN ][cluster.action.index     ] [elasticsearch-nodes2.localdomain] [shopper_chargement_prod_a] failed to lock all shards for index - timed out after 30 seconds
[2015-07-02 08:58:09,814][INFO ][node                     ] [elasticsearch-nodes2.localdomain] closed



[2015-07-02 08:58:41,442][INFO ][node                     ] [elasticsearch-nodes2.localdomain] version[1.6.0], pid[45115], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-07-02 08:58:41,443][INFO ][node                     ] [elasticsearch-nodes2.localdomain] initializing ...
[2015-07-02 08:58:41,459][INFO ][plugins                  ] [elasticsearch-nodes2.localdomain] loaded [cloud-aws], sites [HQ, whatson, kopf]
[2015-07-02 08:58:41,499][INFO ][env                      ] [elasticsearch-nodes2.localdomain] using [1] data paths, mounts [[/srv/data (/dev/mapper/lvm--raid--0-lvm0)]], net usable_space [471.4gb], net total_space [499.6gb], types [xfs]
[2015-07-02 08:58:44,350][INFO ][node                     ] [elasticsearch-nodes2.localdomain] initialized
[2015-07-02 08:58:44,350][INFO ][node                     ] [elasticsearch-nodes2.localdomain] starting ...
[2015-07-02 08:58:44,527][INFO ][transport                ] [elasticsearch-nodes2.localdomain] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/10.210.14.19:9300]}
[2015-07-02 08:58:44,546][INFO ][discovery                ] [elasticsearch-nodes2.localdomain] es-cluster/p6IyMeFHRCey0kRffbHgDw
[2015-07-02 08:58:48,598][INFO ][cluster.service          ] [elasticsearch-nodes2.localdomain] detected_master [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}, added {[elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1},[elasticsearch-nodes1.localdomain][DsZ08lNfSF6EwvAMSoehng][elasticsearch-nodes1.localdomain][inet[/10.210.14.138:9300]]{aws_availability_zone=eu-west-1c, max_local_storage_nodes=1},}, reason: zen-disco-receive(from master [[elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}])
[2015-07-02 08:58:48,606][INFO ][cluster.routing.allocation.decider] [elasticsearch-nodes2.localdomain] updating [cluster.routing.allocation.enable] from [ALL] to [NONE]
[2015-07-02 08:58:48,606][INFO ][indices.recovery         ] [elasticsearch-nodes2.localdomain] updating [indices.recovery.translog_size] from [512kb] to [2mb]
[2015-07-02 08:58:48,649][INFO ][http                     ] [elasticsearch-nodes2.localdomain] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/10.210.14.19:9200]}
[2015-07-02 08:58:48,649][INFO ][node                     ] [elasticsearch-nodes2.localdomain] started
[2015-07-02 08:59:07,416][DEBUG][discovery.zen.publish    ] [elasticsearch-nodes2.localdomain] received cluster state version 73591
[2015-07-02 08:59:07,417][INFO ][cluster.routing.allocation.decider] [elasticsearch-nodes2.localdomain] updating [cluster.routing.allocation.enable] from [NONE] to [ALL]
[2015-07-02 08:59:07,424][DEBUG][discovery.zen.publish    ] [elasticsearch-nodes2.localdomain] received cluster state version 73592
[2015-07-02 08:59:07,842][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][0] started recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}, id [1]
[2015-07-02 08:59:07,844][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] collecting local files for [index_a][0] [1]
[2015-07-02 08:59:07,845][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][0] starting recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}
[2015-07-02 08:59:07,858][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][1] started recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}, id [2]
[2015-07-02 08:59:07,859][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] collecting local files for [index_a][1] [2]
[2015-07-02 08:59:07,859][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][1] starting recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}
[2015-07-02 08:59:07,866][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][0] Got exception on recovery
org.elasticsearch.transport.RemoteTransportException: [elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.indices.recovery.DelayRecoveryException: source node does not have the shard listed in its state as allocated on the node
   at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:108)
   at org.elasticsearch.indices.recovery.RecoverySource.access$200(RecoverySource.java:49)
   at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:146)
   at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:132)
   at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
   at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
[2015-07-02 08:59:07,866][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][1] Got exception on recovery
org.elasticsearch.transport.RemoteTransportException: [elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.indices.recovery.DelayRecoveryException: source node does not have the shard listed in its state as allocated on the node
   at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:108)
   at org.elasticsearch.indices.recovery.RecoverySource.access$200(RecoverySource.java:49)
   at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:146)
   at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:132)
   at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
   at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
[2015-07-02 08:59:07,868][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] will retrying recovery with id [1] in [500ms] (reason [source node does not have the shard listed in its state as allocated on the node])
[2015-07-02 08:59:07,868][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] will retrying recovery with id [2] in [500ms] (reason [source node does not have the shard listed in its state as allocated on the node])
[2015-07-02 08:59:08,369][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] collecting local files for [index_a][1] [2]
[2015-07-02 08:59:08,369][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] collecting local files for [index_a][0] [1]
[2015-07-02 08:59:08,370][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][0] starting recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}
[2015-07-02 08:59:08,370][TRACE][indices.recovery         ] [elasticsearch-nodes2.localdomain] [index_a][1] starting recovery from [elasticsearch-nodes3.localdomain][o_9lrwRfTn6bbIztX9OCvA][elasticsearch-nodes3.localdomain][inet[/10.210.14.50:9300]]{aws_availability_zone=eu-west-1a, max_local_storage_nodes=1}

Setup:
3 AWS nodes across the eu-west-1 a, b, and c zones; an nginx proxy on each of the three nodes; one ELB with SSL offloading, round-robining across the three nginx instances.
Data is stored on EBS volumes.

ES config:

cluster.name: es-cluster
node.name: elasticsearch-nodes2.localdomain
node.max_local_storage_nodes: 1

index.mapper.dynamic: true
action.auto_create_index: true
action.disable_delete_all_indices: true

path.conf: /usr/local/etc/elasticsearch
path.data: /srv/data/elasticsearch/data
path.logs: /srv/data/elasticsearch/logs

bootstrap.mlockall: true

http.port: 9200

gateway.expected_nodes: 1
discovery.type: ec2

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false

cloud.node.auto_attributes: true
cloud.aws.region: eu-west-1
discovery.ec2.tag: custom-es-cluster-tag

discovery.zen.fd.ping_timeout: 15s
gateway.recover_after_nodes: 2
gateway.recover_after_time: 5m
http.compression: true
http.cors.allow-origin: '*'
http.cors.enabled: true
indices.store.throttle.max_bytes_per_sec: 100mb
threadpool.bulk.queue_size: 3000
transport.tcp.compress: true

ES Index settings:

{
 "index_b":{
   "settings":{
     "index":{
       "creation_date":"1435765694828",
       "uuid":"JjqbLn6CS1q0nwaw5HhIpA",
       "analysis":{
         "filter":{
           "my_word_delimiter":{
             "type":"word_delimiter",
             "split_on_numerics":"true"
           },
           "french_stemmer":{
             "type":"stemmer",
             "name":"light_french"
           },
           "limit_token":{
             "type":"limit",
             "max_token_count":"7"
           },
           "french_stopwords":{
             "type":"stop",
             "stopwords":"_french_"
           }
         },
         "analyzer":{
           "word_delimiter_stopwfr":{
             "filter":[
               "my_word_delimiter",
               "asciifolding",
               "lowercase",
               "french_stopwords"
             ],
             "tokenizer":"digitletter"
           },
           "word_delimiter":{
             "filter":[
               "my_word_delimiter",
               "lowercase",
               "asciifolding"
             ],
             "tokenizer":"digitletter"
           },
           "word_delimiter_stopwfr_stemfr":{
             "filter":[
               "my_word_delimiter",
               "asciifolding",
               "lowercase",
               "french_stopwords",
               "french_stemmer"
             ],
             "tokenizer":"digitletter"
           },
           "word_delimiter_limit":{
             "filter":[
               "my_word_delimiter",
               "lowercase",
               "limit_token",
               "asciifolding"
             ],
             "tokenizer":"digitletter"
           }
         },
         "tokenizer":{
           "digitletter":{
             "pattern":"[^\\p{Ll}\\p{Lu}0-9]+",
             "type":"pattern"
           }
         }
       },
       "number_of_replicas":"1",
       "number_of_shards":"6",
       "refresh_interval":"-1",
       "version":{
         "created":"1060099"
       }
     }
   }
 }
}
