Elasticsearch version (bin/elasticsearch --version):
Version: 5.5.3, Build: 9305a5e/2017-09-07T15:56:59.599Z, JVM: 1.8.0_151
Plugins installed: []
JVM version (java -version):
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
OS version (uname -a if on a Unix-like system): Darwin Thiagos-MacBook-Pro.local 17.0.0 Darwin Kernel Version 17.0.0: Thu Aug 24 21:48:19 PDT 2017; root:xnu-4570.1.46~2/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
If a non-data node that contains dangling indices in its data path joins a cluster, these dangling indices will be detected and auto-imported.
IMO, a non-data node that contains index data in its data path is probably accidental and unintended. In this case, those dangling indices should not be detected. Better yet, the node should not even start (maybe a bootstrap check that fails if a non-data node contains index data in its data path).
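To make the suggested check concrete, here is a rough sketch of it as a pre-start shell script. This is not an existing Elasticsearch feature. The function names has_shard_data and check_non_data_node are hypothetical, and the sketch assumes the on-disk layout <path.data>/nodes/<ordinal>/indices/<index-uuid>/<shard-number>/, with shard data living in numerically named directories.

```shell
#!/bin/sh
# Hypothetical pre-start check (not part of Elasticsearch).
# Assumed layout: <path.data>/nodes/<ordinal>/indices/<index-uuid>/<shard-number>/

# Succeeds (returns 0) if the data path contains at least one shard directory.
has_shard_data() {
  data_path=$1
  for dir in "$data_path"/nodes/*/indices/*/[0-9]*/; do
    # The glob is left unexpanded when nothing matches, so confirm it is a real directory.
    [ -d "$dir" ] && return 0
  done
  return 1
}

# Fails (returns 1) if the node is configured with node.data=false
# but its data path still holds shard data.
check_non_data_node() {
  data_path=$1
  node_data=$2   # "true" or "false", mirroring the node.data setting
  if [ "$node_data" = "false" ] && has_shard_data "$data_path"; then
    echo "non-data node has shard data in $data_path" >&2
    return 1
  fi
  return 0
}
```

With the setup from this report, check_non_data_node /Users/thiago/data-1 false would be expected to fail once data-1 holds the shard for test, while the same call with true would pass.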
Steps to reproduce:
This can be done on a single machine:
- Start node-1 with bin/elasticsearch -E path.data=/Users/thiago/data-1 -E node.name=node-1
- Start node-2 with bin/elasticsearch -E path.data=/Users/thiago/data-2 -E node.name=node-2
- Create an index test configured with 1S/0R (one shard, no replicas) with curl -XPUT localhost:9200/test -d '{ "settings": { "index": { "number_of_shards": 1, "number_of_replicas": 0 } } }' -H "Content-Type: application/json"
- Create a document with curl -XPOST localhost:9200/test -d '{ "test": 1 }' -H "Content-Type: application/json"
- Stop both nodes
- Check which data directory, data-1 or data-2, holds the shard for index test, and delete the other, empty data directory (this effectively makes test a dangling index).
- Assuming data-2 was the one deleted, start node-2 again with bin/elasticsearch -E path.data=/Users/thiago/data-2 -E node.name=node-2
- Start node-1 (which contains dangling indices) as a non-data node with bin/elasticsearch -E path.data=/Users/thiago/data-1 -E node.name=node-1 -E node.data=false
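For the step that checks which data directory holds the shard, a small helper along these lines may save some manual digging. It is only a sketch: count_shard_dirs and report_shard_dirs are hypothetical names, and the layout <path.data>/nodes/<ordinal>/indices/<index-uuid>/<shard-number>/ is an assumption about how the data path is organized.

```shell
#!/bin/sh
# Hypothetical helper for the reproduction steps (not an Elasticsearch tool).
# Assumed layout: <path.data>/nodes/<ordinal>/indices/<index-uuid>/<shard-number>/

# Print the number of shard directories found under one data path.
count_shard_dirs() {
  count=0
  for dir in "$1"/nodes/*/indices/*/[0-9]*/; do
    # Skip the literal pattern that remains when the glob matches nothing.
    if [ -d "$dir" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Print one line per data path given as an argument.
report_shard_dirs() {
  for data_path in "$@"; do
    echo "$data_path: $(count_shard_dirs "$data_path") shard directories"
  done
}
```

Running report_shard_dirs /Users/thiago/data-1 /Users/thiago/data-2 after stopping both nodes should show a non-zero count only for the directory that holds the shard for test; the other one is the directory to delete.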
Provide logs (if relevant):
After non-data node node-1 starts, node-2 will detect and auto-import dangling indices even though node-1 is a non-data node:
[2017-10-21T18:02:14,158][INFO ][o.e.g.LocalAllocateDangledIndices] [node-2] auto importing dangled indices [[test/R2Nh9sERThmkJ-0IZ0ppwA]/OPEN] from [{node-1}{RqWMW2AeSXWOpkUm4cT1TA}{lEqpWLIhRqqU_n1DSFuv2Q}{127.0.0.1}{127.0.0.1:9301}]