Hi,
We have been running into strange errors lately. We get a lot of exceptions like this one:
[2014-10-07 06:41:26,235][WARN ][cluster.action.shard ] [Madcap] [my_index][1] sending failed shard for [my_index][1], node[-QGTVk8RRcuKuBwdqD8l1A], [P], s[INITIALIZING], indexUUID [u68VqfHsRii16gYXtPj1cQ], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[my_index][1] failed recovery]; nested: IllegalArgumentException[cannot change DocValues type from BINARY to SORTED_SET for field "custom.my_field"]; ]]
[2014-10-07 06:41:26,587][WARN ][indices.cluster ] [Madcap] [my_index][1] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [my_index][1] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: cannot change DocValues type from BINARY to SORTED_SET for field "custom.my_field"
at org.apache.lucene.index.FieldInfos$FieldNumbers.addOrGet(FieldInfos.java:198)
at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:868)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:819)
at org.elasticsearch.index.engine.internal.InternalEngine.createWriter(InternalEngine.java:1420)
at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:271)
at org.elasticsearch.index.shard.service.InternalIndexShard.postRecovery(InternalIndexShard.java:692)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:217)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
... 3 more
We use daily indices, so the index is quite new. It contains 10 to 20 million documents spread over 10 shards and 5 nodes. At some point (after the index has been green for hours), one of the shards goes to INITIALIZING and can never start because of the exception above. The index is then red and we cannot get it back on track. The only solution we have found is to scroll over the whole index and reindex the data into a new index (but we have most likely lost the data from the failing shard).
The field that causes the issue has the following definition: {"type":"long","doc_values":true,"include_in_all":false}. This mapping is inferred from a dynamic template: {"mapping":{"index":"not_analyzed","include_in_all":false,"doc_values":true,"type":"{dynamic_type}"},"match":"*"}.
One important note: this part of the data is free-form (i.e. user input), so some documents may have conflicting types for the same field (one document having the field as a string, another as a long). That is why the index has the setting index.mapping.ignore_malformed set to true.
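To make the kind of conflict described above concrete, here is a minimal Python sketch that scans a batch of documents before indexing and reports fields whose values have more than one JSON type. It is only an illustration, not something we run in production: the function names (json_type, flatten, find_type_conflicts) and the sample documents are hypothetical, and the type labels only loosely mirror mapping types. Elasticsearch's own dynamic type detection (date detection on strings, for example) may classify values differently.

```python
from collections import defaultdict


def json_type(value):
    """Return a rough type label for a JSON value (labels are illustrative)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def flatten(doc, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for a nested JSON document."""
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    yield from flatten(item, path + ".")
                else:
                    yield path, item
        else:
            yield path, value


def find_type_conflicts(docs):
    """Map each field path seen with more than one type to its sorted types."""
    seen = defaultdict(set)
    for doc in docs:
        for path, value in flatten(doc):
            if value is not None:  # nulls carry no type information
                seen[path].add(json_type(value))
    return {path: sorted(types) for path, types in seen.items() if len(types) > 1}


if __name__ == "__main__":
    # Hypothetical sample: the same field sent once as a number, once as text.
    sample = [
        {"custom": {"my_field": 42}},
        {"custom": {"my_field": "forty-two"}},
    ]
    print(find_type_conflicts(sample))
```

Running something like this over a sample of the incoming documents would at least tell us which free-form fields are candidates for the problem, although it does not explain why the shard fails to recover.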
Also, it might not be relevant, but this has only happened on days when at least one node was restarted.
We have noticed this issue since we started running Elasticsearch 1.3.4 (but we can't be 100% sure that it did not happen before).
We cannot isolate or reproduce the issue, but we have faced it several times over the past few days. Feel free to suggest actions we can take if it happens again, so that we can gather more details to help fix it. Also, if you have suggestions for working around the issue when it happens (so that we can avoid reindexing the data), that would be great.
Thanks,
Emmanuel