Index with no changes gets new sync_id when becoming inactive. #27838

@larschri


Elasticsearch version (bin/elasticsearch --version):

Version: 5.5.2, Build: b2f0c09/2017-08-14T12:33:14.154Z, JVM: 1.8.0_144

Plugins installed: []

JVM version (java -version):

java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

OS version (uname -a if on a Unix-like system):

Linux xxx 3.13.0-33-generic #58-Ubuntu SMP Tue Jul 29 16:45:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

The problem is that restarting a node takes a very long time. After restarting just a single node, it takes several hours for the cluster to become green, even when no indexing is taking place and a synced flush was executed successfully beforehand. The restarted node should be able to recover from its local files, since they are up to date, but it doesn't because the sync_id doesn't match.

Expected behavior: After indexing has been stopped and a synced flush has succeeded, the sync_id stays the same.

Actual behavior: After indexing has been stopped and a synced flush has succeeded, the sync_id changes when the index becomes inactive.

Steps to reproduce:

# Create replicated index with single shard
curl -XPUT localhost:9200/testindex -d '{"settings":{"index":{
  "number_of_shards" : 1,
  "number_of_replicas" : 1
}}}'

# Index one document
curl -XPUT localhost:9200/testindex/doc/1 -d '{}'

# Perform a synced flush
curl -XPOST localhost:9200/_flush/synced

# Note the sync_id (if jq is not installed, navigate the JSON manually)
curl -s 'localhost:9200/_stats?level=shards' | jq '.indices.testindex.shards["0"][].commit.user_data.sync_id'

# Wait 5-6 minutes

# The sync_id has changed:
curl -s 'localhost:9200/_stats?level=shards' | jq '.indices.testindex.shards["0"][].commit.user_data.sync_id'
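
The manual "wait 5-6 minutes" step could be automated with a small polling helper. This is only a sketch: the function names, the `ES_URL` and `INDEX` variables, and the polling parameters are illustrative assumptions and not part of the original report. It reuses the same `_stats` call and jq path as the steps above.

```shell
# Sketch only: poll the shard stats and report when the sync_id changes.
# ES_URL, INDEX, and both function names are hypothetical helpers.
ES_URL="${ES_URL:-localhost:9200}"
INDEX="${INDEX:-testindex}"

# Print the sync_id of every copy of shard 0, one per line.
get_sync_ids() {
  curl -s "$ES_URL/_stats?level=shards" |
    jq -r --arg idx "$INDEX" '.indices[$idx].shards["0"][].commit.user_data.sync_id'
}

# Poll until get_sync_ids differs from the first reading, or give up.
# usage: wait_for_sync_id_change <attempts> <sleep_seconds>
# Returns 0 if a change was seen, 1 otherwise.
wait_for_sync_id_change() {
  wfs_attempts="$1"
  wfs_interval="$2"
  wfs_baseline="$(get_sync_ids)"
  echo "baseline: $wfs_baseline"
  wfs_i=1
  while [ "$wfs_i" -le "$wfs_attempts" ]; do
    sleep "$wfs_interval"
    wfs_current="$(get_sync_ids)"
    if [ "$wfs_current" != "$wfs_baseline" ]; then
      echo "changed after $wfs_i polls: $wfs_current"
      return 0
    fi
    wfs_i=$((wfs_i + 1))
  done
  echo "unchanged after $wfs_attempts polls"
  return 1
}
```

For example, running `wait_for_sync_id_change 20 30` right after the synced flush would poll every 30 seconds for up to 10 minutes, which covers the 5-6 minute window described above.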

Provide logs (if relevant):

With debug logging enabled, something like this is printed:

[2017-12-15T11:30:21,439][DEBUG][o.e.i.e.Engine           ] [xxx] [testindex][0] successfully sync committed. sync id [AWBZ8H59yP3Rl7MoZfSx].
[2017-12-15T11:30:22,312][DEBUG][o.e.i.s.IndexShard       ] [xxx] [testindex][0] shard is now inactive
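
One way to get these DEBUG lines without restarting the node is to raise the two loggers through the cluster settings API. This is a hedged sketch: the full logger names are an assumption inferred from the abbreviated names in the log lines above (`o.e.i.e.Engine` and `o.e.i.s.IndexShard`), and `ES_URL` and the function names are illustrative.

```shell
# Sketch only: raise the engine and shard loggers to DEBUG at runtime.
# The package names are inferred from the abbreviated logger names in the
# log output; ES_URL and the function names are hypothetical.
ES_URL="${ES_URL:-localhost:9200}"

# Print the JSON body for a transient cluster settings update.
debug_logging_body() {
  cat <<'EOF'
{"transient": {
  "logger.org.elasticsearch.index.engine": "DEBUG",
  "logger.org.elasticsearch.index.shard": "DEBUG"
}}
EOF
}

# Send the body to the cluster settings endpoint (reads the body from stdin).
enable_debug_logging() {
  debug_logging_body |
    curl -s -XPUT "$ES_URL/_cluster/settings" \
      -H 'Content-Type: application/json' -d @-
}
```

Calling `enable_debug_logging` before repeating the steps to reproduce should make the "successfully sync committed" and "shard is now inactive" messages appear in the node log.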

Labels

:Distributed/Engine (Anything around managing Lucene and the Translog in an open shard.)
