Ensure SLM stats does not block an in-place upgrade from 7.4 #48361

Closed
jakelandis wants to merge 1 commit into elastic:7.x from jakelandis:slm_stats_fix

@jakelandis
Contributor

In 7.5+, SLM requires the [stats] object to exist in the cluster state.
When doing an in-place upgrade from 7.4 to 7.5+, [stats] does not exist
in the cluster state, resulting in an exception on startup [1].

This commit makes [stats] an optional object in the parser; if it is
not found, it defaults to an empty stats object.

[1] Caused by: java.lang.IllegalArgumentException: Required [stats]


Note - this was not caught by the normal full cluster state restart tests since by default there is no SLM data in cluster state. Stopping and restarting ILM is one way to force SLM cluster state to be written (hence the custom test).

Note - this does not need to be merged to master/8.0; however, we may need a follow-up PR in 7.x to ensure that SLM is eagerly written to cluster state.

Below is the full error when the test is run without the fix:

./gradlew :x-pack:qa:full-cluster-restart:v7.4.0#bwc -Dtests.class=org.elasticsearch.xpack.restart.FullClusterRestartIT -Dtests.method="testSlmStats"

> Task :printGlobalBuildInfo UP-TO-DATE
=======================================
Elasticsearch Build Hamster says Hello!
  Gradle Version        : 5.6.2
  OS Info               : Mac OS X 10.14.6 (x86_64)
  JDK Version           : 12 (Oracle Corporation 12.0.1 [Java HotSpot(TM) 64-Bit Server VM 12.0.1+12])
  JAVA_HOME             : /Users/jakelandis/workspace/java/jdk-12.0.1.jdk/Contents/Home
  Random Testing Seed   : DFEA8712EB526FEE
=======================================

> Task :x-pack:plugin:core:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :x-pack:qa:full-cluster-restart:v7.4.0#upgradedClusterTest

=== Standard output of node `node{:x-pack:qa:full-cluster-restart:v7.4.0-0}` ===

»    ↓ errors and warnings from /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/logs/es.stdout.log ↓
» WARN ][o.e.d.FileBasedSeedHostsProvider] [v7.4.0-0] expected, but did not find, a dynamic hosts list at [/Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/config/unicast_hosts.txt]
»   ↑ repeated 4 times ↑
» ERROR][o.e.g.GatewayMetaState   ] [v7.4.0-0] failed to read or upgrade local state, exiting...
»  org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st
»       at org.elasticsearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:167) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:414) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:81) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:154) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:90) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.node.Node.start(Node.java:697) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:273) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:358) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125) [elasticsearch-cli-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.Command.main(Command.java:90) [elasticsearch-cli-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) [elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»  Caused by: java.io.IOException: failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:408) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       ... 13 more
»  Caused by: java.lang.IllegalArgumentException: Required [stats]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser$Target.finish(ConstructingObjectParser.java:446) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser$Target.access$000(ConstructingObjectParser.java:350) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser.parse(ConstructingObjectParser.java:169) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.xpack.ilm.IndexLifecycle.lambda$getNamedXContent$5(IndexLifecycle.java:222) ~[?:?]
»       at org.elasticsearch.common.xcontent.NamedXContentRegistry$Entry.lambda$new$0(NamedXContentRegistry.java:63) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.NamedXContentRegistry.parseNamedObject(NamedXContentRegistry.java:141) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.support.AbstractXContentParser.namedObject(AbstractXContentParser.java:385) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$Builder.fromXContent(MetaData.java:1403) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$1.fromXContent(MetaData.java:1448) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$1.fromXContent(MetaData.java:1439) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.read(MetaDataStateFormat.java:302) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:404) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       ... 13 more
» WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [v7.4.0-0] uncaught exception in thread [main]
»  org.elasticsearch.bootstrap.StartupException: ElasticsearchException[java.io.IOException: failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st]; nested: IOException[failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st]; nested: IllegalArgumentException[Required [stats]];
»       at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125) ~[elasticsearch-cli-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»  Caused by: org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st
»       at org.elasticsearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:167) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:414) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:81) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:154) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:90) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.node.Node.start(Node.java:697) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:273) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:358) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       ... 6 more
»  Caused by: java.io.IOException: failed to read /Users/jakelandis/workspace/7x/elasticsearch/x-pack/qa/full-cluster-restart/build/testclusters/v7.4.0-0/data/nodes/0/_state/global-22.st
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:408) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:81) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:154) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:90) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.node.Node.start(Node.java:697) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:273) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:358) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       ... 6 more
»  Caused by: java.lang.IllegalArgumentException: Required [stats]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser$Target.finish(ConstructingObjectParser.java:446) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser$Target.access$000(ConstructingObjectParser.java:350) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.ConstructingObjectParser.parse(ConstructingObjectParser.java:169) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.xpack.ilm.IndexLifecycle.lambda$getNamedXContent$5(IndexLifecycle.java:222) ~[?:?]
»       at org.elasticsearch.common.xcontent.NamedXContentRegistry$Entry.lambda$new$0(NamedXContentRegistry.java:63) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.NamedXContentRegistry.parseNamedObject(NamedXContentRegistry.java:141) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.common.xcontent.support.AbstractXContentParser.namedObject(AbstractXContentParser.java:385) ~[elasticsearch-x-content-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$Builder.fromXContent(MetaData.java:1403) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$1.fromXContent(MetaData.java:1448) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.cluster.metadata.MetaData$1.fromXContent(MetaData.java:1439) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.read(MetaDataStateFormat.java:302) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:404) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:81) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:154) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:90) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.node.Node.start(Node.java:697) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:273) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:358) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
»       ... 6 more

@jakelandis jakelandis added >bug :Data Management/ILM+SLM DO NOT USE. Use ":StorageEngine/ILM" or ":Distributed Coordination/SLM" instead. v7.5.0 v7.6.0 labels Oct 22, 2019
@jakelandis jakelandis requested a review from dakrone October 22, 2019 17:11
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@jakelandis
Contributor Author

hmm..failed to download ml artifact, trying again.

@elasticmachine run elasticsearch-ci/bwc

Comment on lines +448 to +449
client().performRequest(new Request("POST", "_ilm/stop"));
client().performRequest(new Request("POST", "_ilm/start"));
Contributor

@AthenaEryma AthenaEryma Oct 22, 2019

Because of #47710, we should PUT a policy here rather than starting/stopping.

Here's an example repository/policy you can PUT that will execute every February 31 (i.e. never):

PUT /_snapshot/test-repo
{
  "type": "fs",
  "settings": {
    "location": "test-repo"
  }
}

PUT /_slm/policy/test-policy
{
  "schedule": "* * * 31 FEB ? *", 
  "name": "<test-snap-{now/d}>", 
  "repository": "test-repo", 
  "config": { 
    "indices": ["*"] 
  }
}
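
As a rough sketch of how the two PUT requests above could be constructed in Java, the following uses the JDK's `java.net.http` API (Java 11+) rather than the test framework's `client()`. The base URL `http://localhost:9200`, the class name `NeverFiringPolicySketch`, and the helper `putJson` are assumptions made for illustration; the requests are only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds the repository and policy requests without sending them. */
public class NeverFiringPolicySketch {

    // Assumption: a local node listening on the default HTTP port.
    static final String BASE_URL = "http://localhost:9200";

    // Body of PUT /_snapshot/test-repo from the comment above.
    static final String REPO_BODY =
        "{\"type\":\"fs\",\"settings\":{\"location\":\"test-repo\"}}";

    // Body of PUT /_slm/policy/test-policy; the schedule names February 31, so it never fires.
    static final String POLICY_BODY = "{"
        + "\"schedule\":\"* * * 31 FEB ? *\","
        + "\"name\":\"<test-snap-{now/d}>\","
        + "\"repository\":\"test-repo\","
        + "\"config\":{\"indices\":[\"*\"]}"
        + "}";

    /** Hypothetical helper: builds a JSON PUT request for the given path. */
    static HttpRequest putJson(String path, String json) {
        return HttpRequest.newBuilder()
            .uri(URI.create(BASE_URL + path))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest repo = putJson("/_snapshot/test-repo", REPO_BODY);
        HttpRequest policy = putJson("/_slm/policy/test-policy", POLICY_BODY);
        // Print what would be sent; sending would require an HttpClient and a running cluster.
        System.out.println(repo.method() + " " + repo.uri().getPath());
        System.out.println(policy.method() + " " + policy.uri().getPath());
    }
}
```

In the actual test, the same bodies would more likely be sent through the existing `client().performRequest(...)` calls shown in the diff context.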

@jakelandis
Contributor Author

jakelandis commented Oct 22, 2019

I realized this does indeed need to be targeted at 8.0/master first. So closing this PR in favor of #48367
