[CI] SLMSnapshotBlockingIntegTests.testRetentionWhileSnapshotInProgress failure on master #46508

From https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/583/console & https://gradle-enterprise.elastic.co/s/6x67ha6426acy/console-log

```
  2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:ilm:test' --tests "org.elasticsearch.xpack.slm.SLMSnapshotBlockingIntegTests.testRetentionWhileSnapshotInProgress" -Dtests.seed=7BA427BA999CD99D -Dtests.security.manager=true -Dtests.locale=fr-GP -Dtests.timezone=America/Edmonton -Dcompiler.java=12 -Druntime.java=11
  2> java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([7BA427BA999CD99D:67E39A043E8F736]:0)
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertNotNull(Assert.java:712)
        at org.junit.Assert.assertNotNull(Assert.java:722)
        at org.elasticsearch.xpack.slm.SLMSnapshotBlockingIntegTests.lambda$testRetentionWhileSnapshotInProgress$2(SLMSnapshotBlockingIntegTests.java:153)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:866)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:840)
        at org.elasticsearch.xpack.slm.SLMSnapshotBlockingIntegTests.testRetentionWhileSnapshotInProgress(SLMSnapshotBlockingIntegTests.java:146)
```

The failure likely stems from this exception, thrown when the test tries to kick off the second snapshot:

```
  1> [2019-09-09T13:42:03,563][INFO ][o.e.x.s.SLMSnapshotBlockingIntegTests] [testRetentionWhileSnapshotInProgress] --> waiting for snapshot snap-qdwsdayhtfuymbsj7vi2yw to be completed, got: STARTED
  1> [2019-09-09T13:42:03,821][INFO ][o.e.x.s.SLMSnapshotBlockingIntegTests] [testRetentionWhileSnapshotInProgress] --> waiting for snapshot snap-qdwsdayhtfuymbsj7vi2yw to be completed, got: SUCCESS
  1> [2019-09-09T13:42:03,821][INFO ][o.e.x.s.SLMSnapshotBlockingIntegTests] [testRetentionWhileSnapshotInProgress] --> blocking nodes from completing snapshot
  1> [2019-09-09T13:42:03,822][INFO ][o.e.x.s.SnapshotLifecycleTask] [node_s0] snapshot lifecycle policy [slm-policy] issuing create snapshot [snap-frash4insd-kptw8sm1rew]
  1> [2019-09-09T13:42:03,824][INFO ][o.e.x.s.SLMSnapshotBlockingIntegTests] [testRetentionWhileSnapshotInProgress] --> checking for in progress snapshot...
  1> [2019-09-09T13:42:03,826][INFO ][o.e.x.s.SLMSnapshotBlockingIntegTests] [testRetentionWhileSnapshotInProgress] --> checking for in progress snapshot...
  1> [2019-09-09T13:42:03,828][WARN ][o.e.s.SnapshotsService   ] [node_s0] [slm-repo][snap-frash4insd-kptw8sm1rew] failed to create snapshot
  1> org.elasticsearch.snapshots.ConcurrentSnapshotExecutionException: [slm-repo:snap-frash4insd-kptw8sm1rew]  a snapshot is already running
  1> 	at org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:301) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:697) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:319) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:214) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:834) [?:?]
```

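My hunch is that the first snapshot reports a "SUCCESS" status but is still present in the cluster state when the second snapshot is requested, so the second request is rejected as concurrent. We should ensure the first snapshot is no longer present in the cluster state before issuing the second execute policy request.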
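
To illustrate the idea, here is a minimal, self-contained sketch of the "wait until nothing is in progress, then start the next snapshot" pattern. It does not use the Elasticsearch test framework: `SnapshotWaitSketch`, `inProgress`, `startSnapshot`, `finishSnapshot`, and `waitUntil` are all hypothetical names made up for this example, and the set simply stands in for the in-progress snapshot entries in the cluster state. In the real test the equivalent check would presumably go inside an `assertBusy` that inspects the cluster state.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

public class SnapshotWaitSketch {

    // Hypothetical stand-in for the in-progress snapshot entries in the cluster state.
    static final Set<String> inProgress = ConcurrentHashMap.newKeySet();

    // Rejects a new snapshot while another one is still registered, mimicking
    // the "a snapshot is already running" failure seen in the log.
    static synchronized boolean startSnapshot(String name) {
        if (!inProgress.isEmpty()) {
            return false;
        }
        inProgress.add(name);
        return true;
    }

    // Removes the snapshot entry, i.e. the point at which it leaves the cluster state.
    static synchronized void finishSnapshot(String name) {
        inProgress.remove(name);
    }

    // Polls the condition with a small exponential backoff until it holds or the
    // timeout elapses. Returns true if the condition was observed to hold.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long sleepMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) {
        startSnapshot("snap-1");

        // Simulate the first snapshot lingering briefly before it is removed.
        new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finishSnapshot("snap-1");
        }).start();

        // Only issue the second snapshot once the first one is really gone.
        boolean cleared = waitUntil(inProgress::isEmpty, 5_000);
        boolean started = cleared && startSnapshot("snap-2");
        System.out.println("cleared=" + cleared + ", second snapshot started=" + started);
    }
}
```

The point of the sketch is the ordering: the test would wait on "no snapshot entry remains" rather than on "status is SUCCESS" before triggering the second policy execution.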
