Skip to content

Use retry logic and real file system in file settings ITs#116392

Merged
elasticsearchmachine merged 9 commits intoelastic:mainfrom
n1v0lg:generalize-file-settings-it-fix
Nov 11, 2024
Merged

Use retry logic and real file system in file settings ITs#116392
elasticsearchmachine merged 9 commits intoelastic:mainfrom
n1v0lg:generalize-file-settings-it-fix

Conversation

@n1v0lg
Copy link
Copy Markdown
Contributor

@n1v0lg n1v0lg commented Nov 7, 2024

Several file-settings ITs fail (rarely) with exceptions like:

java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)

This happens in Windows file systems, due to a race condition where the file settings service is reading the settings file concurrently with the test trying to modify it (a no-go in Windows). It turns out we have already addressed this with a retry for one test suite (#91863), plus addressed a related issue around mock windows file-systems misbehaving (#92653).

This PR extends the above fixes to all file-settings related ITs.

@n1v0lg n1v0lg added >test Issues or PRs that are addressing/adding tests :Core/Infra/Settings Settings infrastructure and APIs v8.16.0 v9.0.0 v8.17.0 labels Nov 7, 2024
@n1v0lg n1v0lg self-assigned this Nov 7, 2024
@n1v0lg n1v0lg added the auto-backport Automatically create backport pull requests when merged label Nov 7, 2024
@n1v0lg n1v0lg changed the title Use retry logic in all file settings ITs Use retry logic and real file systems in file settings ITs Nov 7, 2024
@n1v0lg n1v0lg added the test-windows Trigger CI checks on Windows label Nov 7, 2024
@n1v0lg
Copy link
Copy Markdown
Contributor Author

n1v0lg commented Nov 7, 2024

@elasticmachine update branch

@n1v0lg
Copy link
Copy Markdown
Contributor Author

n1v0lg commented Nov 7, 2024

@elasticmachine update branch

@n1v0lg n1v0lg changed the title Use retry logic and real file systems in file settings ITs Use retry logic and real file system in file settings ITs Nov 7, 2024
@n1v0lg
Copy link
Copy Markdown
Contributor Author

n1v0lg commented Nov 7, 2024

@elasticmachine update branch

@n1v0lg n1v0lg requested a review from thecoop November 8, 2024 11:35
@n1v0lg n1v0lg marked this pull request as ready for review November 8, 2024 11:35
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Nov 8, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@n1v0lg n1v0lg requested review from a team and removed request for thecoop November 8, 2024 11:36
Copy link
Copy Markdown
Contributor

@ldematte ldematte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but if you want to put some more work in it an clean it up a bit more I left a suggestion. Optional, up to you.
Thanks!

Files.move(tempFilePath, fileSettingsService.watchedFile(), StandardCopyOption.ATOMIC_MOVE);
logger.info("--> After writing new settings file: [{}]", settingsFileContent);
logger.info("--> before writing JSON config to node {} with path {}", node, tempFilePath);
logger.info(Strings.format(json, version));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe extract Strings.format(json, version) to a variable like in the original code?

}

private static void writeJSONFile(String node, String json, Logger logger, AtomicLong versionCounter, boolean incrementVersion)
public static void writeJSONFile(String node, String json, Logger logger, AtomicLong versionCounter, boolean incrementVersion)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why did this become public?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's re-used by another test. However, I like your suggestion below to just move it to ESIntegTestCase which makes this bit obsolete.

@n1v0lg n1v0lg added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Nov 11, 2024
@elasticsearchmachine elasticsearchmachine merged commit 91559da into elastic:main Nov 11, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116392

@n1v0lg
Copy link
Copy Markdown
Contributor Author

n1v0lg commented Nov 13, 2024

💚 All backports created successfully

Status Branch Result
8.x
8.16

Questions ?

Please refer to the Backport tool documentation

n1v0lg added a commit to n1v0lg/elasticsearch that referenced this pull request Nov 13, 2024
…6392)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(elastic#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(elastic#92653).

This PR extends the above fixes to all file-settings related ITs.

(cherry picked from commit 91559da)
n1v0lg added a commit to n1v0lg/elasticsearch that referenced this pull request Nov 13, 2024
…6392)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(elastic#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(elastic#92653).

This PR extends the above fixes to all file-settings related ITs.

(cherry picked from commit 91559da)
jozala pushed a commit that referenced this pull request Nov 13, 2024
Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(#92653). 

This PR extends the above fixes to all file-settings related ITs.
elasticsearchmachine pushed a commit that referenced this pull request Nov 13, 2024
…116709)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(#92653).

This PR extends the above fixes to all file-settings related ITs.

(cherry picked from commit 91559da)
elasticsearchmachine pushed a commit that referenced this pull request Nov 14, 2024
…116710)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(#92653).

This PR extends the above fixes to all file-settings related ITs.

(cherry picked from commit 91559da)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Nov 14, 2024
…6392)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(elastic#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(elastic#92653). 

This PR extends the above fixes to all file-settings related ITs.
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
…6392)

Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json |  

at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) |  
-- | --
  |   | at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) |  
  |   | at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) |  
  |   | at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) |  
  |   | at java.nio.file.Files.move(Files.java:1430) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) |  
  |   | at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(elastic#91863), plus addressed a
related issue around mock windows file-systems misbehaving
(elastic#92653). 

This PR extends the above fixes to all file-settings related ITs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport pending :Core/Infra/Settings Settings infrastructure and APIs Team:Core/Infra Meta label for core/infra team >test Issues or PRs that are addressing/adding tests test-windows Trigger CI checks on Windows v8.16.0 v8.17.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants