Shadow Replica indexes do not delete properly #17695

@abeyad

This issue documents deletion problems with shadow replica indices that were found while working on #17265. A separate PR, #17638, which improves the naming of methods in IndicesService, also adds tests, and assertions to existing tests, that reveal the issues below. These must be enabled as part of any PR that fixes the issues.

No. 1
The index file deletion logic triggered in IndicesService#deleteIndexStore(String reason, Index index, IndexSettings indexSettings) checks, before deleting any files, that the index is either not a shadow replica index or, if it is, that it was previously closed (so that no other nodes are still holding resources to it). This check is too strict: if a shadow replica index is deleted without having been closed first, the index folder itself is not deleted and remains on the file system as an empty folder. The fix needs to ensure that index directories are deleted on shadow replica index deletes as well. The following tests contain commented-out assertions that test this behavior once it is fixed:

  • IndexWithShadowReplicaIT#testIndexWithShadowReplicasCleansUp
  • IndexWithShadowReplicaIT#testShadowReplicaNaturalRelocation

Note that shared shard data is cleaned up properly for a shadow replica index that is not closed, because the shard data is deleted by the StoreCloseListener. The tests verify this with the assertPathHasBeenCleared assertion.
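
To make the behavior in No. 1 concrete, here is a minimal, self-contained sketch of the check as described above, together with one possible relaxation. It is not the Elasticsearch code: the class name `ShadowReplicaDeleteSketch`, both method names, and the boolean parameters are hypothetical and used for illustration only. The relaxation assumes that, because shard data has already been cleaned up elsewhere, the leftover index folder is empty and can be removed safely.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ShadowReplicaDeleteSketch {

    // The check as described in this issue: index contents may be deleted only
    // if the index is not a shadow replica index, or if it was closed before.
    // (Hypothetical signature; the real check works on IndexSettings.)
    static boolean canDeleteIndexContents(boolean shadowReplica, boolean previouslyClosed) {
        return !shadowReplica || previouslyClosed;
    }

    // One possible relaxation (an assumption, not the actual fix): when the
    // strict check fails, still remove the index directory if it is empty.
    // Returns true if the directory does not exist afterwards.
    static boolean removeLeftoverIndexDir(Path indexDir) throws IOException {
        if (!Files.isDirectory(indexDir)) {
            // Nothing to do if it is already gone; refuse if it is not a directory.
            return !Files.exists(indexDir);
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indexDir)) {
            if (entries.iterator().hasNext()) {
                return false; // still holds data, so leave it alone
            }
        }
        Files.delete(indexDir);
        return true;
    }
}
```

With this shape, a delete of an unclosed shadow replica index would skip the full content deletion but could still call `removeLeftoverIndexDir` so that no empty folder is left behind.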

No. 2
When a shadow replica index that was previously closed is deleted, all of the index and shard data may be deleted simultaneously by every node that receives the delete operation and invokes NodeEnvironment#deleteIndexDirectorySafe. This can lead to race conditions in which one node tries to delete a file that another node has already deleted, because both are walking the file system at the same time (using Lucene's IOUtils.rm). The failure is logged as a warning in IndicesService#deleteIndexStore(String reason, Index index, IndexSettings indexSettings), and the deletion is put on the pending queue.
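
One way to think about the race in No. 2 is that a file which is already gone could be treated as successfully deleted rather than as a failure. The sketch below illustrates that idea with plain `java.nio.file` calls. It is not Lucene's `IOUtils.rm` and not a proposed patch; the class name `TolerantTreeDelete` and the method `deleteTree` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class TolerantTreeDelete {

    // Recursively deletes root, ignoring entries that vanish while walking
    // (for example because another deleter removed them first).
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.deleteIfExists(file); // no error if it is already gone
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                if (exc instanceof NoSuchFileException) {
                    return FileVisitResult.CONTINUE; // deleted by someone else
                }
                throw exc;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null && !(exc instanceof NoSuchFileException)) {
                    throw exc;
                }
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Calling `deleteTree` twice on the same path is harmless in this sketch: the second call finds nothing and returns without error. Whether the real fix should tolerate missing files like this, or instead have a single node perform the deletion, is left open by this issue.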
