
TRA waits when an index doesn't exist but fails immediately when shard is not found #20279

@bleskes


TransportReplicationAction (TRA) is currently inconsistent in how it deals with requests that refer to things that don't exist (which is different from things that exist but are not available).

  1. When an index is not found in the cluster state, we go into a retry loop where we wait for the index to appear.
  2. When a request comes in for a shard that doesn't exist (i.e., the shard id is higher than the number of shards), we fail immediately, as it will never appear.
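
To make the inconsistency concrete, here is a minimal, simplified model of the two cases above. It is not Elasticsearch code: `CurrentRerouteSketch`, `Outcome`, and the `Map`-based cluster state are hypothetical stand-ins, and it assumes shard ids are zero-based so that any id at or above the shard count is out of range.

```java
import java.util.Map;

// Hypothetical, simplified model of the behaviour described above; not the real TRA code.
public class CurrentRerouteSketch {
    enum Outcome { PROCEED, RETRY_ON_CLUSTER_STATE_CHANGE, FAIL_IMMEDIATELY }

    // clusterState maps an index name to its number of shards (assumed stand-in for the cluster state).
    static Outcome reroute(Map<String, Integer> clusterState, String index, int shardId) {
        Integer numberOfShards = clusterState.get(index);
        if (numberOfShards == null) {
            // Case 1: unknown index, so wait for it to appear and retry.
            return Outcome.RETRY_ON_CLUSTER_STATE_CHANGE;
        }
        if (shardId < 0 || shardId >= numberOfShards) {
            // Case 2: unknown shard, so fail right away because it will never appear.
            return Outcome.FAIL_IMMEDIATELY;
        }
        return Outcome.PROCEED;
    }
}
```

Both inputs refer to something that does not exist, yet one is retried and the other is rejected.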

This is surprising and we should fix it.

In my opinion we should:

  1. Require ReplicationRequests to have a complete ShardId when they get to the reroute phase in TRA.
  2. Fail immediately when that shard id cannot be resolved.
  3. Change TransportIndexAction and similar write actions to resolve the incoming requests and set their proper shard id (with index uuid). If they need to create the index, they can go ahead, but then it's up to them to also wait until the current (data) node knows about the index that was just created. We can have a shared utility method for this on AutoCreateIndex.
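
The three steps could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: `ProposedRerouteSketch`, `IndexMeta`, `ShardId`, `resolve`, and `reroute` are hypothetical names, the cluster state is modelled as a plain `Map`, the target shard number is passed in rather than computed from routing, and "create the index and wait until the local node knows about it" is collapsed into a single `put`.

```java
import java.util.Map;

// Hypothetical sketch of the proposal above; none of these types are real Elasticsearch classes.
public class ProposedRerouteSketch {
    static final class IndexMeta {
        final String uuid;
        final int numberOfShards;
        IndexMeta(String uuid, int numberOfShards) { this.uuid = uuid; this.numberOfShards = numberOfShards; }
    }

    static final class ShardId {
        final String index;
        final String indexUuid;
        final int id;
        ShardId(String index, String indexUuid, int id) { this.index = index; this.indexUuid = indexUuid; this.id = id; }
    }

    // Step 3: the write action resolves a complete shard id before handing over to TRA,
    // creating the index first if it is allowed to.
    static ShardId resolve(Map<String, IndexMeta> clusterState, String index, int shard, boolean autoCreate) {
        IndexMeta meta = clusterState.get(index);
        if (meta == null) {
            if (!autoCreate) {
                throw new IllegalArgumentException("no such index [" + index + "]");
            }
            // Stand-in for "create the index and wait until this node knows about it".
            meta = new IndexMeta("uuid-" + index, 1);
            clusterState.put(index, meta);
        }
        return new ShardId(index, meta.uuid, shard);
    }

    // Steps 1 and 2: reroute only accepts a complete shard id and never waits;
    // anything it cannot resolve fails immediately.
    static void reroute(Map<String, IndexMeta> clusterState, ShardId shardId) {
        IndexMeta meta = clusterState.get(shardId.index);
        boolean resolvable = meta != null
            && meta.uuid.equals(shardId.indexUuid)
            && shardId.id >= 0
            && shardId.id < meta.numberOfShards;
        if (!resolvable) {
            throw new IllegalStateException("cannot resolve shard [" + shardId.index + "][" + shardId.id + "]");
        }
    }
}
```

With this split, any waiting happens only in the write action that created the index, and the reroute phase treats a missing index and a missing shard the same way.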

Labels: :Distributed/CRUD, >bug
