Make data streams in APIs resolvable.#54726
Conversation
This change makes data stream resolvable in all APIs. For APIs that have `FORBID_ALIASES_TO_MULTIPLE_INDICES` set in the index options, this means that a data stream is resolved to the latest backing index. For all other APIs this means that a data stream is resolved to all backing indices. Additionally also a `IGNORE_DATA_STREAMS` indices option is added, so that for specific APIs data stream resolvability can be disabled. Relates to elastic#53100
|
Pinging @elastic/es-core-features (:Core/Features/Data streams) |
It is used in the Metadata class and otherwise node commands can't load the cluster state.
to a change in the error message, this pr now returns `alias` with originates from the index abstraction type's display name.
… indices option. Data streams are not resolved unless the INCLUDE_DATA_STREAMS indices option is specified. Only search, msearch, index (with op type create) and refresh requests are configured with the INCLUDE_DATA_STREAMS indices option. All other requests / apis don't resolve data streams.
danhermann
left a comment
There was a problem hiding this comment.
The latest PR updates LGTM
|
I think we should we add a few REST tests too that checks that doing index admin ops against a data-stream fails? Like we have also for aliases, for instance when deleting an index. I am not sure we need full coverage on that, but would be good to ensure a handful of most important cases are handled, like delete index, update settings, update mapping, GET index (should that fail, I think so?). |
henningandersen
left a comment
There was a problem hiding this comment.
@martijnvg let me know your thoughts on moving the includeDataStreams option from IndicesOptions to IndexNameExpressionResolver and failing always when hitting data-stream (and not wanting to resolve them)?
server/src/main/java/org/elasticsearch/action/support/IndicesOptions.java
Show resolved
Hide resolved
| @@ -133,12 +133,19 @@ public enum Option { | |||
| new IndicesOptions(EnumSet.of(Option.ALLOW_NO_INDICES, Option.FORBID_CLOSED_INDICES), | |||
| EnumSet.of(WildcardStates.OPEN, WildcardStates.HIDDEN)); | |||
| public static final IndicesOptions STRICT_EXPAND_OPEN_FORBID_CLOSED_IGNORE_THROTTLED = | |||
There was a problem hiding this comment.
If we stick to this approach, this should also be renamed?
| concreteIndices.add(writeIndex.getIndex()); | ||
| } | ||
| } else { | ||
| if (indexAbstraction.getIndices().size() > 1 && !options.allowAliasesToMultipleIndices()) { |
There was a problem hiding this comment.
I think reusing this flag for data streams is not ideal. I wonder if we should instead flat out fail, if an expression matches a data-stream and we are not in one of the cases where data-streams are resolved.
I think I prefer this over skipping data-streams. Index, alias and data-stream are in the same namespace. I think it will be confusing to have some index admin operations that match indices and aliases, but ignores data-streams. Suppose you have:
xalias -> yindex
xindex
xdatastream
and you do POST /x*/_close, I think it would close yindex, xindex, but silently ignore xdatastream. Obviously, users should design their namings to avoid such scenarios, but I think a better default would be to fail, if an operation unexpectedly resolves to a data-stream.
There was a problem hiding this comment.
Whether an operation or api is expected to expand a wildcard expression to a data stream that depends on whether that api is setup to resolve data streams. So in this case that would mean that ignoring data streams is expected behaviour. This is at least my expectation.
There was a problem hiding this comment.
Henning and I chatted about this and we both now agree that the request should fail if data streams are not resolvable both the expression matches with one or more data streams. It makes it clear that an api doesn't support data streams (just like if a user specified a concrete data stream name) and this makes it easier to add data stream support if an api should support data streams.
| } | ||
| } | ||
|
|
||
| public void testDataSteams() { |
There was a problem hiding this comment.
nit:
| public void testDataSteams() { | |
| public void testDataStreams() { |
…ssion expands to a data stream then fail. This means the api doesn't support data streams, but the wildcard expression is resolving to a data stream. Prior to this commit data streams would be quietly ignored. We think that failing is more clear, so that it is clear that a specific api call isn't executed over a data stream.
| out.writeEnumSet(options); | ||
| if (out.getVersion().before(Version.V_8_0_0) && options.contains(Option.INCLUDE_DATA_STREAMS)) { | ||
| EnumSet<Option> copy = EnumSet.copyOf(options); | ||
| copy.remove(Option.INCLUDE_DATA_STREAMS); |
There was a problem hiding this comment.
I think we should put in a check into CreateDataStreamAction to not allow creating data streams before all nodes are on the target release (7.8)? To ensure that there are no surprises in a mixed cluster.
There was a problem hiding this comment.
👍 I will make this change in a followup change.
Backport from: #54726 The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used. In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs. Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex(). If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api). Relates to #53100
Enabled data streams and itv2 feature enabled system properties in server module's integ test task. PR elastic#54726 added java integration tests for data streams, so this is why these system properties need to be enabled when running release build.
Enabled data streams and itv2 feature enabled system properties in server module's integ test task. PR #54726 added java integration tests for data streams, so this is why these system properties need to be enabled when running release build.
Enabled data streams and itv2 feature enabled system properties in server module's integ test task. PR elastic#54726 added java integration tests for data streams, so this is why these system properties need to be enabled when running release build.
) Enabled data streams and itv2 feature enabled system properties in server module's integ test task. PR #54726 added java integration tests for data streams, so this is why these system properties need to be enabled when running release build.
This change makes data stream resolvable in APIs.
The
INCLUDE_DATA_STREAMSindices option controls whether data streams can be resolved in an api. In this pr, theINCLUDE_DATA_STREAMSindices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable theINCLUDE_DATA_STREAMSindices option for these APIs.Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the
IndexNameExpressionResolver.Context.isResolveToWriteIndex().If
isResolveToWriteIndex()returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).Relates to #53100