Skip to content

Add basic alias support for data streams#72613

Merged
martijnvg merged 6 commits intoelastic:masterfrom
martijnvg:data_stream_aliases_basics
May 11, 2021
Merged

Add basic alias support for data streams#72613
martijnvg merged 6 commits intoelastic:masterfrom
martijnvg:data_stream_aliases_basics

Conversation

@martijnvg
Copy link
Copy Markdown
Member

@martijnvg martijnvg commented May 3, 2021

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated. When a data stream alias is used in an api then all the backing indices
of the data streams the alias currently refers to are resolved.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details: #66163 (comment)

@martijnvg martijnvg force-pushed the data_stream_aliases_basics branch from fbaa9b4 to 9732815 Compare May 3, 2021 09:15
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_alias api now support data streams, so changed this test to use get index api to test
that apis that don't data streams do not include data streams among authorised indices.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 7.x, this test uses get alias api, but that api now supports data streams,
so this yaml rest combat test needs to be muted.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @martijnvg I am testing in 7.14.0 with wildcard matching on the data streams indices and it does not seem to work. Not sure what I did wrong here
image

image

But not using wild card does work

image

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @hungnguyen-elastic, thanks for reporting! I checked this locally and ran into the same issue. I will work on a fix for this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #75456 to track this bug.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @martijnvg

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
elastic#66163 (comment)

Relates to elastic#66163
@martijnvg martijnvg force-pushed the data_stream_aliases_basics branch from 801c2bf to 63ad677 Compare May 4, 2021 09:26
@martijnvg martijnvg marked this pull request as ready for review May 4, 2021 09:27
@martijnvg martijnvg requested a review from danhermann May 4, 2021 09:27
@elasticmachine elasticmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label May 4, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

Copy link
Copy Markdown
Contributor

@danhermann danhermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely a lot of stuff happening in this PR! I have some suggestions, but nothing blocking.

  • I'd suggest getting @jrodewig to look over the doc changes for consistency, wording, etc.
  • I believe the merge conflict in DataStreamMetadata is due to a recent change by Armin that makes that class immutable. I'd suggest doing the same thing with the new dataStreamAliases member that you're adding here.
  • Along the lines of the previous item, I'd suggest making DataStreamAlias immutable as its list of data streams is currently mutable.
  • I wonder if it makes sense to introduce a new IndexAbstraction.Type member since index aliases and data stream aliases are different. As it stands, IndexAbstraction.DataStreamAlias is the only implementation of IndexAbstraction that does not have its own type.
  • More broadly, and perhaps not a question for this PR specifically, but with data stream aliases limited to data streams, they won't support the use case of migrating a Kibana dashboard gradually from indices to data streams. Were we to want to support that use case in the future, it seems like it would require a lot of rework to this implementation rather than just an iterative refinement of it.

@martijnvg
Copy link
Copy Markdown
Member Author

Thanks for reviewing @danhermann!

I believe the merge conflict in DataStreamMetadata is due to a recent change by Armin that makes that class immutable. I'd suggest doing the same thing with the new dataStreamAliases member that you're adding here.

👍

Along the lines of the previous item, I'd suggest making DataStreamAlias immutable as its list of data streams is currently mutable.

👍

I wonder if it makes sense to introduce a new IndexAbstraction.Type member since index aliases and data stream aliases are different. As it stands, IndexAbstraction.DataStreamAlias is the only implementation of IndexAbstraction that does not have its own type.

Data stream alias are a different kind of alias compared to indices aliases, but both are similar enough that the functionality that these kinds of aliases provide can abstracted behind the IndexAbstraction.Type.ALIAS enum. If we would introduce another IndexAbstraction.Type enum instance then a lot of checks in the codebase would check for IndexAbstraction.Type.ALIAS and IndexAbstraction.Type.DATA_STREAM_ALIAS, which I think in many places would be verbose.

Also for someone setting up aliases, there isn't a real notion of data stream aliases or indices aliases. The update aliases api or get alias api don't make a distinction between the two, it is just that an alias either refers to indices or data streams. It is just that certain alias options (like routing) aren't supported for aliases that refer to data streams. APIs that use aliases just resolve an alias to a set of indices (irregardless what type of alias or whether an alias refers to a data stream and that data stream refers to backing indices).

More broadly, and perhaps not a question for this PR specifically, but with data stream aliases limited to data streams, they won't support the use case of migrating a Kibana dashboard gradually from indices to data streams. Were we to want to support that use case in the future, it seems like it would require a lot of rework to this implementation rather than just an iterative refinement of it.

I don't see data stream aliases play a role in this case. If an alias is used in front of these indices then the migrate data stream api can be used. If not, then currently there isn't much we can do to help migration to data streams. We would need ES to support index renaming (atomically) in order to be of any help here.

@martijnvg martijnvg requested a review from jrodewig May 10, 2021 08:52
Copy link
Copy Markdown
Contributor

@jrodewig jrodewig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @martijnvg. I left some suggestions, but feel free to ignore them if wanted.

Some additional doc changes are required, but I'll handle those in a separate PR. I'll add you as a reviewer there.

martijnvg and others added 2 commits May 10, 2021 19:57
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
@martijnvg
Copy link
Copy Markdown
Member Author

@elasticmachine update branch

@martijnvg
Copy link
Copy Markdown
Member Author

Thanks for reviewing @jrodewig!

@martijnvg martijnvg merged commit 6689b8b into elastic:master May 11, 2021
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
Backporting elastic#72613 to 7.x.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
elastic#66163 (comment)

Relates to elastic#66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
Backporting #72613 to 7.x.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
#66163 (comment)

Relates to #66163

* made code compatible with java 8
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
jrodewig added a commit that referenced this pull request May 17, 2021
#72613 adds data stream support to aliases.
This adds an `alias` glossary entry and removes out the current `index alias` entry.
jrodewig added a commit that referenced this pull request May 17, 2021
#72613 adds data stream support to aliases.
This adds an `alias` glossary entry and removes out the current `index alias` entry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>feature :StorageEngine/Data streams Data streams and their lifecycles Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v7.14.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants