DOCS: general overview of data tiers and roles #63086
andreidan merged 21 commits into elastic:master
Pinging @elastic/es-core-features (:Core/Features/Features)
Pinging @elastic/es-docs (>docs)
dakrone
left a comment
Thanks for opening this Andrei! I left a bunch of comments and hopefully someone from the docs team can weigh in as well
docs/reference/index-modules/allocation/data_tier_allocation.asciidoc
Outdated
| Common data lifecycle management patterns revolve around transitioning the indices
| through multiple collections of nodes with different hardware characteristics in order
| to fulfil evolving CRUD, search, and aggregation needs as the indices age. The concept
| of a tiered hardware architecture is not new in {es}.
(Only my suggestion, not necessarily a requirement)
- Common data lifecycle management patterns revolve around transitioning the indices
- through multiple collections of nodes with different hardware characteristics in order
- to fulfil evolving CRUD, search, and aggregation needs as the indices age. The concept
- of a tiered hardware architecture is not new in {es}.
+ Common data lifecycle management patterns revolve around transitioning indices
+ through multiple collections of nodes with different hardware characteristics in order
+ to fulfil evolving CRUD, search, and aggregation needs as indices age.
I removed the "not new" remark because I think we could/should explicitly add a section about migrating from attribute-based transitioning to data tier transitioning, perhaps elsewhere or as a blog post.
That's a great point Lee. I believe the ILM section should advise on how to migrate. That said, I think mentioning/referencing the existing ILM tiered options/methods here is a nice bridge for that (with links going back and forth between the ILM guide and this page).
I'm happy to drop it but I find it a nice bridge towards ILM and the tiered options it enables (with and without data tiers)
I've reworded the tiers definition to emphasise that things like replicas must be configured and are not guaranteed by the tier. I also reworded the data retention text a bit so that it reads as a guideline.
Let me know if we should reword or remove more.
| is retained for months and the indices have zero replicas as they are backed by a searchable
| snapshot.
I definitely think this sentence should not be here, as it makes it sound like all of this happens automatically when data is moved to the cold tier
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
@elasticmachine update branch
debadair
left a comment
Left several comments & suggestions. Let me know if you have questions or want to discuss.
| Updates the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
| index setting in order to migrate the index to the <<modules-tiers, data tier>> corresponding
| to the current phase.
- Updates the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
- index setting in order to migrate the index to the <<modules-tiers, data tier>> corresponding
- to the current phase.
+ Moves the index to the <<modules-tiers, data tier>> that corresponds
+ to the current phase by updating the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
+ index setting.
+ {ilm-init} automatically injects the migrate action in the warm and cold
+ phases if no allocation options are specified with the <<ilm-allocate, allocate>> action.
+ If you specify an allocate action that only modifies the number of index
+ replicas, {ilm-init} reduces the number of replicas before migrating the index.
+ To prevent automatic migration without specifying allocation options,
+ you can explicitly include the migrate action and set the enabled option to `false`.
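For readers following the thread, here is a minimal sketch of the opt-out described in the suggestion above: a policy body whose warm phase includes the migrate action with `enabled` set to `false`. The helper name, the `30d` age, and the replica count are hypothetical values chosen for illustration, not taken from this PR.

```python
import json


def warm_phase_without_migration(min_age="30d", replicas=1):
    """Build an ILM policy body whose warm phase opts out of automatic migration.

    min_age and replicas are illustrative defaults, not recommendations.
    """
    return {
        "policy": {
            "phases": {
                "warm": {
                    "min_age": min_age,
                    "actions": {
                        # Allocate action that only changes the replica count.
                        "allocate": {"number_of_replicas": replicas},
                        # Explicitly disable the migrate action that ILM
                        # would otherwise inject into the warm phase.
                        "migrate": {"enabled": False},
                    },
                }
            }
        }
    }


policy = warm_phase_without_migration()
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of body that would be sent when creating or updating a lifecycle policy; sending it to a cluster is left out of the sketch.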
| Content data nodes accommodate user-created content. They enable operations like CRUD,
| search and aggregations.
I think we need a better definition of content node. Defining it in terms of "user-created content" could be interpreted as actual user-generated content, not content like a product catalog. I was trying to define it in terms of "collections of things" vs a stream of data. Maybe something like "Content data nodes store indices that contain collections of things such as a catalog of products. The value of the data in a content node remains relatively constant, and the performance requirements aren't tied to the age of the data."
I think introducing more abstract terms could complicate things further here. I believe the product catalog would usually be manually introduced in the system (i.e. user created) as opposed to being machine generated (like logs and metrics).
I wonder if it would be clearer to define "content" through examples rather than through where the content comes from?
e.g. Content data nodes store the documents that back/support application, website, and enterprise search. The value of the data in a content node remains relatively constant, and the performance requirements aren't tied to the age of the data.
docs/reference/modules/node.asciidoc
Outdated
| Warm data nodes hold indices after they are no longer being written to, but still being
| queried, usually at a lower frequency than it was in the hot tier. Lower performant
| hardware can usually be used in this tier.
- Warm data nodes hold indices after they are no longer being written to, but still being
- queried, usually at a lower frequency than it was in the hot tier. Lower performant
- hardware can usually be used in this tier.
+ Warm data nodes store indices that are no longer being regularly updated, but are still being
+ queried. Query volume is usually lower than it was while the index was in the hot tier. Less performant
+ hardware can usually be used for nodes in this tier.
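To make the hot/warm relationship in this thread concrete, the following sketch models how a tier preference list such as `data_warm,data_hot` is commonly understood to resolve: the index goes to the first listed tier that has nodes. This is a simplified illustration, not Elasticsearch's allocator; `resolve_tier` and the example role sets are hypothetical.

```python
def resolve_tier(tier_preference, available_tiers):
    """Return the first tier in the comma-separated preference list that
    appears in available_tiers, or None if no listed tier is available."""
    for tier in (name.strip() for name in tier_preference.split(",")):
        if tier in available_tiers:
            return tier
    return None


# Cluster with both hot and warm nodes: the warm tier is chosen.
both = resolve_tier("data_warm,data_hot", {"data_hot", "data_warm"})

# Cluster with only hot nodes: the preference falls back to the hot tier.
hot_only = resolve_tier("data_warm,data_hot", {"data_hot"})

print(both, hot_only)
```

Under this model, an index entering the warm phase stays on hot nodes until warm nodes exist, which is why the tier definitions describe typical hardware rather than guarantees.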
docs/reference/setup.asciidoc
Outdated
| include::modules/node.asciidoc[]
|
| include::modules/datatiers.asciidoc[]
Per previous comment, I think we want this info at the top level.
Co-authored-by: debadair <debadair@elastic.co>
dakrone
left a comment
This looks much better, thanks for working on this! I left a bunch of comments still, but they are really minor. Deb should take another look before merging also.
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Thanks for the review @dakrone
This adds general overview documentation for data tiers, the data tiers specific node roles, and their application in ILM.
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
(cherry picked from commit d588cab)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
This adds general overview documentation for data tiers
and the data tier-specific node roles.
Relates to #60848