[DOCS] Clarify expected availability of HDFS for the HDFS Repository#25220
[DOCS] Clarify expected availability of HDFS for the HDFS Repository#25220jbaiera merged 5 commits intoelastic:masterfrom
Conversation
…r the HDFS Repository Plugin
|
Relates #25119 |
| [float] | ||
| ===== A Note on HDFS Availablility | ||
| When you create a repository, its settings are persisted in the cluster state. When a node comes online, it will | ||
| attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then |
There was a problem hiding this comment.
I would say initialize instead of create.
| ===== A Note on HDFS Availablility | ||
| When you create a repository, its settings are persisted in the cluster state. When a node comes online, it will | ||
| attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then | ||
| all nodes in the cluster must be able to reach HDFS, even when starting. If not, then the node may fail to create the |
There was a problem hiding this comment.
reach HDFS, even when starting -> reach HDFS when starting.
There was a problem hiding this comment.
For the last two lines, I would replace with something like:
If HDFS is not available, the node will fail to start.
For the last line, I don't understand exactly what you are suggesting. If the node can't start, they can't remove the repository.
There was a problem hiding this comment.
My understanding based on the logic in the RepositoriesService is that the node won't fail to start, but will rather simply log a warning with the exception thrown from the create call and continue on with initializing:
There was a problem hiding this comment.
Well, may fail to create is still incorrect then. You could say will fail to initialize and be unusable?
|
@rjernst I've addressed the reviews |
rjernst
left a comment
There was a problem hiding this comment.
LGTM, one last place to use "initialize" instead of "create".
| [float] | ||
| ===== A Note on HDFS Availablility | ||
| When you initialize a repository, its settings are persisted in the cluster state. When a node comes online, it will | ||
| attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then |
There was a problem hiding this comment.
I think the word "create" is confusing here. It should be "initialize".
|
Thanks @rjernst! |
…y-context * 'master' of github.com:elastic/elasticsearch: (21 commits) [DOCS] Clarify expected availability of HDFS for the HDFS Repository (elastic#25220) Remove some redundant 140 character checkstyle suppressions [Docs] more fix for the parent-join docs [Docs] Fix cross reference for parent-join field More advices around search speed and disk usage. (elastic#25252) Add documentation for the new parent-join field (elastic#25227) [analysis-icu] Allow setting unicodeSetFilter (elastic#20814) Introduce translog size and age based retention policies (elastic#25147) Add needs methods for specific variables to Painless script context factories. (elastic#25267) Improves snapshot logging and snapshoth deletion error handling (elastic#25264) Add unit test for PathHierarchyTokenizerFactory (elastic#24984) Deprecate tribe service Moved more token filters to analysis-common module. [Test] Make sure that SearchAfterSortedDocQueryTests uses a single threaded searcher [DOCS] Defined es-test-dir and plugins-examples-dir in index.asciidoc. (elastic#25232) Test fix - removed superfluous assertion (elastic#25247) [Test] restore BWC for parent-join now that the new mapping format is in 5.x Add a section named "relations" in the ParentJoinFieldMapper (elastic#25248) test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (elastic#25173) fix: Sort Processor does not have proper behavior with targetField (elastic#25237) ...
If a cluster is configured with an HDFS repository and a node is started, that node must be able to
reach HDFS, or else when it attempts to add the repository from the cluster state at start up it will
fail to connect and the repository will be left in an inconsistent state. Adding a blurb in the docs to
outline the expected availability for HDFS when using the repository plugin.