Skip to content

[DOCS] Clarify expected availability of HDFS for the HDFS Repository#25220

Merged
jbaiera merged 5 commits intoelastic:masterfrom
jbaiera:jbaiera-hdfs-docs-clarification
Jun 16, 2017
Merged

[DOCS] Clarify expected availability of HDFS for the HDFS Repository#25220
jbaiera merged 5 commits intoelastic:masterfrom
jbaiera:jbaiera-hdfs-docs-clarification

Conversation

@jbaiera
Copy link
Copy Markdown
Member

@jbaiera jbaiera commented Jun 14, 2017

If a cluster is configured with an HDFS repository and a node is started, that node must be able to
reach HDFS, or else when it attempts to add the repository from the cluster state at start up it will
fail to connect and the repository will be left in an inconsistent state. Adding a blurb in the docs to
outline the expected availability for HDFS when using the repository plugin.

@jbaiera jbaiera added :Plugin Repository HDFS >docs General docs changes labels Jun 14, 2017
@jbaiera
Copy link
Copy Markdown
Member Author

jbaiera commented Jun 14, 2017

Relates #25119

Copy link
Copy Markdown
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few nits.

[float]
===== A Note on HDFS Availablility
When you create a repository, its settings are persisted in the cluster state. When a node comes online, it will
attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say initialize instead of create.

===== A Note on HDFS Availablility
When you create a repository, its settings are persisted in the cluster state. When a node comes online, it will
attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then
all nodes in the cluster must be able to reach HDFS, even when starting. If not, then the node may fail to create the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reach HDFS, even when starting -> reach HDFS when starting.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the last two lines, I would replace with something like:

If HDFS is not available, the node will fail to start.

For the last line, I don't understand exactly what you are suggesting. If the node can't start, they can't remove the repository.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding based on the logic in the RepositoriesService is that the node won't fail to start, but will rather simply log a warning with the exception thrown from the create call and continue on with initializing:

https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/repositories/RepositoriesService.java#L304-L306

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, may fail to create is still incorrect then. You could say will fail to initialize and be unusable?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will do.

@jbaiera
Copy link
Copy Markdown
Member Author

jbaiera commented Jun 15, 2017

@rjernst I've addressed the reviews

Copy link
Copy Markdown
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one last place to use "initialize" instead of "create".

[float]
===== A Note on HDFS Availablility
When you initialize a repository, its settings are persisted in the cluster state. When a node comes online, it will
attempt to create all repositories for which it has settings. If your cluster has an HDFS repository configured, then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the word "create" is confusing here. It should be "initialize".

@jbaiera
Copy link
Copy Markdown
Member Author

jbaiera commented Jun 16, 2017

Thanks @rjernst!

@jbaiera jbaiera merged commit 9c65073 into elastic:master Jun 16, 2017
@jbaiera jbaiera deleted the jbaiera-hdfs-docs-clarification branch June 16, 2017 13:47
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 16, 2017
…y-context

* 'master' of github.com:elastic/elasticsearch: (21 commits)
  [DOCS] Clarify expected availability of HDFS for the HDFS Repository (elastic#25220)
  Remove some redundant 140 character checkstyle suppressions
  [Docs] more fix for the parent-join docs
  [Docs] Fix cross reference for parent-join field
  More advices around search speed and disk usage. (elastic#25252)
  Add documentation for the new parent-join field (elastic#25227)
  [analysis-icu] Allow setting unicodeSetFilter (elastic#20814)
  Introduce translog size and age based retention policies (elastic#25147)
  Add needs methods for specific variables to Painless script context factories. (elastic#25267)
  Improves snapshot logging and snapshoth deletion error handling (elastic#25264)
  Add unit test for PathHierarchyTokenizerFactory (elastic#24984)
  Deprecate tribe service
  Moved more token filters to analysis-common module.
  [Test] Make sure that SearchAfterSortedDocQueryTests uses a single threaded searcher
  [DOCS] Defined es-test-dir and plugins-examples-dir in index.asciidoc.  (elastic#25232)
  Test fix - removed superfluous assertion (elastic#25247)
  [Test] restore BWC for parent-join now that the new mapping format is in 5.x
  Add a section named "relations" in the ParentJoinFieldMapper (elastic#25248)
  test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (elastic#25173)
  fix: Sort Processor does not have proper behavior with targetField (elastic#25237)
  ...
@clintongormley clintongormley added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed :Plugin Repository HDFS labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >docs General docs changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants