Improve handling of low disk watermark issues #856

@jsoriano

Description

When starting a cluster with elastic-package stack, if the host where docker is running has little disk space available, Elasticsearch will fail to allocate shards and Kibana will fail to reach a healthy state.

These failures occur once disk usage crosses the low and high disk watermarks, which are 90% and 95% respectively by default.

Currently, elastic-package just waits for Kibana to reach a healthy state until the timeout expires, and gives users no helpful information when it finally fails, leading to issues like #838. When the command times out, this is the error currently printed:

container for service "kibana" is unhealthy
Error: booting up the stack failed: running docker-compose failed: running command failed: running Docker Compose up command failed: exit status 1

This situation can be detected by checking the available disk space, or by checking Elasticsearch logs, looking for messages like high disk watermark [95%] exceeded.
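As a rough illustration of the log-based detection, the sketch below scans log output for watermark messages. The function name `findWatermarkMessage` and the regular expression are hypothetical, and the exact wording of Elasticsearch log lines is assumed from the message quoted above; nothing here is existing elastic-package code.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// watermarkPattern matches messages such as "high disk watermark [95%] exceeded".
// Assumption: the wording of the log lines follows the message quoted in this issue.
var watermarkPattern = regexp.MustCompile(`(low|high|flood stage) disk watermark \[[^\]]+\] exceeded`)

// findWatermarkMessage (hypothetical helper) returns the first watermark
// message found in the given log output, and whether one was found.
func findWatermarkMessage(logs string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		if m := watermarkPattern.FindString(scanner.Text()); m != "" {
			return m, true
		}
	}
	return "", false
}

func main() {
	// Sample log output made up for this example.
	logs := "starting node\nhigh disk watermark [95%] exceeded on node-1\n"
	if msg, found := findWatermarkMessage(logs); found {
		fmt.Println("not enough free disk space on the Docker host:", msg)
	}
}
```

A check like this could run while waiting for Kibana, so that the command fails early with a message about disk space instead of waiting for the timeout.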

The current workaround is to free disk space on the Docker host.

Proposed actions

  • Detect when the watermarks are exceeded during boot up (parsing logs or checking cluster state), and fail immediately, providing an error message about the availability of free space in the host.
  • Modify healthcheck in elasticsearch service to fail if the status is red.
  • An alternative could be to disable these thresholds (cluster.routing.allocation.disk.threshold_enabled: false) and let Elasticsearch fail to start when there is no disk space. But this can still be misleading, especially on operating systems where Docker runs in a virtual machine.
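The healthcheck idea in the second item could be sketched as follows, assuming the check has the JSON body of a `GET /_cluster/health` response, whose `status` field is `green`, `yellow` or `red`. The names `checkClusterHealth` and `errClusterRed` are hypothetical, and a red status can have causes other than disk space, so the error message only hints at it.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// clusterHealth holds the only field of the cluster health response
// that this sketch needs.
type clusterHealth struct {
	Status string `json:"status"`
}

// errClusterRed is a hypothetical sentinel error for a red cluster.
var errClusterRed = errors.New("cluster status is red (possibly low disk space on the Docker host)")

// checkClusterHealth (hypothetical helper) decodes a cluster health
// response body and returns an error if the status is red.
func checkClusterHealth(body []byte) error {
	var health clusterHealth
	if err := json.Unmarshal(body, &health); err != nil {
		return fmt.Errorf("decoding cluster health response: %w", err)
	}
	if health.Status == "red" {
		return errClusterRed
	}
	return nil
}

func main() {
	// Sample response body made up for this example.
	body := []byte(`{"cluster_name":"elasticsearch","status":"red"}`)
	if err := checkClusterHealth(body); err != nil {
		fmt.Println("healthcheck failed:", err)
	}
}
```

Returning a distinct error for the red status lets the caller stop waiting and report the likely cause, rather than surfacing only the generic "unhealthy" container error.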

Metadata

Labels

Team:Ecosystem (label for the Packages Ecosystem team)
