Description
If a node operator moves their data directory (e.g. the one passed via --datadir), the voting_keystore_path entries in the validator_definitions.yml file may need to be updated, but this is not necessarily obvious to the operator.
Version
This affects stable but I made a PR to unstable with a possible fix.
Present Behaviour
In the situation described above, the node operator moved the data directory (e.g. --datadir) but did not update voting_keystore_path in their validator_definitions.yml file. The following output is seen when trying to start the validator client process:
Dec 04 18:00:08 deskboy systemd[1]: Started Lighthouse Validator.
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.966 INFO Logging to file path: "/storage_4TB/lighthouse/validators/logs/validator.log"
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.966 INFO Lighthouse started version: Lighthouse/v3.2.1-6d5a2b5
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.966 INFO Configured for network name: mainnet
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.966 INFO Starting validator client validator_dir: "/storage_4TB/lighthouse/validators", beacon_nodes: ["http://localhost:5052/"]
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.966 INFO HTTP metrics server is disabled
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.967 INFO Completed validator discovery new_validators: 0
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.968 CRIT Failed to start validator client reason: Unable to initialize validators: UnableToOpenVotingKeystore(Os { code: 2, kind: NotFound, message: "No such file or directory" })
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.968 INFO Internal shutdown received reason: Failed to start validator client
Dec 04 18:00:08 deskboy lighthouse[418374]: Dec 05 00:00:08.968 INFO Shutting down.. reason: Failure("Failed to start validator client")
Dec 04 18:00:08 deskboy lighthouse[418374]: Failed to start validator client
Dec 04 18:00:08 deskboy systemd[1]: lighthousevalidator.service: Main process exited, code=exited, status=1/FAILURE
Dec 04 18:00:08 deskboy systemd[1]: lighthousevalidator.service: Failed with result 'exit-code'.
Dec 04 18:00:14 deskboy systemd[1]: lighthousevalidator.service: Scheduled restart job, restart counter is at 21402.
Dec 04 18:00:14 deskboy systemd[1]: Stopped Lighthouse Validator.
Relevant CLI flags: ./lighthouse validator_client --datadir /storage_4TB/lighthouse
The actual error logged is: CRIT Failed to start validator client reason: Unable to initialize validators: UnableToOpenVotingKeystore(Os { code: 2, kind: NotFound, message: "No such file or directory" }). The message does not say which path could not be opened, so it is not obvious what the issue is.
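Until the log message improves, an operator can check the paths by hand. The following is a minimal sketch of such a check, not part of Lighthouse: it scans the text of validator_definitions.yml line by line for voting_keystore_path entries and reports the ones that do not exist on disk. The function names keystore_paths and missing_keystores are hypothetical, and the line scan is an assumption that only holds for simple "key: value" entries; it is not a full YAML parser.

```rust
use std::path::PathBuf;

/// Collect every `voting_keystore_path` value found in the text of a
/// validator_definitions.yml file. This is a plain line scan (an assumption
/// that each entry sits on one line as `voting_keystore_path: <path>`).
pub fn keystore_paths(definitions: &str) -> Vec<PathBuf> {
    definitions
        .lines()
        .filter_map(|line| {
            // Drop indentation and a leading list marker, if any.
            let line = line.trim().trim_start_matches("- ").trim_start();
            let value = line.strip_prefix("voting_keystore_path:")?;
            // Drop surrounding whitespace and optional quotes.
            let value = value.trim().trim_matches(|c| c == '"' || c == '\'');
            if value.is_empty() {
                None
            } else {
                Some(PathBuf::from(value))
            }
        })
        .collect()
}

/// Return the keystore paths that do not exist on disk.
pub fn missing_keystores(definitions: &str) -> Vec<PathBuf> {
    keystore_paths(definitions)
        .into_iter()
        .filter(|path| !path.exists())
        .collect()
}
```

To use it, read the file with std::fs::read_to_string and pass the contents to missing_keystores; any path it returns is a candidate for the NotFound error above.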
Expected Behaviour
If we reach this code path and get the UnableToOpenVotingKeystore error, it means --datadir was specified correctly but the voting_keystore_path in the validator_definitions.yml file was not valid. It would be nice to have a short message pointing node operators to the issue in this case (I imagine it is not too uncommon to change the --datadir location when upgrading drives, migrating hosts, etc.).
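As a rough illustration of the kind of hint meant here, the sketch below appends a tip to the error text only when the underlying OS error is NotFound. The InitError enum is a hypothetical stand-in that borrows the variant name from the log; it is not Lighthouse's actual error type, and the wording of the hint is only a suggestion.

```rust
use std::fmt;
use std::io;

/// Hypothetical stand-in for the error seen in the log above.
/// It is not the type used inside Lighthouse.
#[derive(Debug)]
pub enum InitError {
    UnableToOpenVotingKeystore(io::Error),
}

impl fmt::Display for InitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InitError::UnableToOpenVotingKeystore(e) => {
                write!(f, "unable to open voting keystore: {}", e)?;
                // Only add the tip when the file is missing; other failures
                // (e.g. permissions) have a different likely cause.
                if e.kind() == io::ErrorKind::NotFound {
                    write!(
                        f,
                        ". If the data directory was moved, check that \
                         voting_keystore_path in validator_definitions.yml \
                         points to the new location"
                    )?;
                }
                Ok(())
            }
        }
    }
}
```

Keeping the hint conditional on NotFound avoids sending operators to validator_definitions.yml when the keystore exists but cannot be read for another reason.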
Steps to resolve
I will submit a PR with a proposed fix and link it to this issue. I am happy to modify the PR as needed.