Add option to use Compute Horde #68
return False

- if miner_state.chain_model_hash != model_hash:
+ if model_hash is not None and miner_state.chain_model_hash != model_hash:
These changes seem unnecessary? They reduce the number of code lines but obscure the log: we can no longer tell whether the hash is missing or mismatched.
I did it this way so that it's possible to do some of the checks before we have the model hash or miner coldkey from docker (to avoid downloading the model here and in compute horde later too, like in my other comment). I changed the caller code to account for that: https://github.com/deval-core/De-Val/pull/68/files/6e8da07badffeac501b426c2f19285ff18af1f5d#diff-274d3bc59fd308b41d1dcd439b1385875eb9dbf11c1dfe915c3f596c3907cb15R185-R192
If you'd like it done a different way, I'll see what I can do.
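One way to reconcile the two concerns in this thread, sketched under assumptions: skip the hash comparison when no hash is available yet, but keep separate log messages so that a deferred check and a real mismatch stay distinguishable. The `ModelState` stand-in and the function name `check_model_hash` are hypothetical and are not taken from the PR.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("validation_sketch")


@dataclass
class ModelState:
    # Hypothetical stand-in; the real class in the repository has more fields.
    chain_model_hash: str


def check_model_hash(miner_state: ModelState, model_hash: Optional[str]) -> bool:
    """Return False only on a real mismatch; defer when the hash is unknown."""
    if model_hash is None:
        # Hash not computed yet (e.g. the model has not been downloaded).
        logger.info("Model hash not available yet, deferring hash check")
        return True
    if miner_state.chain_model_hash != model_hash:
        logger.warning(
            "Model hash mismatch: chain=%s, computed=%s",
            miner_state.chain_model_hash,
            model_hash,
        )
        return False
    return True
```

With this shape, a log reader can tell the "deferred" case apart from the "mismatch" case, while the caller can still run the function before the model is downloaded.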
async def run_epoch_on_compute_horde(self, miner_state: ModelState) -> ModelState:
    # Local validation that does not require Docker container.
    is_valid = self.contest.validate_model(miner_state, None, None, 0, constants.max_model_size_gbs + 2)
This seems like it bypasses many of the validation checks we put in place to prevent cheating? Why would we need a separate validation step?
Yes, this does bypass some of the checks that were done here, but they are done later in the compute horde job: https://github.com/deval-core/De-Val/pull/68/files#diff-2c0eaf30c9dd4bcfecbb25b71de8aa26a5661f2bf60a057af68ad719a73485b6R43-R58 (all of them except the docker container size)
This was split this way due to some technical limitations:
- doing everything in the De-Val validator would require downloading the miner model twice (here and then again in the Compute Horde)
- doing everything in the Compute Horde is not possible because it does not allow network connections
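The split described above could be pictured roughly as follows. This is an illustrative sketch only: the `Submission` fields, the individual checks, and all function names are hypothetical, and the actual checks are the ones in the linked diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    # Hypothetical fields for illustration only.
    chain_model_hash: str
    computed_hash: Optional[str] = None  # known only once the model is downloaded


def local_prechecks(sub: Submission) -> bool:
    """Phase 1 (validator side): checks that need no model download."""
    return bool(sub.chain_model_hash)


def remote_checks(sub: Submission) -> bool:
    """Phase 2 (inside the job, no network): checks that need the model itself."""
    return sub.computed_hash is not None and sub.computed_hash == sub.chain_model_hash


def validate(sub: Submission) -> bool:
    # Cheap local checks run first, so an obviously invalid submission
    # never triggers a model download in the remote job.
    return local_prechecks(sub) and remote_checks(sub)
```

The point of the two phases is that the model is downloaded once, in the remote job, while anything that can be decided without it is decided beforehand.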
Branch updated from 6e8da07 to dcc480f.
Adds flag --neuron.use_compute_horde for the validator, to perform computations on the Compute Horde instead of locally.
Warning
Compute Horde is not 100% production ready yet; don't set this flag in real validators.
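For readers unfamiliar with dotted flag names, here is a minimal sketch of how such a flag can be declared and read with plain `argparse`. It assumes the flag is a boolean switch that defaults to off; the validator itself may register it through its own configuration layer, and the helper names `build_parser` and `use_compute_horde` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="validator flag sketch")
    parser.add_argument(
        "--neuron.use_compute_horde",
        action="store_true",
        default=False,
        help="Run computations on the Compute Horde instead of locally.",
    )
    return parser


def use_compute_horde(argv: list[str]) -> bool:
    args = build_parser().parse_args(argv)
    # The derived attribute name contains a dot, so it must be read via getattr.
    return getattr(args, "neuron.use_compute_horde")
```

Calling `use_compute_horde([])` keeps the local path, and `use_compute_horde(["--neuron.use_compute_horde"])` selects the Compute Horde path.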
Usage
Requires some new COMPUTE_HORDE_ settings in .env:
To run the validator using Compute Horde, register as a validator on SN15 (see the README for validators) and run:
You can also use pm2 as recommended.
To run a single job (miner model) on the Compute Horde without starting a validator loop – this doesn't require registration and acts as a "sanity check":
(replace the flag values with any model you want to validate)
Implementation
The flow looks roughly like this:
neurons/compute_horde_entrypoint.py)
TODO
compute-horde-sdk from PyPI rather than git