Modernize conda environment #34
Conversation
Thank you for the draft!

DeepSpeed accepted our first upstream fix regarding the ninja detection (deepspeedai/DeepSpeed#7687). Once a new version is released, this should allow us to get rid of the PyPI ninja dependency. Of course, this fix will only come into play if we decide against the vendoring approach.
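For context, the kind of detection the upstream fix concerns can be sketched in a few lines: prefer a ninja binary already on PATH (e.g. one provided by conda) before falling back to the PyPI wheel. This is an illustrative stdlib sketch, not DeepSpeed's actual code.

```python
# Illustrative only: check for a system ninja before considering the PyPI wheel.
import shutil

ninja_path = shutil.which("ninja")
if ninja_path is None:
    print("no system ninja found; a PyPI fallback would kick in here")
else:
    print(f"using system ninja at {ninja_path}")
```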
As of 2024/12/01, packages still installed from PyPI after installing openfold3 in devel/editable mode. To investigate:

Because of aria2:
Proposed solution: remove aria2 from the PyPI dependencies, as it is currently unused in the OF3 codebase. The PyPI package is an old convenience and should in general not be used to install aria2, as it is not even such a big convenience.

Because of cuequivariance_ops_torch_cu12:
These should at the very least be aligned with the CF version, but it is likely best to just install all of them from PyPI until we understand how to deal with the license. The key question is what to do with libcublas; maybe we should add synonyms to parselmouth in pixi, although I am not 100% sure these two packages are 100% binary compatible. Currently the biggest blocker to having a conda package with these is their LICENSE. See also NVIDIA/cuEquivariance#218. It could be interesting to see if openequivariance could be a viable alternative.

Because of mkl:
Proposed solution: remove mkl from the PyPI dependencies, as it is actually unused (pytorch links it statically; numpy and scipy are not built against it and do not dynamically dispatch).
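A quick way to audit which distributions in an active environment were installed by pip (i.e. leaked in from PyPI) rather than by conda: pip writes an `INSTALLER` file into each dist-info directory. Stdlib only, a sketch rather than part of the OF3 tooling; run it with the environment's own interpreter.

```python
# List distributions whose INSTALLER metadata says "pip", i.e. packages that
# came from PyPI rather than conda. Assumes the active interpreter is the one
# inside the environment being audited.
from importlib.metadata import distributions

pypi_dists = sorted({
    name
    for dist in distributions()
    if (name := dist.metadata["Name"])
    and (dist.read_text("INSTALLER") or "").strip() == "pip"
})
print(pypi_dists)
```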
Also:
Coming back to this after the end-of-year "hiatus". Current state and TODOs:

- all tests pass, predictions seem to be correct
- corresponds to a modernized conda environment following best practices
- incomplete, we might not need the native sources from upstream commit df59f203f40c8a292dd019ae68c9e6c88f107026
- Use vendored deepspeed in the attention primitives
- …ks from pixi environment
Got it to run on a DGX box, great stuff:

```shell
run_openfold predict \
  --query_json examples/example_inference_inputs/query_ubiquitin.json \
  --runner_yaml examples/example_runner_yamls/low_mem.yml \
  --num_diffusion_samples=1 \
  --num_model_seeds=1 \
  --output_dir output/
```
```
/home/jandom/workspace/openfold-3/.pixi/envs/openfold3-cuda13-pypi/lib/python3.13/site-packages/torch/cuda/__init__.py:435: UserWarning:
Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
Minimum and Maximum cuda capability supported by this version of PyTorch is
(8.0) - (12.0)
  queued_call()
WARNING:openfold3.entry_points.experiment_runner:No version_tensor is found for this checkpoint.Assuming the user knows checkpoints are parameters are compatible, continuing...
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
💡 Tip: For seamless cloud logging and experiment tracking, try installing [litlogger](https://pypi.org/project/litlogger/) to enable LitLogger, which logs metrics and artifacts automatically to the Lightning Experiments platform.
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
WARNING:openfold3.core.data.tools.colabfold_msa_server:Using output directory: /tmp/of3_colabfold_msas for ColabFold MSAs.
WARNING:openfold3.core.data.tools.colabfold_msa_server:Mapping file /tmp/of3_colabfold_msas/mappings/seq_to_rep_id.json already exists. Appending new sequences.
WARNING:openfold3.core.data.tools.colabfold_msa_server:Mapping file /tmp/of3_colabfold_msas/mappings/rep_id_to_seq.json already exists. Appending new sequences.
Submitting 1 sequences to the Colabfold MSA server for main MSAs...
No complexes found for paired MSA generation. Skipping...
/home/jandom/workspace/openfold-3/.pixi/envs/openfold3-cuda13-pypi/lib/python3.13/multiprocessing/popen_fork.py:67: DeprecationWarning: This process (pid=1974826) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
Preprocessing templates: 100%|█████████████████████████████████████████████████████| 1/1 [00:00<00:00, 142.58it/s]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/home/jandom/workspace/openfold-3/.pixi/envs/openfold3-cuda13-pypi/lib/python3.13/site-packages/pytorch_lightning/utilities/_pytree.py:21: `isinstance(treespec, LeafSpec)` is deprecated, use `isinstance(treespec, TreeSpec) and treespec.is_leaf()` instead.
/home/jandom/workspace/openfold-3/.pixi/envs/openfold3-cuda13-pypi/lib/python3.13/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:429: Consider setting `persistent_workers=True` in 'predict_dataloader' to speed up the dataloader worker initialization.
Predicting: | | 0/? [00:00<?, ?it/s]
[the CUDA capability UserWarning above is repeated 10 more times]
Predicting DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]Seed set to 2746317213
/home/jandom/workspace/openfold-3/.pixi/envs/openfold3-cuda13-pypi/lib/python3.13/site-packages/biotite/structure/io/pdbx/convert.py:912: DeprecationWarning: `include_bonds` parameter is deprecated, intra-residue are always written, if available
  warnings.warn(
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████| 1/1 [04:47<00:00, 0.00it/s]
==================================================
PREDICTION SUMMARY (COMPLETE)
==================================================
Total Queries Processed: 1
- Successful Queries: 1
- Failed Queries: 0
==================================================
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████| 1/1 [04:47<00:00, 0.00it/s]
Removing empty log directory...
Cleaning up MSA directories...
```

However, I had to make one change. Update: all the tests pass locally, this seems good to go!
Good stuff @jandom. I was surprised the new multiprocessing config affected your box (apologies!). I have narrowed it down to changing things only on osx (in fact, I will feel more comfortable if we do not hardcode "spawn" there). Can you try again? On a positive note, tests also pass nicely on blackwell + linux-64.
I'm not sure that's what is impacting me; I haven't tried on my osx box yet, this was on a DGX.
Still broken on Linux without additional changes; this time only a config change was needed. Claude then proceeded to propose this delightful change (I'm not impressed):

```python
# macOS MPS requires fork: https://github.com/pytorch/pytorch/issues/87688
# Linux requires spawn: fork segfaults when CUDA threads are already running
# (Python 3.12+ warns, 3.14 will change the default)
multiprocess_context = None
if num_workers:
    if platform.system() == "Darwin" and torch.backends.mps.is_available():
        multiprocess_context = "fork"
    else:
        multiprocess_context = "spawn"
```

Update: everything is passing for me locally, yay! The only thing I see missing here is the Dockerfile updates? (I'm happy to add those; pixi activation is even trickier than conda activation in Docker.) Here is an idea: let's merge this into a dummy branch on this repo, so that we can start iterating on the Docker image stuff (which will be comparatively hard on your fork).
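The fork/spawn choice under discussion can be exercised without torch at all. A minimal stdlib sketch, where `has_mps` is a stand-in for `torch.backends.mps.is_available()` (names here are illustrative, not OF3 code):

```python
import multiprocessing as mp
import platform

def pick_start_method(has_mps: bool) -> str:
    # fork is needed for macOS MPS (pytorch/pytorch#87688); spawn is the safe
    # default once CUDA worker threads may already be running.
    if platform.system() == "Darwin" and has_mps:
        return "fork"
    return "spawn"

# A context built this way can be handed to DataLoader(multiprocessing_context=...).
ctx = mp.get_context(pick_start_method(has_mps=False))
print(ctx.get_start_method())
```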
We would still need to document this for users
Thanks a lot for the tests @jandom, good stuff. I guess the pae_enabled problem will be taken care of by #142. I have created a more sophisticated solution to the multiprocessing_context in 23811a4 (apologies for the force-push, and note a capitalization error fixed later). If you like it, we will need to document it; we can also revert to your solution if we prefer simplicity. Let me know :-)

As we discussed in our call, I removed our example Dockerfiles. I would suggest doing that in a subsequent PR, or we can try to bring it into this PR; in any case, if you need help, we can lend a hand. As we are removing "hacks.py" and the newer DeepSpeed versions have fixed the cutlass configuration problems, we will need to update the docs introduced here. Happy to do that too. Should we open an issue to document the failing tests for cuEquivariance > 0.7? I do not know if these are legitimate issues or if the tests need updating.
jandom left a comment:
Hell yeah, let's do this! Massive thanks to @sdvillal for putting this together.
I'm approving this after extensive testing, including a small training run (eh). It's a massive improvement over the conda workflow, which we should deprecate going forward.
I propose we squash-merge into a 'dev' branch and iterate a bit more, because I want to make sure the Docker images work as well. Deprecate conda during 2026Q2, and go pixi-only starting 2026Q3.
To my mind, we could already merge this to main (feel free to squash-merge) and iterate from there. While there are improvements that will come later, it is good if people can start playing with it while the old workflows are still in place. That would potentially give us nice intel on any trouble, allow us to submit smaller improvements as PRs, and get us to a more polished finish line before we deprecate. We should update the lock file before merging. Happy to follow your lead.
You're probably right: this doesn't actually remove anything, and the conda setup should still work (my main worry is that the research team will take a while to migrate).
With @jnwei we've got some reports that things might be breaking in training; we want to put this on a 'pixi-beta' branch for now.
Awesome, thanks @jandom. What type of problems do we have at training? Could we open issues? One thing that worries me is that the kalign-python PyPI packages bundle their own OpenMP runtimes, which I have heard is dangerous.
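On Linux, one way to check for that hazard is to scan the current process's loaded libraries for OpenMP runtimes; more than one distinct runtime in a single process is the classic failure mode. This is a hedged, illustrative helper (not OF3 code), and it only works where `/proc/self/maps` exists.

```python
import sys

def loaded_openmp_runtimes() -> set:
    # /proc/self/maps is Linux-specific; return an empty set elsewhere.
    if not sys.platform.startswith("linux"):
        return set()
    found = set()
    with open("/proc/self/maps") as maps:
        for line in maps:
            # Common OpenMP runtime names: GNU, LLVM, and Intel respectively.
            for runtime in ("libgomp", "libomp", "libiomp"):
                if runtime in line:
                    found.add(runtime)
    return found

runtimes = loaded_openmp_runtimes()
print(runtimes if runtimes else "no OpenMP runtime loaded (or not on Linux)")
```

Running this after importing the suspect packages (e.g. kalign bindings plus numpy/torch) would show whether two different runtimes end up mapped at once.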
Summary
Adds a modern conda environment following best practices to improve the quality of life of conda users.
The environment is self-contained, including a sane toolchain to build extensions fully compatible with the rest of the dependencies, and comes with batteries included (inference, bioinformatics, fast kernels, dev dependencies).
We maintain a pixi workspace and an automatically generated conda environment for non-pixi users.
We still need to iron out four known problems (see the comments in pixi.toml and upcoming issues) and add documentation.
From here, creating a conda-forge openfold3 package and a bioconda openfold3-extra package should be simple enough.
Changes
Related Issues
TBC
Testing
The current environment passes all tests and produces sensible predictions.
Other Notes
This is exploratory at the moment. Will clean up the commit history or open a clean PR when we are done.