[16.10] Backport #3106 and #3222: Cached conda environments and API to manage them #3227

Merged
martenson merged 13 commits into galaxyproject:release_16.10 from mvdbeek:cached_deps_16.10
Nov 29, 2016

@mvdbeek mvdbeek commented Nov 29, 2016

Backport #3106 and #3222.

It appears that building conda environments in the job directory is problematic in some larger production environments (true for @lecorguille, @lparsons and myself).
Since that seriously cripples Conda usage, I think we should backport this to release_16.10.

Ping @bgruening @jmchilton

mvdbeek and others added 13 commits November 29, 2016 17:41
Similar to galaxyproject#2986, implement
a mechanism that allows tool dependencies to be cached.

If the `use_cached_dependency_manager` option is set to True in
galaxy.ini, we build a hash of the combination of a tool's requirements,
and store the resulting environment in a directory specified by the
`tool_dependency_cache_dir` option in galaxy.ini.
to updates or changes in dependencies, folder structure and resolver
configuration. Instead of hashing name, type and version of a
dependency, hash the json representation of the dependencies returned by
the dependency resolver, which include the path to the environment and
the dependency type. This is only applied to resolvers whose cacheable
attribute is set to True (conda-only, currently).
and only activate cached environments if they exist.
and override __eq__ for ToolRequirement, to simplify
checking if ToolRequirements are already installed/cached.
and to build a tool dependency cache (if activated in galaxy.ini).

An example to install dependencies for the twobit converter:
```
import bioblend.galaxy

url = 'http://localhost:8080'  # no trailing slash, so the join below yields a single '/'
api_key = 'admin_api_key'  # API key of an admin user
tool_id = 'CONVERTER_fasta_to_2bit'
endpoint = "api/tools/%s/install_dependencies" % tool_id
gi = bioblend.galaxy.GalaxyInstance(url, api_key)
# POST to the tool's install_dependencies endpoint
gi.make_post_request("/".join((url, endpoint)), payload={'id': tool_id})
```
If `use_cached_dependency_manager` is activated in the galaxy.ini,
a cached environment can be built like this:
```
endpoint = "api/tools/%s/build_dependency_cache" % tool_id
gi.make_post_request("/".join((url, endpoint)), payload={'id': tool_id})
```
This will remove a pre-existing cached dependency directory.
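
The caching idea described in the commit messages above (hash the JSON representation of the resolved dependencies, store the environment under `tool_dependency_cache_dir`, and only activate a cached environment if it exists) can be illustrated with a minimal sketch. This is not Galaxy's implementation: the function names, the dictionary fields, the use of SHA-256 and the 8-character directory name are all assumptions made for the example.

```python
import hashlib
import json
import os
import tempfile


def cached_env_dir(cache_dir, resolved):
    """Map a list of resolved-dependency dicts to a directory under cache_dir."""
    # Sort the serialised entries so the listing order does not affect the hash.
    entries = sorted(json.dumps(dep, sort_keys=True) for dep in resolved)
    digest = hashlib.sha256(json.dumps(entries).encode("utf-8")).hexdigest()
    # Hypothetical naming scheme: first 8 hex characters of the digest.
    return os.path.join(cache_dir, digest[:8])


def select_cached_env(cache_dir, resolved):
    """Return the cached environment path, or None if it has not been built."""
    path = cached_env_dir(cache_dir, resolved)
    return path if os.path.isdir(path) else None


# Hypothetical resolver output; field names and values are made up.
resolved = [{
    "name": "example-tool",
    "version": "1.0",
    "dependency_type": "conda",
    "environment_path": "/opt/conda/envs/__example-tool@1.0",
}]
# Same dependency, but resolved to an environment in a different location.
moved = [dict(resolved[0], environment_path="/srv/conda/envs/__example-tool@1.0")]

cache_dir = tempfile.mkdtemp()
before = select_cached_env(cache_dir, resolved)   # nothing cached yet
os.makedirs(cached_env_dir(cache_dir, resolved))  # stand-in for building the env
after = select_cached_env(cache_dir, resolved)    # cached directory now exists
```

Because the environment path is part of the hashed data, `moved` maps to a different cache directory than `resolved`, which is the sensitivity to folder structure that the commit message describes.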
@abretaud

(I would love this too 😄 )

@jmchilton

+1 Alright - it is marked as beta I guess anyway and certainly main could use it to eliminate the expensive copying @natefoo is doing right now.

@natefoo

natefoo commented Nov 29, 2016

Yeah I see this as a bug fix since w/o it, many deployments will be broken (and since it's optional it won't break any existing working deployments).

+1

@martenson martenson merged commit d6f0e6b into galaxyproject:release_16.10 Nov 29, 2016
@lecorguille

Are you considering updating the conda version along with this fix?
I have some issues with 3.19.3 and some Bioconductor packages: galaxyproject/planemo#604

    def hash_requirements(self, resolved_dependencies):
        """Return hash for requirements"""
        resolved_dependencies = [[(dep.name, dep.version, dep.exact, dep.dependency_type) for dep in resolved_dependencies]]
        hash_str = json.dumps(sorted([resolved_dependencies]))
Member

@mvdbeek There are a few extra brackets here:

        resolved_dependencies = [(dep.name, dep.version, dep.exact, dep.dependency_type) for dep in resolved_dependencies]
        hash_str = json.dumps(sorted(resolved_dependencies))

But this change may break previously created cached environments.
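
To make both points concrete, here is a small standalone sketch comparing the two variants. The `Dep` tuple is a hypothetical stand-in for the resolved dependency objects, the package names and versions are made up, and SHA-256 is used only for illustration (the hashing step itself is not shown in the snippet above).

```python
import hashlib
import json
from collections import namedtuple

# Hypothetical stand-in for a resolved dependency object.
Dep = namedtuple("Dep", ["name", "version", "exact", "dependency_type"])


def digest(hash_str):
    return hashlib.sha256(hash_str.encode("utf-8")).hexdigest()


def hash_with_extra_brackets(deps):
    # Variant from the diff: sorted() only sees a one-element list,
    # so the inner tuples are never reordered.
    rows = [[(d.name, d.version, d.exact, d.dependency_type) for d in deps]]
    return digest(json.dumps(sorted([rows])))


def hash_flat(deps):
    # Suggested variant: sorted() operates on the tuples themselves.
    rows = [(d.name, d.version, d.exact, d.dependency_type) for d in deps]
    return digest(json.dumps(sorted(rows)))


a = Dep("tool-a", "1.3", True, "conda")
b = Dep("tool-b", "0.7", True, "conda")

order_matters_nested = hash_with_extra_brackets([a, b]) != hash_with_extra_brackets([b, a])
order_matters_flat = hash_flat([a, b]) != hash_flat([b, a])
variants_differ = hash_with_extra_brackets([a, b]) != hash_flat([a, b])
```

With the extra brackets the hash depends on the order in which the dependencies are listed, while the flat variant does not. The two variants also serialise to different JSON, so they yield different hashes for the same dependencies, which is why environments cached under the old hash would no longer be found.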

Member Author

Yes, how did I miss that!? :(.
Let's change it while it's not in wide use.

@mvdbeek mvdbeek deleted the cached_deps_16.10 branch June 12, 2018 12:56