Deployment Considerations documentation (#9933)
Conversation
I really like the way this has been laid out. It talks through a lot of important factors that advanced users will want to think about, especially when rolling Dask out within orgs. As you say, much of this is out of scope for core Dask.
I like how it starts with the challenges and then neatly directs users off to other projects and commercial offerings that solve those problems.
> Thanks to the efforts of the open-source community, there are tools to deploy Dask :ref:`pretty much anywhere <deployment-options>`—if you can get computers to talk to each other, you can probably turn them into a Dask cluster.
> **However, getting Dask running is often not the last step, but the first step.** This document attempts to cover some of the things *outside of Dask* you may have to think about when managing a Dask deployment.
I like how this is clearly setting the stage that these things are out of scope or on the periphery for Dask.
> Additional challenges can include getting local packages or scripts onto the cluster (and ensuring they're up to date), as well as packages installed from private Git or PyPI repos.
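To make the "local scripts onto the cluster" point concrete, here's a minimal sketch using `Client.upload_file` — the module name and single-worker local cluster are stand-ins for illustration, not a production setup:

```python
import pathlib
from distributed import Client, LocalCluster

# Stand-in for your real local code: write a tiny module to disk.
pathlib.Path("my_helpers.py").write_text("def double(x):\n    return 2 * x\n")

# Stand-in cluster; in a real deployment you'd connect to an existing scheduler.
cluster = LocalCluster(n_workers=1, processes=False)
client = Client(cluster)

# Ship the module to every current worker so tasks can import it.
client.upload_file("my_helpers.py")

def use_helper(x):
    import my_helpers  # resolves on the worker thanks to upload_file
    return my_helpers.double(x)

result = client.submit(use_helper, 21).result()
client.close()
cluster.close()
```

Note the "ensuring they're up to date" caveat from the quoted line still applies: `upload_file` is a one-shot push, so re-uploading (or a worker plugin) is needed after every code change.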
> Observability
This is super important, but I feel like this is one of the last things folks think about. I would probably move this below other sections like cost and credentials.
That was actually why I moved it up here; I know people don't usually think about it up front, but I wanted to make it more prominent since it's so important.
Also note that I mixed log retention in with metrics. Maybe those are worth splitting; I think log retention should be quite high (you're really not going to have a good time if you don't even keep logs around), but metrics usually come a bit later in your deployment journey.
> - What are we spending it on? (machines, machines that should have been turned off, network egress that shouldn't have happened, etc.)
> - Who/what is responsible?
> Non-commercial deployment tools generally don't build in this sort of monitoring. Organizations that need it either end up building their own tools, or turning to commercial deployment offerings.
Maybe soften this a bit.
```suggestion
Many deployment tools generally don't build in this sort of monitoring. Organizations that need it either end up building their own tools, or turning to commercial deployment offerings.
```
I'd originally written that, but then couldn't think of any non-commercial tools that actually did have built-in capabilities for cost monitoring. Is there something I'm not thinking of?
Arguably, Coiled doesn't even do what I've described here. Coiled can tell you how you spent your Coiled bill, but for your AWS bill, you still have to look for yourself in Cost Explorer (though this is facilitated by tags Coiled adds to all your Dask infrastructure).
> You may also have other systems on restricted networks that workers need to access to read and write data, or call APIs. Connecting to those networks could add additional complexity.
> Some organizations may have additional network security policies, such as requiring all traffic to be encrypted. Dask supports this with :doc:`TLS <tls>`, which requires additional configuration and certificate management.
Side note, but Dask Cloud Provider turns this on by default. I wonder if we should do that in more deployment tooling and make it more of an opt-out.
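For context, the "additional configuration" the quoted line alludes to looks roughly like this sketch — the certificate paths are placeholders, and these same keys can equivalently live in a Dask YAML config file:

```python
import dask

# Placeholder certificate paths; a real deployment must also provision,
# distribute, and rotate these certs, which is often the harder part.
dask.config.set({
    "distributed.comm.require-encryption": True,
    "distributed.comm.tls.ca-file": "/certs/ca.pem",
    "distributed.comm.tls.scheduler.cert": "/certs/scheduler.pem",
    "distributed.comm.tls.scheduler.key": "/certs/scheduler-key.pem",
    "distributed.comm.tls.worker.cert": "/certs/worker.pem",
    "distributed.comm.tls.worker.key": "/certs/worker-key.pem",
})
```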
Co-authored-by: Jacob Tomlinson <jacobtomlinson@users.noreply.github.com>
@jacobtomlinson I think I've addressed your comments!
jacobtomlinson left a comment:
Thanks for writing this up @gjoseph92
This document tries to cover some of the infrastructure challenges outside of Dask that people commonly run into when setting up serious (production, multi-tenant) Dask deployments. The goal is to give people who might be thinking of setting one up a more realistic picture of what it takes to run a production-grade Dask deployment.
This is spun out from https://github.com/dask/dask/pull/9912/files#r1096578227, and based loosely on @mrocklin's PyData NYC talk: https://www.youtube.com/watch?v=5hUkUj1VYW4.
cc @scharlottej13 @jacobtomlinson