Add shadow helper python package shadowtools#3449
Conversation
|
I think the biggest downside of adding something like this is that it'd be another place to keep in sync with the config format. really half-baked idea: I wonder if we could make this python module more-or-less replace the specification doc. If we added the text from there as comments it'd be pretty close. Potentially we could have a script that generates the markdown/html from the python module. |
|
I write many shadow config generators in python. Usually I'm reading in a config created by I wonder:
I would argue against putting the Shadow config documentation inside this somewhat extraneous python tool. That seems kind of backwards to me. I would prefer instead to put the config docs as close to the rust config code as possible (i.e., |
Yeah, I'm not sure. Keeping it in this repo is generally simpler and less work overall, and we can ensure it's always in sync with shadow itself. From a user perspective the only downside I can think of is in a CI job where you want to run a pre-built shadow binary using this python module, now you need to clone the relatively large shadow repo just to get the module. Overall that's probably better though than having to track and version the two repos independently.
It seems a little weird to push shadow-related-packages to pypy.org when we don't package shadow itself anywhere. We can define a package though, and use pip to install it. e.g. in arti's CI we install chutney with
Fair enough.
Interesting. I think we do something a little bit like this in arti to scrape rustdoc comments to generate the RPC documentation. By the way I don't know if you saw that later commits, but I did end up fleshing out the runner tool a bit more. It's kind of convenient, and maybe a useful gateway to show people how to use shadow before forcing them to write a config file. Right now it's in the same python module, but probably belongs in its own module. (Though could maybe be part of the same python package) Example: |
It looks like it can be in a subdirectory. From https://pip.pypa.io/en/stable/cli/pip_install/#examples : |
2ce1919 to
851360a
Compare
851360a to
e991012
Compare
|
Ok, I think this is about ready for proper review. If you're not crazy about the I'm also open to better names than "shadowsim" and "shadow-shell". |
robgjansen
left a comment
There was a problem hiding this comment.
This is cool!
- Don't we want a README.md in the python package? Most of the content of your initial PR message would be a good starting point to briefly explain the purpose, how it works, and how to install.
- It's probably also worth an entry in Shadow's CHANGELOG.
Feedback on names:
shadowsim--> I likeshadowpyorshadowtoolsbettershadow-shell--> I likeshadow-execbetter (reasoning in my review comments)
Other idea:
- I was thinking why not just make
tools/.the base directory rather thantools/shadowsim. But that maybe only makes sense if the python package name isshadowtools(in which case the base directory forshadowtoolswould be shadow/tools :)). But also maybe that's a stretch.
|
Here are my first impressions before reviewing. Just typing them out now before I get some food :) Looks cool!
I think having it in the shadow repo makes sense to keep things in sync. Having it in the shadow repo also means that you don't need to worry about if the version of the tool from its repo will be compatible with a specific version of shadow from the shadow repo. I think it should have tests in the CI to make sure it continues to work and doesn't become stale over time. I have mixed feelings about generating the docs from code in general. I think generating documentation from code doesn't usually turn out that great. There ends up being cases that aren't generated properly or could be done better by a person, and then we need some way to override the documentation generation script. I had tried doing this when originally moving the config code to rust, but realized after a while that it would take more time and effort to maintain the script than to just update the documentation manually when needed. It can be painful when you know what you want the documentation to say, but you need to spend an hour modifying the script to do what you could do by hand in 10 seconds. On the other hand it's not difficult to accidentally forget to update the docs, and then the docs become stale. But I think we've been doing a pretty good job at keeping the docs up to date. Another option is something like schemars which lets you derive a json schema from the rust code which you could use to build documentation, but it needs a bunch of manual editing to get human readable types, default values, etc. Anyways my thoughts are that if we can generate the docs like we have it now, and without needing a bunch of overrides for cases that the docs generation script doesn't handle well, then it might be nice to have it regardless of if it's being generated from the rust serde types or the new python config types. |
Ah, so I was originally thinking just "shadow", but there are a lot packages referencing various shadows in pypi: https://pypi.org/search/?q=shadow. Even if we never publish to pypi I figured it'd be best to avoid name collisions there. So the "sim" suffix was to distinguish it from all the other shadows being referenced there. IMO having 'py' in the name of a python package feels redundant, but it's certainly not unprecedented 🤷 Maybe also adding the "tools" suffix makes sense though since the python module doesn't completely wrap "the shadow simulator", but is indeed more of a toolset for working with it. So maybe "shadowsimtools"?
I figured I'd have the intermediate So maybe make the base directory |
Agreed :/
Yeah, it might end up being too niche to be generally useful. I definitely intend to use it in tor's CI though, so it'll have at least one user :). Maybe more use-cases will crop up; e.g. in some ways it's a more powerful alternative to I think we could use this to run a lot of our "unit" tests and delete the checked-in configs. In addition to getting rid of the configs, the simulated stdout would be a bit more visible since you wouldn't have to hunt down the data dir. I'm not sure it makes sense to go that "all in" on it yet, though. e.g.
Yeah, I'll add some tests. (Though I suppose using it to run our other tests as above would kill 2 birds with one stone 🤔 )
Yeah, same. It's probably not worth it. I feel a little bad about adding yet another place to keep synchronized with the config spec, but OTOH it shouldn't really generate that much more work, and isn't that big a deal if it falls out of sync. |
|
Added mypy static type checking to the CI. Still thinking about how best to add additional tests for On a whim, this gets surprisingly far: As far as I can tell shadow is deadlocked at that point, in the I am thinking a bit more about adding a Probably for now I'll just add a few tests to the python package dir itself rather than trying to migrate our other tests to use it. |
It's the "sim" part that I was hoping to drop. I guess I'm selfishly thinking Shadow existed long before most of those python packages so we have as much claim to the base shadow name as anyone else. I still like We can also have Steve cast a vote :) |
stevenengler
left a comment
There was a problem hiding this comment.
Looks good! Sorry for the delay, my modern python knowledge is lacking.
I don't have strong feelings here. I think I'd lean slightly towards "shadowtools" over "shadowsim" or "shadowsimtools". IMO we shouldn't care too much about what's on pypy since a name that's available today might not be available tomorrow (either coincidentally or maliciously). |
Sounds like "shadowtools" it is!
Yeah, my main concern is that I don't think a user can install this package locally AND a package of the same name from pypi at the same time. But maybe it's not that big of a deal, particularly with virtual envs. |
It's crazy to me that it gets that far, considering it's running random cmake stuff. I wonder if the
Yeah I don't think it's a big deal. It might be good for us to (at some point, doesn't need to be now) have a documentation page about how to add new config options (update rust code, update documentation, update python code, and update expected |
|
I see the PR to drop support for Debian 10, but it also looks like the PR is failing on ubuntu 20.04. If ubuntu 20.04 has too old of a python version (3.8), my vote would be to just disable testing of this python code on that platform rather than trying to support the old ubuntu version that we're dropping support for in ~5 months. |
Yeah I think it only needs a small change, but it if it turns out to be too onerous I'm fine with just not supporting it. |
|
Ok, I think that about does it! I'm sure I'll be iterating a bit more on the @robgjansen could you approve #3451 and remove debian:10 from the required tests? |
It's passed EOL: https://wiki.debian.org/LTS. Its python3 package is also pretty old (3.7), which would have required substantial changes to #3449 to support.
Currently this includes two modules: * shadowtools.config: Raw type definitions corresponding to shadow's config format. * shadowtools.shadow_exec: A CLI tool for running simple shadow simulations. The new package can be installed from a local checkout with: ``` python3 -m pip install ./shadowtools ``` Or, once merged, with something like: ``` python3 -m pip install git+https://github.com/shadow/shadow.git#subdirectory=shadowtools ```
Most python code should be able to (and now does) pass mypy in non-strict mode. The newly added shadow tools package is fully type annotated; it should and does pass mypy in strict mode.
e3c1632 to
6acea1b
Compare
|
Thanks for the reviews! |
|
Oh, @robgjansen I wasn't sure what milestone to use. Maybe a new one along the lines of "usability"? |
|
Maybe we can avoid adding a milestone for now to prevent "milestone creep"? ... IMO, those typically imply some concerted effort toward a goal, or that we put more thought into where we are headed than perhaps we did here. That being said, if we start adding lots of work in this direction, we could think more about where we are wanting to get to in the shortish term, and then it could make sense to encode that into a milestone. |
That's fine with me if you're fine with me continuing to flagrantly ignore the "check milestone" test failures for now :) |
:( well that sounds like you have some more work in mind in this direction. I made a milestone for this now. But I really want to avoid having this be a generic "usability" milestone that will be open for the rest of eternity. So if you could think a bit about when we should consider this line of work mature enough to close the milestone, that would be nice. And please update the milestone description to document your current plans :) |
|
Thanks; this LGTM! Yeah I expect minimally I'll be tinkering a bit more with the Another way of looking at the goal here is to improve the "shadow as CI tool" use-case. Generally I think |
Maybe also the "reproducible build" use-case? Though if we really wanted to support that I suppose we should fix file metadata to have the simulated dates. I'm not sure how much you care about these use-cases in general, but OTOH supporting more use-cases would hopefully help draw interest and ultimately funding. gcc seems to work, though it seems we have a bug preventing setting +x. |
|
@sporksmith I added the project 141 label, but feel free to remove if it shouldn't be. |
Oh thanks, I'd forgotten we'd created that label. I added the alt use-case examples to the milestone |
Add the shadowtools python package
Currently this includes two modules:
config format.
simulations.
The new package can be installed from a local checkout with:
Or, once merged, with something like:
Some additional motivation/context: in tor we're increasingly using shadow in CI, and starting to accumulate python scripts that generate shadow configurations. This started off as just writing the
configmodule to help a little bit with writing the python code that generates configs.We're also starting to use it to run chutney though. Soon we want to teach chutney to generate a shadow config that runs the chutney processes on different simulated hosts, but it's also useful to be able to run a "normal" chutney network and test suite under shadow in the same way that it would run natively. For simple "single host" simulations, I think it's be nice to have the
shadow-exectool.I think the
shadow-exectool could also be helpful for demoing shadow. A lot of shadow's cool properties can be shown off with it, and when someone is ready to do something more advanced they could use its generated configs as a starting point.e.g.:
"faketime" substitute (for 2000-01-01, anyway)
time "fast forwarding"
precise and deterministic time tracking
determinism:
(single host) networking:
I think we could also extend it a little bit more, maybe even to run multi-host configs as something like:
Though I'm not sure yet whether that's worth implementing; at that point it might be better to write a config file and use shadow directly.