pycross: Add patching support to py_wheel_library#1436
pycross: Add patching support to py_wheel_library#1436aignas merged 27 commits intobazel-contrib:mainfrom
Conversation
This patch adds a few arguments to `py_wheel_library` to simulate how `http_archive` accepts patch-related arguments. I also amended the existing test to validate the behaviour at a very high level. References: bazel-contrib#1360
| ), | ||
| "python_version": attr.string( | ||
| doc = "The python version required for this wheel ('PY2' or 'PY3')", | ||
| values = ["PY2", "PY3", ""], |
There was a problem hiding this comment.
nit: we should probably remove support for PY2, which is probably outside the scope of this PR.
There was a problem hiding this comment.
Yeah, I can submit a follow-up PR.
tests/pycross/BUILD.bazel
Outdated
| patch_args = [ | ||
| "-p1", | ||
| ], | ||
| patch_tool_target = ":unix_patcher", |
There was a problem hiding this comment.
Should we instead just use patch_tool? Right now we are spawning python with ctx.actions which spawns Python which spawns Python which spawns patch. If there is a reason for using Python, maybe it is worth a comment?
I see you want to test on Windows as well as UNIX and now Windows just does nothing for now. Consider just adding another py_wheel_library target, which does patching which is incompatible with windows. Then you would need an extra test file that would be checking the patched contents. IMHO the lines of code might be the same as having unix_patcher.py.
There was a problem hiding this comment.
I considered that, but wanted to mirror the arguments for http_archive here. In http_archive, the patch_tool argument behaves the same as I have it here. Since it's a repository rule, it cannot support a label for a binary target. So I decided to create a new attribute called patch_tool_target. At least that was my original reasoning for it. If you feel strongly, I could be convinced to rename this to patch_tool and rename the string-based version to patch_tool_cmd or something like that.
| doc = "The patch(1) utility from the host to use. " + | ||
| "If set, overrides `patch_tool_target`.", | ||
| ), | ||
| "patch_tool_target": attr.label( | ||
| executable = True, | ||
| cfg = "exec", | ||
| doc = "The label of the patch(1) utility to use. " + | ||
| "Only used if `patch_tool` is not set.", |
There was a problem hiding this comment.
Do we restrict users to only use patch(1) tool?
There was a problem hiding this comment.
No. This is simply copied from the http_archive implementation. I'm happy to change the wording, but keeping it similar to http_archive seemed reasonable at the time at least. I think part of the reason for writing it as patch(1) explicitly is to imply the API (i.e. patch on stdin).
WDYT?
aignas
left a comment
There was a problem hiding this comment.
I am not sure that having patching as part of the build action is the best way. Patches in general can modify the METADATA that is used to generate dependencies and as you mentioned, if we split patching into a separate action, then we may end up with double the space in the build action cache.
If we introduce patching at this layer, then we will have to have a different type of patching for the dependencies. What is more, making this hermetic requires the user to go through extra hoops and just leaving it in the repository rule world and non-hermetic by construction segregates the responsibilities slightly better, IMHO.
That said, let's discuss as I am happy to be convinced that my view is wrong. :)
You mean patching what the dependencies themselves are? If so, you are correct. That can be done by modifying the intermediate file itself. The same place where you introduce what patches to apply.
That does sound nice, but that's not really what my goal is. My goal is to cut the logic in repository rules to an absolute minimum. Everything else should go into actions. I.e. I really just want to download the wheel, set up some minimal BUILD files that specify the dependencies and such, and leave everything else to actions. Having a non-hermetic patch tool in an action is really no worse than having a non-hermetic patch tool in repository context. Making the patching hermetic is easier in action context because you can depend on the output of other actions (e.g. a bazel-compiled |
aignas
left a comment
There was a problem hiding this comment.
I am not sure that having patching as part of the
buildaction is the best way. Patches in general can modify theMETADATAthat is used to generate dependencies and as you mentioned, if we split patching into a separate action, then we may end up with double the space in the build action cache.
If we introduce patching at this layer, then we will have to have a different type of patching for the dependencies.You mean patching what the dependencies themselves are? If so, you are correct. That can be done by modifying the intermediate file itself. The same place where you introduce what patches to apply. https://github.com/philsc/rules_python/blob/unreviewed/phil/pypi_install/examples/pypi_install/intermediate_file_patcher.py
Thanks for the link, this will give me a little bit of more context.
What is more, making this hermetic requires the user to go through extra hoops and just leaving it in the repository rule world and non-hermetic by construction segregates the responsibilities slightly better, IMHO.
That does sound nice, but that's not really what my goal is. My goal is to cut the logic in repository rules to an absolute minimum. Everything else should go into actions. I.e. I really just want to download the wheel, set up some minimal BUILD files that specify the dependencies and such, and leave everything else to actions.
Having a non-hermetic patch tool in an action is really no worse than having a non-hermetic patch tool in repository context. Making the patching hermetic is easier in action context because you can depend on the output of other actions (e.g. a bazel-compiled
patchbinary).
Having non-hermetic tools within the action leads to the results of that action go into the action cache, which may make the cached entries harder to trust. That said, we can mark the actions with tags like no-cache to ensure that they don't get cached, whereas in the repository_ctx we don't have this choice.
Anyway, I am pro having this merged and discussing more as we go.
This patch adds a few arguments to
py_wheel_libraryto simulate howhttp_archiveaccepts patch-related arguments.I also amended the existing test to validate the behaviour at a very
high level.
References: #1360