CI build, doc-build: Run containers explicitly, separate jobs for pyright, build, modularized tests, long tests#36498
Conversation
|
|
|
via #36443 |
|
I don't get the point of this PR. It seems to mostly complicate the setup and essentially reimplements the container feature of github. For the out-of-space issue it would be nice to first know what's the route cause to be able to evaluate if steps to fixing it go into the correct direction. For potential speedups by caching the docker image, I'm not sure if this is really needed. Yes, from a conceptual viewpoint you only want to build sagelib once, and then reuse it - but well, it mostly takes 2-10 mins of incremental compilation anyway. Personally, I think we have more pressing issues with the ci setup and better not introduce more moving parts (atm at least). |
It's really not very complicated. GitHub runs the container from its very full runner image. We know that it does not have enough space for our large containers. |
That's right. The declarative |
Actually, the compilation can take much longer when low level parts of sagelib are touched, and when certain packages are updated. |
Github provides 14gb of space (after its runner image), that should be sufficient (and was sufficient) to run a 7gb image in it and install sage. Also our image got bigger by a few houndred mbs. |
|
If you want to understand more details, I'd suggest that you experiment with the space parameters in https://github.com/sagemath/sage/blob/develop/.github/workflows/docker.yml#L148 |
|
It's already quite easy to run commands outside the container. Here is an example for exactly your first use case: https://github.com/osquery/osquery/blob/a916e80a895f8abb9b749161d774d528d0466b31/.github/workflows/hosted_runners.yml#L257-L263 I don't think we need more flexibility. |
Thanks for finding this.
Well, I already know that we do need it, namely for sharing the build, which I'll do in a follow up. |
…remental build more incremental via git, via git, .ci/retrofit-worktree.sh
|
merge conflict |
sagemathgh-36498: CI build, doc-build: Run containers explicitly, separate jobs for pyright, build, modularized tests, long tests <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> <!-- Why is this change required? What problem does it solve? --> Running the containers explicitly, instead of using the declarative `container:` feature of GH Actions gives us more control: - we can create more space on the host if necessary; we just scraped by an out of space condition in sagemath#36473 / sagemath#36469 (comment) - we can run some operations outside of the container but in the same job; this will make the separate "Get CI fixes" jobs unnecessary, addressing the cosmetic concerns from sagemath#36338 (comment), sagemath#36349 - it enables caching between the various workflows (as first discussed in sagemath#36446). We split out static type checking with pyright into its separate workflow: - **pyright.yml**: As static checking does not need a build of the Sage library, for PRs that do not make any changes to packages, there's nothing to build, and the workflow gives a fast turnaround just after 10 minutes. For applying the CI fixes from blocker tickets, this workflow uses the technique favored in sagemath#36349. The workflow **build.yml** first launches a job: - **test-new:** It builds incrementally (using a tox-generated `Dockerfile` and https://github.com/marketplace/actions/build-and-push- docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change. After this is completed, more jobs are launched: - **test-mod:** It again builds incrementally and tests a modularized distribution. Later (with more from sagemath#35095), more jobs will be added to this matrix job for other distributions. - **test-long:** It again builds incrementally and runs the long test. The workflows **doc-build.yml** and **doc-build-pdf.yml** again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.) Splitting the workflow jobs implements @kwankyu's suggestion from: - sagemath#35652 (comment) (Fixes sagemath#35814) The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache- backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion. There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - Depends on sagemath#36938 (merged here) - Depends on sagemath#36951 (merged here) - Depends on sagemath#37351 (merged here) - Depends on sagemath#37750 (merged here) <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36498 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
sagemathgh-36498: CI build, doc-build: Run containers explicitly, separate jobs for pyright, build, modularized tests, long tests <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> <!-- Why is this change required? What problem does it solve? --> Running the containers explicitly, instead of using the declarative `container:` feature of GH Actions gives us more control: - we can create more space on the host if necessary; we just scraped by an out of space condition in sagemath#36473 / sagemath#36469 (comment) - we can run some operations outside of the container but in the same job; this will make the separate "Get CI fixes" jobs unnecessary, addressing the cosmetic concerns from sagemath#36338 (comment), sagemath#36349 - it enables caching between the various workflows (as first discussed in sagemath#36446). We split out static type checking with pyright into its separate workflow: - **pyright.yml**: As static checking does not need a build of the Sage library, for PRs that do not make any changes to packages, there's nothing to build, and the workflow gives a fast turnaround just after 10 minutes. For applying the CI fixes from blocker tickets, this workflow uses the technique favored in sagemath#36349. The workflow **build.yml** first launches a job: - **test-new:** It builds incrementally (using a tox-generated `Dockerfile` and https://github.com/marketplace/actions/build-and-push- docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change. After this is completed, more jobs are launched: - **test-mod:** It again builds incrementally and tests a modularized distribution. Later (with more from sagemath#35095), more jobs will be added to this matrix job for other distributions. - **test-long:** It again builds incrementally and runs the long test. The workflows **doc-build.yml** and **doc-build-pdf.yml** again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.) Splitting the workflow jobs implements @kwankyu's suggestion from: - sagemath#35652 (comment) (Fixes sagemath#35814) The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache- backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion. There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - Depends on sagemath#36938 (merged here) - Depends on sagemath#36951 (merged here) - Depends on sagemath#37351 (merged here) - Depends on sagemath#37750 (merged here) <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36498 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
|
merge conflict |
… tool versions (variant)
…our script packages
…use of {envpython} by just python3
…ing spkg-check, source sage-env etc.
…Replace use of {envpython} by just python3"
This reverts commit cca2b8d.
…able.txt.m4: Copy from sagemath#35095
Running the containers explicitly, instead of using the declarative
container:feature of GH Actions gives us more control:We split out static type checking with pyright into its separate workflow:
The workflow build.yml first launches a job:
Dockerfileand https://github.com/marketplace/actions/build-and-push-docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change.After this is completed, more jobs are launched:
The workflows doc-build.yml and doc-build-pdf.yml again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.)
Splitting the workflow jobs implements @kwankyu's suggestion from:
The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache-backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion.
There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches
📝 Checklist
⌛ Dependencies
pkgs/sagemath-standard: Move metadata fromsetup.cfgtopyproject.toml#36951 (merged here)ubuntu-trusty,opensuse-{15.3,15.4}#37351 (merged here)