.zuul: Enable testing on Fedora 40 #1468

Merged
debarshiray merged 2 commits into containers:main from debarshiray:wip/rishi/zuul-test-f40
May 2, 2024

Conversation

@debarshiray
Member

No description provided.

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Mar 11, 2024
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 05e9993 to 77556ba Compare March 11, 2024 15:25
@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/457baa233ee949baa5e852fd55e58e57

✔️ unit-test SUCCESS in 5m 05s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 35s
✔️ unit-test-restricted SUCCESS in 4m 07s
✔️ system-test-fedora-rawhide SUCCESS in 34m 03s
system-test-fedora-40 TIMED_OUT in 1h 00m 20s
✔️ system-test-fedora-39 SUCCESS in 34m 35s
✔️ system-test-fedora-38 SUCCESS in 34m 23s

@debarshiray
Member Author

recheck

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/ea568dffd0ec4d4fb025df7883d85eda

✔️ unit-test SUCCESS in 4m 46s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 14s
✔️ unit-test-restricted SUCCESS in 4m 45s
✔️ system-test-fedora-rawhide SUCCESS in 33m 43s
system-test-fedora-40 TIMED_OUT in 1h 00m 25s
✔️ system-test-fedora-39 SUCCESS in 34m 35s
✔️ system-test-fedora-38 SUCCESS in 33m 22s

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Mar 11, 2024
The current timeout of 1 hour that's used for stable Fedoras has proved
insufficient for Fedora 40.  It's not clear why that is.  It's possible
that since Fedora 40 is currently Branched and pre-release Fedoras use
Linux kernels that are built with debugging enabled, it's slower than
stable Fedoras.

Therefore, the same timeout as Fedora Rawhide has been used.

containers#1468
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 77556ba to ae30307 Compare March 11, 2024 17:57
@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/faf6e71e3a1c4623aa7de40625142eef

✔️ unit-test SUCCESS in 5m 03s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 29s
✔️ unit-test-restricted SUCCESS in 4m 15s
✔️ system-test-fedora-rawhide SUCCESS in 35m 00s
system-test-fedora-40 TIMED_OUT in 1h 20m 28s
✔️ system-test-fedora-39 SUCCESS in 35m 42s
✔️ system-test-fedora-38 SUCCESS in 34m 41s

@debarshiray
Member Author

recheck

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/4132e3e669db41dfb78c2f24e8d0faca

✔️ unit-test SUCCESS in 4m 51s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 38s
✔️ unit-test-restricted SUCCESS in 4m 07s
✔️ system-test-fedora-rawhide SUCCESS in 34m 15s
system-test-fedora-40 TIMED_OUT in 1h 20m 24s
✔️ system-test-fedora-39 SUCCESS in 33m 41s
✔️ system-test-fedora-38 SUCCESS in 33m 41s

@debarshiray
Member Author

recheck

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/e4317ecd416746898f59eb190074fb78

✔️ unit-test SUCCESS in 5m 16s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 32s
✔️ unit-test-restricted SUCCESS in 3m 57s
✔️ system-test-fedora-rawhide SUCCESS in 36m 05s
system-test-fedora-40 TIMED_OUT in 1h 20m 26s
✔️ system-test-fedora-39 SUCCESS in 35m 22s
✔️ system-test-fedora-38 SUCCESS in 35m 04s

@debarshiray
Member Author

@danpawlik, @TristanCacqueray, do you have any idea what's wrong with the cloud-fedora-40 instances? It says that it's timing out, but that's not the real problem. It has the same timeout of 4800 seconds as Fedora Rawhide, which is more than the 3600 seconds that we use for the other stable Fedoras.

It gets stuck setting up the test suite and doesn't ever progress:

2024-03-13 15:22:14.061987 | TASK [Run system tests]
2024-03-13 15:22:17.033732 | fedora-40 | 1..340
2024-03-13 15:22:17.089197 | fedora-40 | # test suite: Set up
2024-03-13 16:37:39.488832 | RUN END RESULT_TIMED_OUT: [untrusted : github.com/containers/toolbox/playbooks/system-test.yaml@main]
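
The log above shows the run sitting in the suite setup until the job timeout ends it. A minimal sketch of one way to make such a hang fail fast while debugging, assuming GNU coreutils `timeout` is available on the test node. The `run_with_limit` helper and the 1-second limit are hypothetical and not part of the Toolbx test suite:

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command, but give up after LIMIT seconds.
# GNU timeout exits with status 124 when the limit is reached.
run_with_limit() {
    local limit="$1"
    shift

    local status=0
    timeout "$limit" "$@" || status=$?

    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi

    return "$status"
}

# Stand-in for a step that never finishes, such as a stuck upload.
run_with_limit 1 sleep 10 || echo "exit status: $?"
```

Wrapping only the suspect step keeps the rest of the job's time budget intact and names the command that hung, instead of leaving only RESULT_TIMED_OUT at the end of the run.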

@danpawlik
Contributor

Hi @debarshiray. No idea, I will try to take a look soon/tomorrow.

@danpawlik
Contributor

recheck

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/e3c5ba0d9acf44afbbb0a5072df1ca88

✔️ unit-test SUCCESS in 5m 09s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 40s
✔️ unit-test-restricted SUCCESS in 3m 43s
✔️ system-test-fedora-rawhide SUCCESS in 34m 51s
system-test-fedora-40 FAILURE in 9m 20s
✔️ system-test-fedora-39 SUCCESS in 34m 29s
✔️ system-test-fedora-38 SUCCESS in 36m 00s

@danpawlik
Contributor

danpawlik commented Mar 14, 2024

Using the command:

bats --timing ./test/system -x

I spotted that it gets stuck on:

$ run "$PODMAN" --root "${DOCKER_REG_ROOT}" run -d --rm --name "${DOCKER_REG_NAME}" --privileged -v "${DOCKER_REG_AUTH_DIR}":/auth -e REGISTRY_AUTH=htpasswd -e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" -e REGISTRY_AUTH_HTPASSWD_PATH="/auth/htpasswd" -v "${DOCKER_REG_CERTS_DIR}":/certs -e REGISTRY_HTTP_ADDR=0.0.0.0:443 -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key -p 50000:443 "${IMAGES[docker-reg]}"
$ assert_success
$ run "$PODMAN" login --authfile "${TEMP_BASE_DIR}/authfile.json" --username user --password user "${DOCKER_REG_URI}"
$ assert_success
$ run "$SKOPEO" copy --dest-authfile "${TEMP_BASE_DIR}/authfile.json" dir:"${IMAGE_CACHE_DIR}"/fedora-toolbox-34 docker://"${DOCKER_REG_URI}"/fedora-toolbox:34  #### HERE ####

So it might not be working on F40. Try bumping the base image to f38.

@danpawlik
Contributor

@debarshiray did you try with updated images?

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Mar 25, 2024
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from ae30307 to 8d5f8c3 Compare March 25, 2024 22:05
@debarshiray
Member Author

@debarshiray did you try with updated images?

Sorry, I got pulled away by some other Fedora 40 and internal deadlines. Let me try now. Thanks for confirming that there's nothing wrong with Zuul.

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/394efbc1975c4b9f9fb104fc23d5ce8c

✔️ unit-test SUCCESS in 5m 15s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 50s
✔️ unit-test-restricted SUCCESS in 3m 45s
system-test-fedora-rawhide TIMED_OUT in 1h 20m 20s
system-test-fedora-40 TIMED_OUT in 1h 20m 28s
✔️ system-test-fedora-39 SUCCESS in 34m 36s
✔️ system-test-fedora-38 SUCCESS in 33m 37s

@debarshiray
Member Author

I spotted that it gets stuck on:

$ run "$PODMAN" --root "${DOCKER_REG_ROOT}" run -d --rm --name "${DOCKER_REG_NAME}" --privileged -v "${DOCKER_REG_AUTH_DIR}":/auth -e REGISTRY_AUTH=htpasswd -e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" -e REGISTRY_AUTH_HTPASSWD_PATH="/auth/htpasswd" -v "${DOCKER_REG_CERTS_DIR}":/certs -e REGISTRY_HTTP_ADDR=0.0.0.0:443 -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key -p 50000:443 "${IMAGES[docker-reg]}"
$ assert_success
$ run "$PODMAN" login --authfile "${TEMP_BASE_DIR}/authfile.json" --username user --password user "${DOCKER_REG_URI}"
$ assert_success
$ run "$SKOPEO" copy --dest-authfile "${TEMP_BASE_DIR}/authfile.json" dir:"${IMAGE_CACHE_DIR}"/fedora-toolbox-34 docker://"${DOCKER_REG_URI}"/fedora-toolbox:34  #### HERE ####

I added a debug commit to run the tests with bats --trace ..., and it shows me the same thing.

Interestingly, it also gets stuck on Fedora Rawhide now, in addition to F40.

So it might not be working on F40. Try bumping the base image to f38.

It's trying to use skopeo(1) to upload a fedora-toolbox:34 image that was already downloaded into a local directory onto a temporary local Docker registry created by the test suite. So, I don't think the fedora-toolbox:34 image disappeared from registry.fedoraproject.org, because the image got downloaded. For some reason the skopeo copy ... can't upload it.

I think I will need a Fedora 40 system to debug this. Fedora 40 Beta will be released later today, so it's good timing. :)

For context:

We download a bunch of images using skopeo copy ... while setting up the test suite, and cache them in a separate directory using the dir: notation, where they don't show up in podman images, etc. Then we use skopeo copy ... to either place them in containers-storage: to make them visible to Podman, or to upload them to the temporary local Docker registry created by the test suite, as necessary. This way each test can start with a clean slate, but we don't have to download the images again and again.
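
The caching scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the test suite's actual helpers: the function names, the `DRY_RUN` switch, the cache path, the `localhost:50000` registry address, and the authfile path are all hypothetical. Only the `docker://`, `dir:` and `containers-storage:` transports and the `--dest-authfile` option are taken from the commands quoted in this thread.

```shell
#!/usr/bin/env bash

# Assumed locations and switches, for illustration only.
IMAGE_CACHE_DIR="${IMAGE_CACHE_DIR:-/tmp/image-cache}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the skopeo commands

# Print the command in dry-run mode, otherwise execute it.
run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Download once into a dir: cache, where Podman cannot see the image.
# usage: cache_image <remote image> <cache name>
cache_image() {
    run_cmd skopeo copy "docker://$1" "dir:${IMAGE_CACHE_DIR}/$2"
}

# Make a cached image visible to Podman for a single test.
# usage: load_into_podman <cache name> <local image>
load_into_podman() {
    run_cmd skopeo copy "dir:${IMAGE_CACHE_DIR}/$1" "containers-storage:$2"
}

# Upload a cached image to a registry, such as a temporary local one.
# usage: push_to_registry <cache name> <registry image> <authfile>
push_to_registry() {
    run_cmd skopeo copy --dest-authfile "$3" \
        "dir:${IMAGE_CACHE_DIR}/$1" "docker://$2"
}

cache_image registry.fedoraproject.org/fedora-toolbox:34 fedora-toolbox-34
load_into_podman fedora-toolbox-34 registry.fedoraproject.org/fedora-toolbox:34
push_to_registry fedora-toolbox-34 localhost:50000/fedora-toolbox:34 /tmp/authfile.json
```

With `DRY_RUN=1` the script only prints the three `skopeo copy` invocations, which makes the data flow visible without network access. The third step is the one that hangs in this report. A real push to a registry with a self-signed certificate would need extra TLS options that are left out here.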

@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 8d5f8c3 to aadc1e0 Compare March 27, 2024 22:41
@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/a9ebc15b1a5a47c1bfe15970c120cb3e

✔️ unit-test SUCCESS in 4m 53s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 38s
✔️ unit-test-restricted SUCCESS in 3m 48s
system-test-fedora-rawhide FAILURE in 35m 56s
system-test-fedora-40 FAILURE in 34m 21s
system-test-fedora-39 FAILURE in 35m 39s
system-test-fedora-38 FAILURE in 34m 50s

@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from aadc1e0 to bd80fba Compare April 17, 2024 13:36
@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/52cdf068d4044d36a106fdd7ca223206

✔️ unit-test SUCCESS in 6m 37s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 15s
✔️ unit-test-restricted SUCCESS in 5m 53s
system-test-fedora-rawhide TIMED_OUT in 1h 20m 31s
system-test-fedora-40 TIMED_OUT in 1h 20m 26s
✔️ system-test-fedora-39 SUCCESS in 35m 52s
✔️ system-test-fedora-38 SUCCESS in 36m 28s

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Apr 17, 2024
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from bd80fba to cb3c4c8 Compare April 17, 2024 14:59
@debarshiray
Member Author

recheck

@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 9fbb6ad to 08626a3 Compare April 30, 2024 20:09
@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/27afdd9b622c41d88c647a43f4db1b86

✔️ unit-test SUCCESS in 6m 38s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 7m 51s
✔️ unit-test-restricted SUCCESS in 5m 47s
system-test-fedora-rawhide FAILURE in 7m 31s
system-test-fedora-40 FAILURE in 7m 29s
✔️ system-test-fedora-39 SUCCESS in 35m 19s
✔️ system-test-fedora-38 SUCCESS in 35m 16s

@TristanCacqueray
Contributor

Not sure if the failure is caused by a missing requirement or a test tool, but that error looks legitimate: Error: could not find slirp4netns, the network namespace can't be configured: exec: "slirp4netns": executable file not found in $PATH

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Apr 30, 2024
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 08626a3 to 756a461 Compare April 30, 2024 21:59
@debarshiray
Member Author

Not sure if the failure is caused by a missing requirement or a test tool, but that error looks legitimate: Error: could not find slirp4netns, the network namespace can't be configured: exec: "slirp4netns": executable file not found in $PATH

Yeah, it's caused by slirp4netns being changed to a Suggests in containers-common-extra.
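
Since a Suggests dependency is not pulled in by default, a pre-flight check can surface the missing binary before the tests start. A minimal sketch, assuming the tool names match the Fedora package names. The `missing_tools` helper is hypothetical and not part of the Toolbx playbooks:

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check: print every required tool that is not
# found in $PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Join the missing names onto one line for the messages below.
missing="$(missing_tools podman skopeo slirp4netns | tr '\n' ' ')"

if [ -n "$missing" ]; then
    echo "missing tools: $missing" >&2
    echo "on Fedora, try: sudo dnf install $missing" >&2
fi
```

Failing early with the tool's name is easier to act on than the `executable file not found in $PATH` error raised from inside `podman run`.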

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/6a857e23a9f34101a98f90fd8e0ad8d8

unit-test POST_FAILURE in 6m 24s
unit-test-migration-path-for-coreos-toolbox POST_FAILURE in 3m 26s
unit-test-restricted POST_FAILURE in 5m 33s
system-test-fedora-rawhide POST_FAILURE in 38m 41s
system-test-fedora-40 POST_FAILURE in 36m 36s
system-test-fedora-39 POST_FAILURE in 36m 35s
system-test-fedora-38 POST_FAILURE in 36m 42s

@debarshiray
Member Author

recheck

@softwarefactory-project-zuul

Build failed.
https://softwarefactory-project.io/zuul/t/local/buildset/50d32dbf05924553abe2e47d458f2ca7

unit-test POST_FAILURE in 6m 27s
unit-test-migration-path-for-coreos-toolbox POST_FAILURE in 3m 14s
unit-test-restricted POST_FAILURE in 5m 33s
system-test-fedora-rawhide POST_FAILURE in 37m 59s
system-test-fedora-40 POST_FAILURE in 36m 23s
system-test-fedora-39 POST_FAILURE in 36m 52s
system-test-fedora-38 POST_FAILURE in 36m 53s

@TristanCacqueray
Contributor

recheck ci outage because of no space left on logserver

@softwarefactory-project-zuul

Build succeeded.
https://softwarefactory-project.io/zuul/t/local/buildset/60cdb252d3a448f792e0eebe2f5cd9b5

✔️ unit-test SUCCESS in 6m 33s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 27s
✔️ unit-test-restricted SUCCESS in 5m 39s
✔️ system-test-fedora-rawhide SUCCESS in 37m 10s
✔️ system-test-fedora-40 SUCCESS in 35m 23s
✔️ system-test-fedora-39 SUCCESS in 34m 58s
✔️ system-test-fedora-38 SUCCESS in 35m 00s

@debarshiray
Member Author

recheck ci outage because of no space left on logserver

I see. Thanks for fixing that!

Podman 5.0 switched to using pasta(1), instead of slirp4netns(1), by
default for rootless containers.  This change has led to a regression
causing 'skopeo copy' to get stuck uploading an OCI image to the local
temporary Docker registry run by the tests as a Podman container [1],
which breaks the test suite on Fedora 40 onwards.

Work around this by forcing the use of slirp4netns(1).

Note that the slirp4netns package needs to be explicitly installed on
Fedora 40 onwards, because the dependency in containers-common-extra
changed from Recommends to Suggests [2].

[1] containers/podman#22575

[2] Fedora containers-common commit 17934d87b2686ab5
    Fedora containers-common commit 13c232f064113860
    https://src.fedoraproject.org/rpms/containers-common/c/17934d87b2686ab5
    https://src.fedoraproject.org/rpms/containers-common/c/13c232f064113860

containers#1468
@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 756a461 to b58f9a5 Compare May 2, 2024 13:12
@softwarefactory-project-zuul

Build succeeded.
https://softwarefactory-project.io/zuul/t/local/buildset/45e3f5373efd44818feaa7818614e280

✔️ unit-test SUCCESS in 7m 28s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 21s
✔️ unit-test-restricted SUCCESS in 6m 13s
✔️ system-test-fedora-rawhide SUCCESS in 44m 51s
✔️ system-test-fedora-39 SUCCESS in 43m 55s
✔️ system-test-fedora-38 SUCCESS in 44m 30s

debarshiray added a commit to debarshiray/toolbox that referenced this pull request May 2, 2024
@softwarefactory-project-zuul

Build succeeded.
https://softwarefactory-project.io/zuul/t/local/buildset/5ea1534a9f81423088dfb4b5fe52b953

✔️ unit-test SUCCESS in 6m 44s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 24s
✔️ unit-test-restricted SUCCESS in 5m 52s
✔️ system-test-fedora-rawhide SUCCESS in 39m 01s
✔️ system-test-fedora-40 SUCCESS in 34m 19s
✔️ system-test-fedora-39 SUCCESS in 37m 18s
✔️ system-test-fedora-38 SUCCESS in 34m 49s

@debarshiray debarshiray force-pushed the wip/rishi/zuul-test-f40 branch from 2c6e4ef to 9ea8967 Compare May 2, 2024 15:05
@softwarefactory-project-zuul

Build succeeded.
https://softwarefactory-project.io/zuul/t/local/buildset/ce725485b49c4b859723f61bcdefafc2

✔️ unit-test SUCCESS in 7m 02s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 3m 18s
✔️ unit-test-restricted SUCCESS in 5m 45s
✔️ system-test-fedora-rawhide SUCCESS in 51m 54s
✔️ system-test-fedora-40 SUCCESS in 50m 27s
✔️ system-test-fedora-39 SUCCESS in 50m 48s
✔️ system-test-fedora-38 SUCCESS in 49m 46s

@debarshiray debarshiray merged commit 9ea8967 into containers:main May 2, 2024
@debarshiray debarshiray deleted the wip/rishi/zuul-test-f40 branch May 2, 2024 16:16
@debarshiray
Member Author

It's trying to use skopeo(1) to upload a fedora-toolbox:34 image that was already downloaded into a local directory onto a temporary local Docker registry created by the test suite. So, I don't think the fedora-toolbox:34 image disappeared from registry.fedoraproject.org, because the image got downloaded. For some reason the skopeo copy ... can't upload it.
I think I will need a Fedora 40 system to debug this. Fedora 40 Beta will be released later today, so it's good timing. :)
For context:
We download a bunch of images using skopeo copy ... while setting up the test suite, and cache them in a separate directory using the dir: notation, where they don't show up in podman images, etc. Then we use skopeo copy ... to either place them in containers-storage: to make them visible to Podman, or to upload them to the temporary local Docker registry created by the test suite, as necessary. This way each test can start with a clean slate, but we don't have to download the images again and again.

I filed a Skopeo issue for this: containers/podman#22575

This was worked around by forcing the use of slirp4netns(1).
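
One way to force slirp4netns(1) is to pass `--network slirp4netns` to `podman run`. The sketch below is an assumption about how that could look, not a copy of the merged change: the `network_args_for` helper, the container name, and the `IMAGE` placeholder are hypothetical.

```shell
#!/usr/bin/env bash

# Return the extra 'podman run' arguments for a given Podman version.
# Podman 5.0 and newer default to pasta(1) for rootless containers, so
# slirp4netns(1) is requested explicitly there.
# usage: network_args_for <version, e.g. 5.0.2>
network_args_for() {
    local major="${1%%.*}"
    if [ "$major" -ge 5 ] 2>/dev/null; then
        echo "--network slirp4netns"
    fi
}

# 'podman --version' prints e.g. "podman version 5.0.2".
# Fall back to 0 when Podman is not installed.
version="$(podman --version 2>/dev/null | awk '{ print $3 }' || true)"
args="$(network_args_for "${version:-0}")"

# Printed rather than executed. $args is left unquoted on purpose so
# that it splits into separate arguments. IMAGE is a placeholder.
echo podman run -d --rm $args --name docker-registry IMAGE
```

An alternative that avoids touching each command line is setting `default_rootless_network_cmd = "slirp4netns"` in the `[network]` table of containers.conf. Either way, the slirp4netns package has to be installed.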

debarshiray added a commit to debarshiray/toolbox that referenced this pull request Jan 26, 2026
debarshiray added a commit to debarshiray/toolbox that referenced this pull request Jan 26, 2026
debarshiray added a commit to debarshiray/toolbox that referenced this pull request Jan 29, 2026
containers#1468
containers#1740
(cherry picked from commit 9ea8967)
(cherry picked from commit ec3553b)
debarshiray added a commit to debarshiray/toolbox that referenced this pull request Jan 29, 2026
debarshiray added a commit to debarshiray/toolbox that referenced this pull request Jan 29, 2026