Skip to content

Skopeo: Backport a patch to avoid a panic when compiled with Go >= 1.22 #355

Closed
mtrmac wants to merge 7 commits intocontainers:mainfrom
mtrmac:openshift-hotfix
Closed

Skopeo: Backport a patch to avoid a panic when compiled with Go >= 1.22 #355
mtrmac wants to merge 7 commits intocontainers:mainfrom
mtrmac:openshift-hotfix

Conversation

@mtrmac
Copy link
Copy Markdown
Collaborator

@mtrmac mtrmac commented May 21, 2024

Try the “easy” way to fix containers/skopeo#2328 .

Warning: I don’t know what I am doing. I’m hoping this PR, even before merging, generates images which can be used to run Skopeo tests to confirm this works as expected.

I have locally tested the sed command, but nothing else.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 21, 2024

Change LGTM, I can't see why the new images won't build. Oh, looks like I need to give you access to this repo. Once sec...

@cevich
Copy link
Copy Markdown
Member

cevich commented May 21, 2024

...access granted. @mtrmac you'll need to re-push unfortunately. There's a validation check that will fail in Cirrus if I simply tell it to "go" as-is.

@mtrmac mtrmac force-pushed the openshift-hotfix branch from e6bd1c8 to 53fc10b Compare May 21, 2024 19:15
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 21, 2024

Thanks, re-pushed.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 21, 2024

rawhide: No match for argument: mlocate

was retired this week (🍷 for a very old project of mine). Do we actually need that?

@edsantiago
Copy link
Copy Markdown
Member

Nuke it and we'll find out. Do you know if there's a successor? I'm going to miss it, I use it daily at home.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 22, 2024

Nuke it and we'll find out.

Done.

Do you know if there's a successor? I'm going to miss it, I use it daily at home.

https://fedoraproject.org/wiki/Changes/Plocate_as_the_default_locate_implementation , supposedly since F36.

@mtrmac mtrmac force-pushed the openshift-hotfix branch from e70be89 to 112cf1a Compare May 22, 2024 17:08
@cevich
Copy link
Copy Markdown
Member

cevich commented May 22, 2024

Do we actually need that? (mlocate)

Git blame says it came in by 38fa0c6 which is incredibly ancient. However, the origin of that change was exclusive to podman CI. So if removing mlocate still passes podman CI I think we're good.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 22, 2024

Note: I re-ran a few container build tasks. Looks like quay or networking flakes.

@edsantiago
Copy link
Copy Markdown
Member

Test get_ci_vm_entrypoint seems to be failing hard:

rm 'get_ci_vm/good_repo_test/uninit_gcloud.output'
Testing: Verify mock 'gcevm' flavor main() workflow produces expected output
fail - Expected exit-code 0 but received 128 while executing mock_gcevm_workflow (output follows)
Winning lottery-number checksum: 0
gcloud --configuration=automation_images --project=automation_images compute instances create --zone=us-central1-a --image-project=automation_images --image=test-image-name --custom-cpu=0 --custom-memory=0Gb --boot-disk-size=0 --labels=in-use-by=foobar foobar-test-image-name
gcloud --configuration=automation_images --project=automation_images compute ssh --ssh-flag=-o=AddKeysToAgent=yes --force-key-file-overwrite --strict-host-key-checking=no --zone=us-central1-a root@foobar-test-image-name -- true
Cloning into '/tmp/get_ci_vm_ifMxex.tmp/var/tmp/automation_images'...
fatal: detected dubious ownership in repository at '/tmp/cirrus-ci-build/get_ci_vm/good_repo_test/.git'
To add an exception for this directory, call:

	git config --global --add safe.directory /tmp/cirrus-ci-build/get_ci_vm/good_repo_test/.git
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

This is not code touched by this PR.

@cevich any chance you could take a quick look? I'm having trouble understanding. https://github.com//containers/automation_images/blob/b7395d11fee6e977d2256b536a485fdb9811839b/get_ci_vm/test.sh#L276

@cevich
Copy link
Copy Markdown
Member

cevich commented May 23, 2024

This is not code touched by this PR.

Agreed, this is unrelated. Must be the result of a git fix/update. I'll take a look and work on this today, hopefully it's an easy fix.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 23, 2024

Well...was an easy fix but hard to find the right place to implement. Local testing shows I found it, so #356 should fix this.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 23, 2024

Done. Okay, rebase this and it should work 🤞

@mtrmac mtrmac force-pushed the openshift-hotfix branch from 112cf1a to 7ca9c72 Compare May 23, 2024 19:05
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 23, 2024

Rebased. Thanks @cevich .

@github-actions
Copy link
Copy Markdown

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base debian do-not-use
base fedora do-not-use
base fedora-aws do-not-use
base fedora-aws-arm64 do-not-use
base image-builder do-not-use
base prior-fedora do-not-use
cache build-push c20240523t185657z-f40f39d13
cache debian c20240523t185657z-f40f39d13
cache fedora c20240523t185657z-f40f39d13
cache fedora-aws c20240523t185657z-f40f39d13
cache fedora-netavark c20240523t185657z-f40f39d13
cache fedora-netavark-aws-arm64 c20240523t185657z-f40f39d13
cache fedora-podman-aws-arm64 c20240523t185657z-f40f39d13
cache fedora-podman-py c20240523t185657z-f40f39d13
cache prior-fedora c20240523t185657z-f40f39d13
cache rawhide c20240523t185657z-f40f39d13
cache win-server-wsl c20240523t185657z-f40f39d13

@edsantiago
Copy link
Copy Markdown
Member

debian prior-fedora fedora fedora-aws rawhide
base 13.2 39-1.5 Generic ? 41-0
kernel 6.8.9-1 6.8.10-200 6.8.5-301 6.8.5-301 6.9.0-64
6.7.12-1 ⇑ 6.8.9-200 ⇑ 6.8.9-300 ⇑ 6.8.9-300 ⇑ 6.9.0-0.rc7.20240510git448b3fe5a0ea.62 ⇑
grub2-common 2.12-2 2.06-120 2.06-121 2.06-121 2.06-121
aardvark-dns 1.4.0-5.1 1.10.0-1 1.10.0-1 1.10.0-1 1.10.0-1
netavark 1.4.0-4.1 1.10.3-1 1.10.3-3 1.10.3-3 1.10.3-3
buildah 1.33.7+ds1-1 1.35.4-1 1.35.4-1 1.35.3-1 1.35.4-1
1.35.3-1 ⇑
conmon 2.1.10+ds1-1+b1 2.1.10-1 2.1.12-1 2.1.10-1 2.1.10-1
2.1.10-1 ⇑
container-selinux ? 2.231.0-1 2.231.0-1 2.231.0-1 2.231.0-1
2.230.0-1 ⇑ 2.230.0-1 ⇑
containers-common ? 1-99 0.58.0-2 0.58.0-2 0.58.0-18
criu 3.17.1-3 3.19-2 3.19-4 3.19-4 3.19-4
crun 1.15-1 1.15-1 1.15-1 1.15-1 1.15-1
1.14.4-1 ⇑ 1.14.4-1 ⇑
docker-ce 5:26.1.3-1debian.12bookworm ? ? ? ?
5:26.1.2-1debian.12bookworm ⇑
golang 2:1.22~3 1.21.10-1 1.22.3-1 1.22.3-1 1.22.3-1
1.21.9-1 ⇑ 1.22.2-1 ⇑
gvisor-tap-vsock ? 0.7.3-1 0.7.3-2 0.7.3-2 0.7.3-2
nmap-ncat 7.94+git20230807.3be01efb1+dfsg-3+b1 7.95-1 7.95-1 7.95-1 7.95-1
passt 2024-04-26 2024-04-26 2024-05-10 2024-05-10 2024-05-10
2024-04-26 ⇑
podman 4.9.4+ds1-1 4.9.4-1 5.1.0~rc1-1 5.0.3-1 5.0.3-1
5.0.3-1 ⇑ 5.0.2-1 ⇑ 5.0.2-1 ⇑
runc 1.1.12+ds1-2 1.1.12-1 1.1.12-3 1.1.12-3 1.1.12-3
skopeo 1.13.3+ds1-2+b1 1.15.0-1 1.15.1-1 1.15.0-1 1.15.1-1
1.15.0-1 ⇑ 1.15.0-1 ⇑
slirp4netns 1.2.1-1+b1 1.2.2-1 1.2.2-2 1.2.2-2 1.2.2-2
systemd 256~rc3-1 254.12-1 255.6-1 255.6-1 256~rc2-1
255.5-1 ⇑ 254.10-1 ⇑ 255.5-1 ⇑
tar 1.34+dfsg-1.2+deb12u1 1.35-2 1.35-3 1.35-3 1.35-3

mtrmac added a commit to mtrmac/skopeo that referenced this pull request May 27, 2024
... to see whether it fixes the failures in
containers#2328 .

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request May 27, 2024
... which has stopped installing mlocate, so confirm that doesn't
break anything.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 27, 2024

Downstream tests:

@edsantiago
Copy link
Copy Markdown
Member

@mtrmac see my comments in your podman PR. Debian is broken, not booting.

CAUSE: new systemd (see above table)

SOLUTION:

  1. wait for Debian: remove force-cgroups-v1 code #338 to merge (will take a long time because it will need podman-buildah-everything testing and I will be OOTO much of today); or
  2. block systemd upgrade on debian. Example on how to do that: https://github.com//containers/automation_images/blob/afe1ced362caf3ad65fc502da6af3567de0266f3/base_images/debian_base-setup.sh#L46-L57

@cevich
Copy link
Copy Markdown
Member

cevich commented May 28, 2024

@mtrmac Ed's second suggestion (unfortunately) isn't abnormal for this repo. If you're at all uncomfortable making that change, poke me and I'll take care of it for you.

Edit: It looks like your workaorund is at least functional as the containers/skopeo#2340 CI passed.

@mtrmac mtrmac force-pushed the openshift-hotfix branch 3 times, most recently from 6322d3c to d9d9f10 Compare May 28, 2024 17:37
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added 4 commits May 28, 2024 20:32
> panic: encoding alphabet includes duplicate symbols

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
It has been retired in Rawhide, and it's unclear whether
we need it at all.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
plocate is the default locate implementation in Fedora.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the openshift-hotfix branch from 0097606 to bff140d Compare May 28, 2024 18:32
…g the default one

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the openshift-hotfix branch from bff140d to c3dedc7 Compare May 28, 2024 18:32
@github-actions
Copy link
Copy Markdown

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base debian do-not-use
base fedora do-not-use
base fedora-aws do-not-use
base fedora-aws-arm64 do-not-use
base image-builder do-not-use
base prior-fedora do-not-use
cache build-push c20240528t183210z-f40f39d13
cache debian c20240528t183210z-f40f39d13
cache fedora c20240528t183210z-f40f39d13
cache fedora-aws c20240528t183210z-f40f39d13
cache fedora-netavark c20240528t183210z-f40f39d13
cache fedora-netavark-aws-arm64 c20240528t183210z-f40f39d13
cache fedora-podman-aws-arm64 c20240528t183210z-f40f39d13
cache fedora-podman-py c20240528t183210z-f40f39d13
cache prior-fedora c20240528t183210z-f40f39d13
cache rawhide c20240528t183210z-f40f39d13
cache win-server-wsl c20240528t183210z-f40f39d13

@edsantiago
Copy link
Copy Markdown
Member

NO GO. Repeat, NO GO. Something went wrong. systemd is still a bad version:

debian prior-fedora fedora fedora-aws rawhide
systemd 256~rc3-4 254.12-1 255.7-1 255.6-1 256~rc3-1
255.5-1 ⇑ 254.10-1 ⇑ 255.6-1 ⇑ 255.5-1 ⇑

@edsantiago
Copy link
Copy Markdown
Member

I can't figure out what went wrong. The log shows your pin file, then:

    debian: Unpacking libnss-resolve:amd64 (256~rc3-4) over (252.22-1~deb12u1) ...

Pinning allows globs, so maybe retry the 256 block with just 256* ? (At least I think it's globs. If it's regexps, that won't work at all). Sorry, I'm stuck.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 29, 2024

I'd like to suggest something helpful, but I'm out of my depth here. Debugging problems in the "base" stage build can be challenging. It's not possible to use hack/get_ci_vm.sh to get anything earlier than a "base" stage VM (which is what is coming in here). The only way I know of is using the GCE Web UI or CLI to create a custom one. In this case Packer is grabbing the latest image by "family", so debian-12 according to the Makefile. In case that helps.

@edsantiago
Copy link
Copy Markdown
Member

EXECUTIVE DECISION: I am merging this. DO NOT USE THESE VMS!!!!!!!!!!!!!!!!

I am merging because the Skopeo fixes look good, and the mlocate/plocate too, and we REALLY NEED #338 SO PLEASE NO MORE MERGES INTO THIS REPO

@edsantiago
Copy link
Copy Markdown
Member

Ah phooey. Too late. Never mind, I'll just bring in the skopeo commit

@cevich
Copy link
Copy Markdown
Member

cevich commented May 29, 2024

Since you're going to be building images in the other PR right away anyway (with a newer IMG_SFX) maybe it's okay to just force-merge this?

@edsantiago
Copy link
Copy Markdown
Member

Doesn't seem to work. Wants to rerun CI, which is going to fail.

@cevich
Copy link
Copy Markdown
Member

cevich commented May 29, 2024

Dang 😢

@cevich
Copy link
Copy Markdown
Member

cevich commented May 29, 2024

Oh! You could stick '[skip-ci]' in the title, and re-push. That will 100% bypass all of Cirrus-CI.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 29, 2024

#338 now contains the wanted fixes from this PR, so I think this can just be closed after #338 merges.

I’m leaving it open for now just to reduce the number of changes in flight.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 29, 2024

#338 was merged, contains the commits we want, and makes the attempts to pin systemd unnecessary.

@edsantiago @cevich thanks!

@mtrmac mtrmac closed this May 29, 2024
@mtrmac mtrmac deleted the openshift-hotfix branch May 29, 2024 18:46
@edsantiago
Copy link
Copy Markdown
Member

Thank you for finding a solution to the Go issue and for catching (and fixing!) the mlocate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants