Skip to content

CI: Use local cache registry#22726

Merged
openshift-merge-bot[bot] merged 3 commits intocontainers:mainfrom
edsantiago:pull-from-local-registry
Jul 11, 2024
Merged

CI: Use local cache registry#22726
openshift-merge-bot[bot] merged 3 commits intocontainers:mainfrom
edsantiago:pull-from-local-registry

Conversation

@edsantiago
Copy link
Copy Markdown
Member

@edsantiago edsantiago commented May 16, 2024

As of containers/automation_images#357 our CI VMs include a local registry preloaded with all(*) images used in tests.

(*) where "all" means "most".

This PR sets up CI such that it will use that registry.

Signed-off-by: Ed Santiago santiago@redhat.com

None

@openshift-ci openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 16, 2024
@packit-as-a-service
Copy link
Copy Markdown

Ephemeral COPR build failed. @containers/packit-build please check.

@edsantiago edsantiago force-pushed the pull-from-local-registry branch 5 times, most recently from 684512b to c82b31a Compare May 16, 2024 16:14
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heads up, tests should still pass locally and I don't think we want to setup a local registry there right?
Thus I would think we need a regex or a Or() matcher to match both strings anyway

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is thorny. As tests are currently written, e2e tests hard-force the use of test/registries.conf. No matter where they're run (CI, laptop, anywhere). This may need to be reevaluated, but I'm not bothering with any of that until I find out if this approach is viable.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes sure keep testing, just keep in mind that the end result must still work locally

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not do anything, not sure if you were expecting anything with that or if this serves documentation purposes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need the container to be able to talk to host:56789. --port does the opposite, IIRC: host can talk to container. I'll look at logs and see what happens.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The container uses --network=host so it shares the network namespace with the host so from a network POV there should be no functional difference

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duh. Thanks, I missed that.

@edsantiago edsantiago force-pushed the pull-from-local-registry branch 7 times, most recently from 03825bc to 48e506f Compare May 22, 2024 14:40
@edsantiago edsantiago force-pushed the pull-from-local-registry branch 10 times, most recently from 293a60b to 88fa517 Compare May 28, 2024 13:54
@edsantiago edsantiago force-pushed the pull-from-local-registry branch 3 times, most recently from 3bc7dbf to 80ed2a7 Compare May 30, 2024 01:57
edsantiago added a commit to edsantiago/container_automation_images that referenced this pull request Jul 2, 2024
...to minimize hiccups. RUN-2091 in Jira. Network registries
are too unreliable; they cause too many flakes in CI. Here
we set up a registry running on each VM, prepopulated with
all container images used in podman and buildah tests.

Related PRs:
   containers/podman#22726
   containers/buildah#5584

Once those merge, podman and buildah CI tests will fetch
images from this local registry.

Signed-off-by: Ed Santiago <santiago@redhat.com>
edsantiago added a commit to edsantiago/container_automation_images that referenced this pull request Jul 2, 2024
...to minimize hiccups. RUN-2091 in Jira. Network registries
are too unreliable; they cause too many flakes in CI. Here
we set up a registry running on each VM, prepopulated with
all container images used in podman and buildah tests.

Related PRs:
   containers/podman#22726
   containers/buildah#5584

Once those merge, podman and buildah CI tests will fetch
images from this local registry.

Signed-off-by: Ed Santiago <santiago@redhat.com>
@edsantiago edsantiago force-pushed the pull-from-local-registry branch from 644ae51 to edde737 Compare July 2, 2024 18:37
edsantiago added a commit to edsantiago/container_automation_images that referenced this pull request Jul 2, 2024
...to minimize hiccups. RUN-2091 in Jira. Network registries
are too unreliable; they cause too many flakes in CI. Here
we set up a registry running on each VM, prepopulated with
all container images used in podman and buildah tests.

Related PRs:
   containers/podman#22726
   containers/buildah#5584

Once those merge, podman and buildah CI tests will fetch
images from this local registry.

Signed-off-by: Ed Santiago <santiago@redhat.com>
@edsantiago edsantiago force-pushed the pull-from-local-registry branch from edde737 to 75f8520 Compare July 2, 2024 21:23
@edsantiago edsantiago force-pushed the pull-from-local-registry branch 2 times, most recently from 202cf63 to 61053e5 Compare July 8, 2024 13:29
Copy link
Copy Markdown
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thinking about this should we set up a new cron run that runs without the cache registry to make sure local tests will keep working? (not blocking of course)

@edsantiago
Copy link
Copy Markdown
Member Author

Good idea.

/hold

Let's not merge this yet. I saw this bud flake over the weekend and I want to understand it better.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 8, 2024
@edsantiago edsantiago marked this pull request as draft July 8, 2024 13:44
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 8, 2024
@edsantiago edsantiago force-pushed the pull-from-local-registry branch from 61053e5 to 21f73f2 Compare July 8, 2024 14:52
edsantiago added a commit to edsantiago/container_automation_images that referenced this pull request Jul 8, 2024
...to minimize hiccups. RUN-2091 in Jira. Network registries
are too unreliable; they cause too many flakes in CI. Here
we set up a registry running on each VM, prepopulated with
all container images used in podman and buildah tests.

Related PRs:
   containers/podman#22726
   containers/buildah#5584

Once those merge, podman and buildah CI tests will fetch
images from this local registry.

Signed-off-by: Ed Santiago <santiago@redhat.com>
@edsantiago edsantiago force-pushed the pull-from-local-registry branch 2 times, most recently from afc6438 to 9c82836 Compare July 8, 2024 17:13
As of containers/automation_images#357
our CI VMs include a local registry preloaded with all(*)
images used in tests.

 * where "all" means "most".

This commit installs a new registries.conf that redirects docker
and quay to the new local registry. The hope is that this will
reduce CI flakes.

Since tests change over time, and new tests may require new
images, this commit also adds a mechanism for pulling in
remote images at test run time. Obviously this negates
the purpose of the cache, since it introduces a flake
pain point. The idea is: DO NOT DO THIS UNLESS ABSOLUTELY
NECESSARY, and then, if we have to do this, hurry up and
spin new CI VMs that include the new image(s).

Signed-off-by: Ed Santiago <santiago@redhat.com>
This commit gets tests working under the new local-registry system:

  * amend a few image names, mostly just sticking to a consistent
    list of those images in our registry cache. Mostly minor
    tag updates.

  * trickier: pull_test: change some error messages, and remove
    a test that's now a NOP. Basically, with a local (unprotected)
    registry we always get "404 manifest unknown"; with a real
    registry we'll get "403 I can't tell you".

  * trickiest: seccomp_test: build our own images at run time,
    with our desired labels. Until now we've been pulling
    prebuilt images, but those will not copy to the local
    cache registry. Something about v1? Anyhow, I gave up
    trying to cache them, and the workaround is straightforward.

Also took the liberty of strengthening a few error-message checks

Signed-off-by: Ed Santiago <santiago@redhat.com>
New tool, get-local-registry-script, intended for developers
to get a local registry running in their environment. This is
not necessary for any tests, but may be desirable for performance
reasons and/or to recreate the CI environment.

Signed-off-by: Ed Santiago <santiago@redhat.com>
Copy link
Copy Markdown
Member Author

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again I think this is ready for review. My thanks for everyone's patience.

Comment on lines +121 to +124
session = podmanTest.Podman([]string{"info", "--format", "{{.Host.RemoteSocket.Exists}}"})
session.WaitWithDefaultTimeout()
Expect(session).Should(ExitCleanly())
Expect(session.OutputToString()).To(Equal("true"))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is, I think, the only change since last review. The latest Debian VM image creates and recreates /run/user/XXX/podman/podman.sock at (seemingly) random times. I've given up trying to understand what is creating it, and honestly I choose to decide that this test was broken. There is no harm in the socket existing in a podman-local test. The only necessary test is that if podman-remote, then socket-must-exist.

Copy link
Copy Markdown
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Jul 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Jul 11, 2024

/lgtm
I will let @edsantiago cancel the hold

@edsantiago
Copy link
Copy Markdown
Member Author

/hold cancel

(I forgot about that). Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants