COS-2692: Add new variants to enable pure CentOS Stream/RHEL CoreOS builds, add Containerfile for layered OKD/OCP builds#1445

Merged
openshift-merge-bot[bot] merged 3 commits into openshift:master from
jlebon:pr/c9s-split
May 29, 2024

@jlebon
Member

@jlebon jlebon commented Feb 21, 2024

Add new okd-c9s and ocp-rhel-9.4 variants

To make introducing the base RHCOS/SCOS images safer, let's create two
new variants: okd-c9s and ocp-rhel-9.4. These variants are cloned
from the existing c9s and rhel-9.4 variants to start.

The new variants will track the status quo: building SCOS/RHCOS with the
OpenShift components baked in (hence the okd/ocp prefixes). This is
what the pipeline will keep building.

Meanwhile, what are currently the c9s and rhel-9.4 variants will
become the new base SCOS/RHCOS streams containing purely CentOS
Stream/RHEL content.

The default variant is still ocp-rhel-9.4 for now.


Make c9s and rhel-9.4 variants be pure C9S/RHEL 9.4 content

This is the second step now in this switcheroo dance (see previous
commit). We make the c9s and rhel-9.4 variants contain only C9S/
RHEL 9.4 content and then make the okd-c9s and ocp-rhel-9.4 variants
inherit from those and add the OCP-specific stuff.


Containerfile: new file

This Containerfile allows us to build the OpenShift node image on top
of the base RHCOS/SCOS image (i.e. built from the c9s or rhel-9.4
image).

Currently, the resulting image is at parity with the base image you'd
get from building the okd-c9s or ocp-rhel-9.4 variant. In the
future, those variants will go away and this will become the only way to
build the node image.

Part of: #799

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 21, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 21, 2024
@openshift-ci
Contributor

openshift-ci bot commented Feb 21, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 21, 2024
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 21, 2024
jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Feb 21, 2024
jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Feb 21, 2024
This is part of openshift/os#1445.

Those tests are all actually testing OCP components. In the new model,
they should be run against an OCP layered image instead.
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Feb 21, 2024
We don't have to be super strict here in how we find the bootloader
entry. There should only be one, so simplify the logic using a glob
instead.

Motivated by the fact that this will break otherwise as part of
openshift/os#1445 where the `ID` will be
`centos`, but the stateroot will still be `scos`.
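
A minimal sketch of the glob approach described in this commit message, written in Python for illustration (the actual test code in fedora-coreos-config may look quite different). The function name and the `loader/entries/*.conf` layout are assumptions, not taken from the commit itself. The point is that the lookup no longer depends on the `ID` or stateroot name:

```python
import glob
import os


def find_single_bls_entry(boot_dir):
    """Return the only bootloader entry under boot_dir, found via a glob.

    Hypothetical helper: the directory layout is an assumption.
    """
    pattern = os.path.join(boot_dir, "loader", "entries", "*.conf")
    matches = sorted(glob.glob(pattern))
    # There should only be one entry; fail loudly if that does not hold.
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one bootloader entry, found {len(matches)}"
        )
    return matches[0]
```

Because nothing in the pattern mentions `scos` or `centos`, the lookup keeps working when `ID` and the stateroot name disagree.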
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Feb 21, 2024
The `ID` will change to `centos` as part of
openshift/os#1445.
@jlebon
Member Author

jlebon commented Feb 21, 2024

@jlebon
Member Author

jlebon commented Feb 21, 2024

The major gap left for this is adapting the pipeline to build the layered OCP image.

@LorbusChris
Contributor

/cc @lmzuccarelli @aguidirh @sherine-k
We'll have to make sure okd-coreos-pipeline is adapted accordingly.

jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Feb 22, 2024
We don't have to be super strict here in how we find the bootloader
entry. There should only be one, so simplify the logic using a glob
instead.

Motivated by the fact that this will break otherwise as part of
openshift/os#1445 where the `ID` will be
`centos`, but the stateroot will still be `scos`.
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Feb 22, 2024
The `ID` will change to `centos` as part of
openshift/os#1445.
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 29, 2024
@cgwalters cgwalters self-assigned this Mar 1, 2024
Member

@cgwalters cgwalters left a comment

Just skimming, LGTM at a high level

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Mar 5, 2024
This is part of openshift/os#1445.

Those tests are all actually testing OCP components. In the new model,
they should be run against an OCP layered image instead. Add a tag on
them so that we'll be able to run them separately.
@jlebon jlebon changed the title Make c9s variant contain c9s content only, no OCP content COS-2692: Make c9s variant contain c9s content only, no OCP content Mar 5, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 5, 2024
@openshift-ci-robot

openshift-ci-robot commented Mar 5, 2024

@jlebon: This pull request references COS-2692 which is a valid jira issue.

In response to this:

This is a first stab at #799, aimed at the c9s variant to start.

In this model, the base (container and disk) images we build in the
pipeline do not contain any OCP-specific details. The compose is made up
purely of RPMs coming out directly from the c9s pungi composes.

Let's go over details of this in bullet form:

  1. To emphasize the binding to c9s composes, we change the versioning
    scheme: the version string is now exactly the same version as the
    pungi compose from which we've built (well, we do add a .N field
    because we want to be able to rebuild multiple times on top of the
    same base pungi compose). It's almost as if our builds were part of
    the c9s pungi composes directly. (And maybe one day they will be...)
    This is implemented using a versionary script that queries compose
    info.
  2. We no longer include packages-openshift.yaml: this has all the OCP
    stuff that we want to do in a layered build instead.
  3. We no longer completely rewrite /etc/os-release. The host is
    image-mode CentOS Stream and e.g. ID will now say centos.
    However, we do still inject VARIANT and VARIANT_ID fields to
    note that it's of the CoreOS kind. We should probably actually match
    FCOS here and properly add a CoreOS variant in the centos-release
    package.
  4. Tests which have to do with the OpenShift layer now have the required
    tag openshift. This means that they'll no longer run in the default
    set of kola tests. When building the derived image, we will run just
    those tests using kola run --tag openshift --oscontainer ....
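
A minimal sketch of the versioning scheme from item 1, written in Python for illustration. The real versionary script queries compose info and is not shown in this PR, so the function name, the example compose version format, and the choice to start the `.N` counter at 0 are all assumptions:

```python
def next_build_version(compose_version, existing_versions):
    """Append a .N rebuild counter to a pungi compose version.

    Hypothetical helper: N starts at 0 here, which is an assumption.
    """
    prefix = compose_version + "."
    used = [
        int(v[len(prefix):])
        for v in existing_versions
        if v.startswith(prefix) and v[len(prefix):].isdigit()
    ]
    # Reuse the compose version verbatim and only bump the trailing counter.
    n = max(used) + 1 if used else 0
    return f"{prefix}{n}"
```

With a made-up compose version such as `9.0.20240521`, the first build would be `9.0.20240521.0` and a rebuild on the same compose would be `9.0.20240521.1`.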

Note that to make this work, OCP itself still needs to actually have
that derived image containing the OCP bits. For now, we will build this
in the pipelines (as a separate artifact that we push to the repos) but
the eventual goal is that we'd split that out of the pipeline and have
it be more like how the rest of OCP is built (using Prow/OSBS/Konflux).

Note also we don't currently build the c9s variant in the pipelines but
this is long overdue IMO.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

jlebon added a commit to coreos/coreos-assembler that referenced this pull request Mar 6, 2024
This is part of openshift/os#1445.

Those tests are all actually testing OCP components. In the new model,
they should be run against an OCP layered image instead. Add a tag on
them so that we'll be able to run them separately.
@jlebon
Member Author

jlebon commented May 21, 2024

/refresh

To make introducing the base RHCOS/SCOS images safer, let's create two
new variants: `okd-c9s` and `ocp-rhel-9.4`. These variants are cloned
from the existing `c9s` and `rhel-9.4` variants to start.

The new variants will track the status quo: building SCOS/RHCOS with the
OpenShift components baked in (hence the `okd`/`ocp` prefixes). This is
what the pipeline will keep building.

Meanwhile, what are currently the `c9s` and `rhel-9.4` variants will
become the new base SCOS/RHCOS streams containing *purely* CentOS
Stream/RHEL content.

The default variant is still `ocp-rhel-9.4` for now.
@jlebon jlebon force-pushed the pr/c9s-split branch 2 times, most recently from 3aad6fe to 05ab4e8 Compare May 24, 2024 17:48
jlebon added 2 commits May 24, 2024 17:08
This is the second step now in this switcheroo dance (see previous
commit). We make the `c9s` and `rhel-9.4` variants contain only C9S/
RHEL 9.4 content and then make the `okd-c9s` and `ocp-rhel-9.4` variants
inherit from those and add the OCP-specific stuff.
This Containerfile allows us to build the OpenShift node image on top
of the base RHCOS/SCOS image (i.e. built from the `c9s` or `rhel-9.4`
image).

Currently, the resulting image is at parity with the base image you'd
get from building the `okd-c9s` or `ocp-rhel-9.4` variant. In the
future, those variants will go away and this will become the only way to
build the node image.

Part of: openshift#799

 license: MIT
 name: rhcos
-summary: OpenShift 4
+summary: RHEL CoreOS 9.4
Member

Nit: Red Hat Enterprise Linux CoreOS 9.4

Member

can stay as is as I don't think that's used anywhere.

@travier
Member

travier commented May 28, 2024

Only have one note:

  • We will have to remove all the extensions for the non-OCP variants.

@travier
Member

travier commented May 28, 2024

So one obvious thing we notice here totally unrelated to this is that we're currently baking some NVMe-related UUID things that should probably instead be generated on first boot.

I thought that this had been fixed already. This is weird, but it's not due to this change, so let's not hold it.

/lgtm
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2024
@travier
Member

travier commented May 28, 2024

Feel free to unhold when you think it's ready to go / when we've completed the 4.16 branching.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 28, 2024
@openshift-ci
Contributor

openshift-ci bot commented May 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon, travier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [cgwalters,jlebon,travier]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jlebon
Member Author

jlebon commented May 29, 2024

Thanks for the review!
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 29, 2024
@travier
Member

travier commented May 29, 2024

I've filed #1519

@openshift-ci
Contributor

openshift-ci bot commented May 29, 2024

@jlebon: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit f6cad58 into openshift:master May 29, 2024
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jun 18, 2024
A big part of the new variants added in
openshift/os#1445 is that we only minimally
modify `/etc/os-release`. This means that e.g. `ID` is still `rhel` and
`VERSION_ID` is e.g. `9.4` for the `rhel-9.4` variant. We do still
inject `VARIANT` and `VARIANT_ID` though.

Adapt these library functions here to handle this.
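
A minimal sketch of how a helper could cope with the minimally modified `/etc/os-release` described above, written in Python for illustration (the actual library functions in fedora-coreos-config are not shown here and may differ). The function names are hypothetical, and the `coreos` value for `VARIANT_ID` is an assumption borrowed from FCOS rather than taken from this PR:

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex handles the optional shell-style quoting around values.
        parts = shlex.split(value)
        fields[key] = parts[0] if parts else ""
    return fields


def is_coreos(fields):
    """Detect the CoreOS flavor from VARIANT_ID instead of ID.

    Assumption: the injected VARIANT_ID is "coreos", as on FCOS.
    """
    return fields.get("VARIANT_ID") == "coreos"
```

Keying off `VARIANT_ID` instead of `ID` means the same check works whether `ID` says `rhel`, `centos`, or something else.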
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jun 19, 2024
A big part of the new variants added in
openshift/os#1445 is that we only minimally
modify `/etc/os-release`. This means that e.g. `ID` is still `rhel` and
`VERSION_ID` is e.g. `9.4` for the `rhel-9.4` variant. We do still
inject `VARIANT` and `VARIANT_ID` though.

Adapt these library functions here to handle this.
jbtrystram added a commit to jbtrystram/coreos-assembler that referenced this pull request Apr 4, 2025
When we introduced the pure rhel and rhcos variants in [1],
we did not change the `name` key in meta.json.

Kola uses this key to determine the distribution of
the image to discriminate relevant tests.
Some tests are tied to OCP content, e.g. `crio.base`
and `crio.network` [2] and inherently won't work
with the base because they rely on content that
is added as part of the node image layer.

Currently we skip the tests tagged with `openshift`
in prow [3] and the pipeline to work around this.

Going with the principle of least disruption
by adding `rhcos-base` rather than `rhcos-ocp`,
because all of the builds already out there contain
the old name and I don't want to break compatibility
if someone uses a newer cosa to test those older
builds.

[1] openshift/os#1445
[2] https://github.com/coreos/coreos-assembler/blob/main/mantle/kola/tests/crio/crio.go
[3] https://github.com/openshift/os/blob/48a18918794f5418352c03a3415fac3fde28e1b6/ci/prow-entrypoint.sh#L306

See openshift/os#1790
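
A minimal sketch of the idea in this commit message, written in Python for illustration. It is not kola's actual implementation: the function name, the test dictionaries, the `basic` test name, and the rule that a `-base` suffix in the meta.json `name` marks a base image are all assumptions. Only `crio.base`, the `openshift` tag, and the `rhcos-base` name come from the text above:

```python
import json

OCP_TAG = "openshift"


def select_tests(tests, meta_json_text):
    """Drop tests tagged 'openshift' when meta.json names a base image.

    Hypothetical logic; the real selection lives in coreos-assembler.
    """
    name = json.loads(meta_json_text).get("name", "")
    is_base = name.endswith("-base")  # e.g. "rhcos-base" (assumption)
    return [
        t["name"]
        for t in tests
        if not (is_base and OCP_TAG in t.get("tags", []))
    ]
```

Older builds that still carry the old name would keep running the full set, which matches the compatibility concern raised in the message.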
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

6 participants