
MCO-1100: enable RHEL entitlements in on-cluster layering with OCL API #4333

Conversation

@cheesesashimi cheesesashimi commented Apr 23, 2024

- What I did

This adds the capability for BuildController to use the RHEL entitlement secrets so that cluster admins can inject RHEL content that they are entitled to receive into their builds. It also allows content to be injected into and consumed from /etc/yum.repos.d as well as /etc/pki/rpm-gpg. A few higher-level notes about the implementation:

  • Because we run rootless Buildah, we're more prone to running into SELinux complications. This makes it difficult to mount the contents of /etc/yum.repos.d, /etc/pki/entitlement, and /etc/pki/rpm-gpg directly into the build context. With that in mind, we copy everything into a series of temp directories first, and then mount those temp directories into the build context as volumes.
  • We also create an emptyDir which is mounted into the build pod at /home/build/.local/share/containers. It is unclear why this is necessary, but as mentioned before, I suspect that this is due to SELinux issues.
  • The e2e test suite can now stream the container logs from the build pod to the filesystem, since those logs contain useful information if an e2e test fails. In OpenShift CI, the destination is determined by the ARTIFACT_DIR env var. If this env var is not present, it defaults to the current directory.
  • For now, the etc-pki-entitlement flow (specifically, the TestEntitledBuild test) is skipped in OpenShift CI because the test clusters do not have that cred available. The test suite automatically detects the presence (or absence) of that cred in the openshift-config-managed namespace and runs the test if it is available. However, the TestYumRepos test targets a very similar flow and can do its own setup and teardown regardless of preexisting creds.
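As a rough illustration of the copy-then-mount approach described above (paths and the buildah flag are illustrative; the actual implementation is Go code in BuildController):

```shell
# Rough sketch of the copy-then-mount workaround (illustrative only).
# Copying into a fresh temp directory sidesteps the SELinux labeling
# problems that prevent mounting host paths straight into the build context.
src="/etc/yum.repos.d"
tmp="$(mktemp -d)"
if [ -d "$src" ]; then
  cp -R "$src/." "$tmp/"
fi
# The temp directory, not the original path, is then volume-mounted into
# the build, e.g.: buildah bud --volume "$tmp:/etc/yum.repos.d:z" ...
echo "staged repo content in: $tmp"
```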

I took care to ensure that this does not break OKD by taking the following actions:

  • I observed that adding the /home/build/.local/share/containers volume mount to the build pod prevented the wait-for-done container from starting when running on FCOS. With this in mind, I modified the build pod instantiation so that this volume mount is not attached to the wait-for-done container.
  • I added a TestOnClusterBuildsOnOKD e2e test which will only run against an OKD cluster. Conversely, I excluded other tests from running against an OKD cluster since those tests make assumptions about things that would only be present within an OCP cluster.

The difference between this PR and #4312 is that this one is based upon both the on-cluster layering PR (#4327) and the on-cluster layering e2e PR (#4328).

- How to verify it

Automated verification:

  1. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace. If this secret is not present, TestEntitledBuilds and TestEntitledBuildsRollsOutImage will be skipped.
  2. Ensure that the OnClusterBuild feature-gate is enabled. The test suite will fail immediately if the feature-gate is not enabled.
  3. Run the tech preview e2e test suite: $ go test -count=1 -v ./test/e2e-techpreview/...

(Note: Because we have not landed #4284, the cleanup / teardown will delete the node and its underlying machine, causing the Machine API to provision a replacement node.)
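The secret detection in step 1 can also be checked by hand before running the suite. A hedged sketch (this exact probe is my illustration, not part of the test suite; it needs `oc` and a logged-in cluster to ever report "present"):

```shell
# Illustrative probe (not part of the test suite): does the entitlement
# secret exist where the suite looks for it? Without `oc` or a logged-in
# cluster this falls through to "absent".
if command -v oc >/dev/null 2>&1 \
  && oc get secret etc-pki-entitlement -n openshift-config-managed >/dev/null 2>&1; then
  entitlement_status="present"
else
  entitlement_status="absent"
fi
echo "etc-pki-entitlement secret is ${entitlement_status}"
```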

Manual verification:

  1. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace.
  2. Ensure that the OnClusterBuild feature-gate is enabled.
  3. Copy the etc-pki-entitlement secret into the openshift-machine-config-operator namespace. Here's a small script you can use:
#!/usr/bin/env bash

set -xeuo pipefail

oc create secret generic etc-pki-entitlement \
  --namespace "openshift-machine-config-operator" \
  --from-file=entitlement.pem=<(oc get secret/etc-pki-entitlement -n openshift-config-managed -o go-template='{{index .data "entitlement.pem" | base64decode }}') \
  --from-file=entitlement-key.pem=<(oc get secret/etc-pki-entitlement -n openshift-config-managed -o go-template='{{index .data "entitlement-key.pem" | base64decode }}')
  4. Create a new MachineConfigPool and MachineOSConfig:
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: layered
spec:
  machineConfigSelector:
    matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values:
      - worker
      - layered
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/layered: ""
  paused: false
---
apiVersion: machineconfiguration.openshift.io/v1alpha1
kind: MachineOSConfig
metadata:
  name: layered
spec:
  buildInputs:
    baseImagePullSecret:
      name: global-pull-secret-copy
    containerFile:
    - containerfileArch: noarch
      content: |-
        FROM configs AS final
        RUN rpm-ostree install cowsay && \
          ostree container commit
    imageBuilder:
      imageBuilderType: PodImageBuilder
    renderedImagePushSecret:
      name: <add your registry's push secret here>
    renderedImagePushspec: <add your registry's pushspec here>
  buildOutputs:
    currentImagePullSecret:
      name: <add your registry's pull secret here>
  machineConfigPool:
    name: layered
  5. Watch for the machine-os-builder pod to start. Shortly afterward, the build pod should start, and it should complete without any errors. Seeing the following lines in the build pod logs will verify that we've successfully ingested entitled content:
Enabled rpm-md repositories: rhel-9-for-x86_64-baseos-beta-rpms rhel-9-for-x86_64-appstream-beta-rpms
Updating metadata for 'rhel-9-for-x86_64-baseos-beta-rpms'...done
Updating metadata for 'rhel-9-for-x86_64-appstream-beta-rpms'...done
Importing rpm-md...done
rpm-md repo 'rhel-9-for-x86_64-baseos-beta-rpms'; generated: 2024-03-25T12:33:56Z solvables: 1816
rpm-md repo 'rhel-9-for-x86_64-appstream-beta-rpms'; generated: 2024-03-25T12:34:53Z solvables: 6972
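To check a saved log for those markers (for example, one streamed to ARTIFACT_DIR by the e2e suite), a small helper like this could be used; the helper itself is hypothetical and not part of this PR:

```shell
# Hypothetical helper (not part of this PR): scan a saved build-pod log for
# the markers above that indicate entitled content was ingested.
check_entitled_log() {
  grep -q "Enabled rpm-md repositories" "$1" && \
    grep -q "Importing rpm-md" "$1"
}
```

For instance, `check_entitled_log "${ARTIFACT_DIR:-.}/build-pod.log"` would succeed only if both markers appear in the log.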

- Description for the changelog
Enables RHEL entitlements in on-cluster layering

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 23, 2024

openshift-ci-robot commented Apr 23, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.

In response to this:


Semi-manual verification:

  1. Download / install v0.0.14 of my OpenShift helpers on your local machine.
  2. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace.
  3. Create a Dockerfile on your local machine that contains the following content:
FROM configs AS final

RUN rm -rf /etc/rhsm-host && \
 rpm-ostree install buildah && \
 ln -s /run/secrets/rhsm /etc/rhsm-host && \
 ostree container commit
  4. With my onclustertesting helper in your $PATH, run the following: $ onclustertesting setup in-cluster-registry --enable-featuregate --pool=layered --custom-dockerfile=./path/to/the/Dockerfile
  5. If you have not previously enabled the featuregate, my helper will enable it for you. This causes a new MachineConfig to be created and rolled out to all of the nodes, so the build might not begin immediately. Using this flag is idempotent.
  6. Watch for the machine-os-builder pod to start. Shortly afterward, the build pod should start. It should complete without any errors.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 23, 2024

openshift-ci bot commented Apr 23, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel-entitlements-with-ocb-api branch from 33f46cf to f412d68 Compare April 23, 2024 20:47

openshift-ci bot commented Apr 23, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 23, 2024
@cheesesashimi cheesesashimi changed the title MCO-1100: enable RHEL entitlements in on-cluster layering MCO-1100: enable RHEL entitlements in on-cluster layering with OCL API Apr 23, 2024
@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview

@cheesesashimi
Member Author

/test test-unit
/test verify

@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview


openshift-ci-robot commented Apr 24, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.


@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2024
@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel-entitlements-with-ocb-api branch from 264be15 to 9561562 Compare April 24, 2024 17:09
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2024
@sergiordlr

Verified here: #4312 (comment)

We add the qe-approved label

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Apr 24, 2024

openshift-ci-robot commented Apr 24, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.


@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel-entitlements-with-ocb-api branch from 9561562 to 679440f Compare April 24, 2024 21:27
@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview


openshift-ci bot commented Apr 24, 2024

@cheesesashimi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/verify
Commit: f412d68
Required: true
Rerun command: /test verify


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview


openshift-ci-robot commented Apr 25, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.


@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel-entitlements-with-ocb-api branch 2 times, most recently from 3429252 to 2c5bd22 Compare April 25, 2024 12:56
@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview


openshift-ci-robot commented Apr 25, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.

In response to this:

- What I did

This adds the capability for BuildController to use the RHEL entitlement secrets to allow cluster admins to inject RHEL content into their builds that they are entitled to receive. This also allows the injection / consumption of content into /etc/yum.repos.d as well as /etc/pki/rpm-gpg. There are a few notes about the implementation that I would like to have at a higher level:

  • Because we run rootless Buildah, we're more prone to running into SELinux complications. This makes it more difficult to directly mount the contents of /etc/yum.repos.d, /etc/pki/entitlement, and /etc/pki/rpm-gpg directly into the build context. With that in mind, we copy everything into a series of temp directories first, and then mount those temp directories into the build context as a volume.
  • We also create an emptyDir which is mounted into the build pod at /home/build/.local/share/containers. It is unclear why this is necessary, but as mentioned before, I suspect that this is due to SELinux issues.
  • The e2e test suite now has the capability to stream the container logs from the build pod to the filesystem as there is useful information contained within those logs if the e2e test fails. In OpenShift CI, this location will be determined by the ARTIFACT_DIR env var. If this env var is not present, it will default the current directory.
  • For right now, etc-pki-entitlement flow (specifically, the TestEntitledBuild test) is being skipped in OpenShift CI because the test clusters do not have that cred available. The test suite will automatically detect the presence (or lack thereof) of that cred in the openshift-config-managed namespace and run the test if it is available. However, the TestYumRepos test targets a very similar flow and can do its own setup and teardown regardless of creds preexisting.

The difference between this PR and #4312 is that this one is based upon both the on-cluster layering PR (#4327) and the on-cluster layering e2e PR (#4328).

- How to verify it

Automated verification:

  1. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace. If this secret is not present, TestEntitledBuilds and TestEntitledBuildsRollsOutImage will be skipped.
  2. Ensure that the OnClusterBuild feature-gate is enabled. The test suite will fail immediately if the feature-gate is not enabled.
  3. Run the tech preview e2e test suite: $ go test -count=1 -v ./test/e2e-techpreview/...

(Note: Because we have not landed #4284, the cleanup / teardown will delete the node and its underlying machine, causing the Machine API to provision a replacement node.)

Manual verification:

  1. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace.
  2. Ensure that the OnClusterBuild feature-gate is enabled.
  3. Copy the etc-pki-entitlement secret into the openshift-machine-config-operator namespace. Here's a small script you can use:
#!/usr/bin/env bash

set -xeuo

oc create secret generic etc-pki-entitlement \
 --namespace "openshift-machine-config-operator" \
 --from-file=entitlement.pem=<(oc get secret/etc-pki-entitlement -n openshift-config-managed -o go-template='{{index .data "entitlement.pem" | base64decode }}') \
 --from-file=entitlement-key.pem=<(oc get secret/etc-pki-entitlement -n openshift-config-managed -o go-template='{{index .data "entitlement-key.pem" | base64decode }}')
  1. Create a new MachineConfigPool and MachineOSConfig :
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
 name: layered
spec:
 machineConfigSelector:
   matchExpressions:
   - key: machineconfiguration.openshift.io/role
     operator: In
     values:
     - worker
     - layered
 nodeSelector:
   matchLabels:
     node-role.kubernetes.io/layered: ""
 paused: false
---
apiVersion: machineconfiguration.openshift.io/v1alpha1
kind: MachineOSConfig
metadata:
 name: layered
spec:
 buildInputs:
   baseImagePullSecret:
     name: global-pull-secret-copy
   containerFile:
   - containerfileArch: noarch
     content: |-
       FROM configs AS final
       RUN rpm-ostree install cowsay && \
         ostree container commit
   imageBuilder:
     imageBuilderType: PodImageBuilder
   renderedImagePushSecret:
     name: <add your registry's push secret here>
   renderedImagePushspec: <add your registry's pushspec here>
 buildOutputs:
   currentImagePullSecret:
     name: <add your registry's pull secret here>
 machineConfigPool:
   name: layered
  5. Watch for the machine-os-builder pod to start. Shortly afterward, the build pod should start and complete without any errors. Seeing the following lines in the build pod logs will verify that we've successfully ingested entitled content:
Enabled rpm-md repositories: rhel-9-for-x86_64-baseos-beta-rpms rhel-9-for-x86_64-appstream-beta-rpms
Updating metadata for 'rhel-9-for-x86_64-baseos-beta-rpms'...done
Updating metadata for 'rhel-9-for-x86_64-appstream-beta-rpms'...done
Importing rpm-md...done
rpm-md repo 'rhel-9-for-x86_64-baseos-beta-rpms'; generated: 2024-03-25T12:33:56Z solvables: 1816
rpm-md repo 'rhel-9-for-x86_64-appstream-beta-rpms'; generated: 2024-03-25T12:34:53Z solvables: 6972
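A quick way to sanity-check this output is that every `rpm-md repo` line reports a nonzero `solvables` count; a count of 0 means the repo metadata was fetched but no packages were resolvable (typically an entitlement mismatch). A small sketch of such a check in plain Python; `parse_solvables` is a hypothetical helper, not part of the test suite:

```python
import re

def parse_solvables(log_text):
    """Extract the solvables count reported for each rpm-md repo
    in build pod logs, keyed by repo name."""
    pattern = re.compile(r"rpm-md repo '([^']+)'.*solvables: (\d+)")
    return {name: int(count) for name, count in pattern.findall(log_text)}

log = """\
rpm-md repo 'rhel-9-for-x86_64-baseos-beta-rpms'; generated: 2024-03-25T12:33:56Z solvables: 1816
rpm-md repo 'rhel-9-for-x86_64-appstream-beta-rpms'; generated: 2024-03-25T12:34:53Z solvables: 6972
"""

counts = parse_solvables(log)
# All repos resolved packages, so the entitlement content was ingested.
assert all(n > 0 for n in counts.values()), f"empty repos: {counts}"
print(counts)
```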

- Description for the changelog
Enables RHEL entitlements in on-cluster layering

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 26, 2024

@cheesesashimi: This pull request references MCO-1100 which is a valid jira issue.

In response to this:


I took care to ensure that this does not break OKD by taking the following actions:

  • I observed that the addition of the /home/build/.local/share/containers volume mount to the build pod prevented the wait-for-done container to start when running on FCOS. With this in mind, I modified the build pod instantiation to not connect this volume mount to the wait-for-done container.
  • I added a TestOnClusterBuildsOnOKD e2e test which will only run against an OKD cluster. Conversely, I excluded other tests from running against an OKD cluster since those tests make assumptions about things that would only be present within an OCP cluster.

The difference between this PR and #4312 is that this one is based upon both the on-cluster layering PR (#4327) and the on-cluster layering e2e PR (#4328).

- How to verify it

Automated verification:

  1. Bring up a cluster where the secret etc-pki-entitlement exists in the openshift-config-managed namespace. If this secret is not present, TestEntitledBuilds and TestEntitledBuildsRollsOutImage will be skipped.
  2. Ensure that the OnClusterBuild feature-gate is enabled. The test suite will fail immediately if the feature-gate is not enabled.
  3. Run the tech preview e2e test suite: $ go test -count=1 -v ./test/e2e-techpreview/...

(Note: Because we have not landed #4284, the cleanup / teardown will delete the node and its underlying machine, causing the Machine API to provision a replacement node.)
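When the e2e suite streams build pod logs to disk, it writes to the directory named by the ARTIFACT_DIR env var (set in OpenShift CI) and falls back to the current directory otherwise. A minimal sketch of that fallback, in Python for illustration only (the real suite is Go, and `artifact_dir` is a hypothetical name):

```python
import os

def artifact_dir():
    """Where build pod logs would be written: ARTIFACT_DIR if set
    and non-empty, otherwise the current working directory."""
    return os.environ.get("ARTIFACT_DIR") or os.getcwd()

# In CI, ARTIFACT_DIR points at the collected-artifacts directory.
os.environ["ARTIFACT_DIR"] = "/tmp/artifacts"
print(artifact_dir())

# Outside CI, the env var is absent and we fall back to cwd.
del os.environ["ARTIFACT_DIR"]
print(artifact_dir() == os.getcwd())
```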


@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel-entitlements-with-ocb-api branch from c0d2223 to 7fdce62 Compare April 26, 2024 14:56
@cheesesashimi cheesesashimi marked this pull request as ready for review April 27, 2024 01:32
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 27, 2024
@openshift-merge-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sinnykumari
Contributor

sinnykumari commented May 2, 2024

I tried to run this on my local AWS cluster (4.16.0-ec.6), updating the MCO to the pre-built image quay.io/zzlotnik/machine-config-operator:ocl-api-and-rhel-entitlements. Everything went fine until fetching the RHEL beta repos, which appear to have resolved 0 packages.

time="2024-05-02T10:12:09Z" level=debug msg="setting uid"
time="2024-05-02T10:12:09Z" level=debug msg="Running &exec.Cmd{Path:\"/bin/sh\", Args:[]string{\"/bin/sh\", \"-c\", \"rm -rf /etc/rhsm-host && rpm-ostree install buildah && rpm-ostree install htop && ln -s /run/secrets/rhsm /etc/rhsm-host && ostree container commit\"}, Env:[]string{\"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\", \"HOSTNAME=49b0e7b30c97\", \"HOME=/root\"}, Dir:\"/\", Stdin:(*os.File)(0xc000068058), Stdout:(*os.File)(0xc000068060), Stderr:(*os.File)(0xc000068068), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(0xc000424000), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), Err:error(nil), Cancel:(func() error)(nil), WaitDelay:0, childIOFiles:[]io.Closer(nil), parentIOPipes:[]io.Closer(nil), goroutine:[]func() error(nil), goroutineErr:(<-chan error)(nil), ctxResult:(<-chan exec.ctxResult)(nil), createdByStack:[]uint8(nil), lookPathErr:error(nil)} (PATH = \"\")"
Enabled rpm-md repositories: rhel-9-for-x86_64-baseos-beta-rpms rhel-9-for-x86_64-appstream-beta-rpms
Updating metadata for 'rhel-9-for-x86_64-baseos-beta-rpms'...done
Updating metadata for 'rhel-9-for-x86_64-appstream-beta-rpms'...done
Importing rpm-md...done
rpm-md repo 'rhel-9-for-x86_64-baseos-beta-rpms'; generated: 2024-05-02T07:02:38Z solvables: 0
rpm-md repo 'rhel-9-for-x86_64-appstream-beta-rpms'; generated: 2024-05-02T07:16:36Z solvables: 0
error: Packages not found: buildah
subprocess exited with status 1

The applied MachineOSConfig was:

$ oc get machineosconfig infra -o yaml
apiVersion: machineconfiguration.openshift.io/v1alpha1
kind: MachineOSConfig
metadata:
  creationTimestamp: "2024-05-02T10:06:36Z"
  generation: 1
  name: infra
  resourceVersion: "99922"
  uid: 08138bd8-e0ef-4594-9881-2fabcf0819c2
spec:
  buildInputs:
    baseImagePullSecret:
      name: global-pull-secret-copy
    containerFile:
    - containerfileArch: noarch
      content: |-
        FROM configs AS final

        RUN rm -rf /etc/rhsm-host && \
        rpm-ostree install buildah && \
        rpm-ostree install htop && \
        ln -s /run/secrets/rhsm /etc/rhsm-host && \
        ostree container commit
    imageBuilder:
      imageBuilderType: PodImageBuilder
    renderedImagePushSecret:
      name: skumari-oclbot-pull-secret
    renderedImagePushspec: quay.io/skumari/ocl:latest
  machineConfigPool:
    name: infra

Did I miss something? I tried later with cowsay as well, but got the same result.

@sinnykumari
Contributor

Further testing worked. I commented in the PR where this will get merged: #4327 (comment)

@cheesesashimi
Member Author

These changes were incorporated into #4327, so this PR can be closed.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. qe-approved Signifies that QE has signed off on this PR