With the OpenShift Container Platform (OCP) release 4.1 in June 2019, Red Hat introduced Infrastructure MachineSets. These MachineSets create nodes that host only infrastructure components, such as:
- The default router
- The container image registry
- The cluster metrics collection, or monitoring service
- Cluster aggregated logging
An Infrastructure MachineSet consists of Machine resources (kind: Machine). These Machine resources spin up new virtual machines in your cloud.
Specific Kubernetes labels can be applied to these machines so that one or more of the infrastructure components listed above run only on them.
The kicker: The infrastructure nodes do not count towards the number of subscriptions that are required to run the environment!
Unleashing Worker Nodes
Worker nodes in the OCP cluster must be covered by subscriptions and their primary purpose is to run your application workloads.
To free resources from these worker nodes, which normally run the OCP infrastructure components, it is beneficial to move the infrastructure components to dedicated infrastructure nodes.
So let’s get started.
Creating an Infrastructure MachineSet for Production
For a production-ready deployment, deploy at least three MachineSets to run infrastructure components. The aggregated logging solution, Elasticsearch, requires three instances that run on different nodes. Because each MachineSet is assigned to exactly one availability zone of the (public) cloud provider, three MachineSets let you place those nodes in three different zones.
For demonstration purposes, we will limit the scope to only one MachineSet in the next section.
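For the production setup with three MachineSets, one option is to keep a single template and render one manifest per zone. The following is only a sketch: the template file and the placeholders `__ZONE__` and `__SUFFIX__` are hypothetical conventions chosen for this example, not something OpenShift provides.

```shell
#!/bin/sh
# Render one MachineSet manifest per GCP zone from a single template.
# Assumption (hypothetical, not part of OpenShift): the template marks
# zone-specific values with __ZONE__ (e.g. us-east1-b) and __SUFFIX__ (e.g. b).
render_per_zone() {
    template=$1   # path to the template file
    region=$2     # GCP region, e.g. us-east1
    outdir=$3     # directory that receives the rendered manifests
    shift 3       # remaining arguments are zone letters, e.g. b c d
    for suffix in "$@"; do
        sed -e "s/__ZONE__/${region}-${suffix}/g" \
            -e "s/__SUFFIX__/${suffix}/g" \
            "$template" > "${outdir}/machineset-infra-${suffix}.yaml"
    done
}
```

A call such as `render_per_zone machineset-template.yaml us-east1 . b c d` would write three files, each of which can then be created with `oc create -f`.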
Defining the MachineSet Custom Resource for the Google Cloud Platform
Once your OCP cluster is deployed to your Google Cloud Platform (GCP) project, you can create your first MachineSet to move infrastructure components. Sidenote: OCP 4.3 supports the installer-provisioned infrastructure (IPI) installation method for pre-existing Virtual Private Clouds (VPCs) and subnets. Choose the GCP region in which you deployed your OCP4 cluster. Then, select a GCP zone within that region to deploy the MachineSet.
Note: Double-check that the GCP zone actually exists. I tried to deploy to us-east1-a, which does not exist 😉 Unfortunately, no logs or events revealed this to me. Instead, a kind colleague showed me the light.
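Since nothing in the cluster reports a non-existent zone, it can help to check the zone name up front. A minimal sketch, assuming the `gcloud` CLI is installed and authenticated against your project; the function name `zone_exists` is made up for this example.

```shell
#!/bin/sh
# Succeeds only if the given zone name appears in the list of zones
# that the gcloud CLI reports.
zone_exists() {
    gcloud compute zones list --format="value(name)" | grep -qxF -- "$1"
}
```

For example, `zone_exists us-east1-a || echo "no such zone"` would have caught the mistake described above.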
Please find the YAML file machineset1.yaml defining the MachineSet below. Change the following values according to your environment:
- Replace the string myclus-khb5h with your OCP cluster ID
- region with the region your OCP cluster is in
- zone with an (existing 😉) GCP zone
- projectID with your GCP project ID
- serviceAccounts with your service account
- name must be unique in your OCP cluster
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: myclus-khb5h
        machine.openshift.io/cluster-api-machine-role: infra
        machine.openshift.io/cluster-api-machine-type: infra
      name: myclus-khb5h-infra-a
      namespace: openshift-machine-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: myclus-khb5h
          machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
      template:
        metadata:
          creationTimestamp: null
          labels:
            machine.openshift.io/cluster-api-cluster: myclus-khb5h
            machine.openshift.io/cluster-api-machine-role: infra
            machine.openshift.io/cluster-api-machine-type: infra
            machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
        spec:
          metadata:
            labels:
              node-role.kubernetes.io/infra: ""
          providerSpec:
            value:
              apiVersion: gcpprovider.openshift.io/v1beta1
              canIPForward: false
              credentialsSecret:
                name: gcp-cloud-credentials
              deletionProtection: false
              disks:
              - autoDelete: true
                boot: true
                image: myclus-khb5h-rhcos-image
                labels: null
                sizeGb: 128
                type: pd-ssd
              kind: GCPMachineProviderSpec
              machineType: n1-standard-4
              metadata:
                creationTimestamp: null
              networkInterfaces:
              - network: myclus-khb5h-network
                subnetwork: myclus-khb5h-worker-subnet
              projectID: marek-ocp4-blog
              region: us-east1
              serviceAccounts:
              - email: <your-service-account-email>  # replace with your service account's email address
                scopes:
                - https://www.googleapis.com/auth/cloud-platform
              tags:
              - myclus-khb5h-infra
              userDataSecret:
                name: worker-user-data
              zone: us-east1-b
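If you are unsure of your cluster ID, you can read it from the cluster and substitute it for the example ID in one step. This is a sketch under the assumption that you are logged in with `oc`; the helper names `cluster_infra_id` and `replace_cluster_id` are invented for illustration.

```shell
#!/bin/sh
# Print the infrastructure ID of the cluster the oc CLI is logged in to.
cluster_infra_id() {
    oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}'
}

# Replace the example ID myclus-khb5h with the given ID in a manifest.
# $1: path to the manifest, $2: your cluster ID; the result goes to stdout.
replace_cluster_id() {
    sed "s/myclus-khb5h/$2/g" "$1"
}
```

For instance, `replace_cluster_id machineset1.yaml "$(cluster_infra_id)" > machineset1-mine.yaml` writes an adapted copy. The region, zone, project ID, and service account still need to be changed by hand.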
Now that the YAML file is prepared, apply it to your cluster.
    oc create -f machineset1.yaml
    machineset.machine.openshift.io/myclus-khb5h-infra-a created
You can check that the resource is starting to be created.
    oc get machinesets -n openshift-machine-api
    NAME                   DESIRED   CURRENT   READY   AVAILABLE   AGE
    myclus-khb5h-infra-a   1         1                             7s
    myclus-khb5h-w-b       1         1         1       1           19h
    myclus-khb5h-w-c       1         1         1       1           19h
    myclus-khb5h-w-d       1         1         1       1           19h
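Instead of re-running the command by hand until READY is filled in, you could poll the MachineSet status. A rough sketch, assuming the `.spec.replicas` and `.status.readyReplicas` fields of the MachineSet; the function names and the 10-second interval are arbitrary choices for this example.

```shell
#!/bin/sh
# Succeeds when the MachineSet reports as many ready replicas as desired.
machineset_ready() {
    name=$1
    desired=$(oc get machineset "$name" -n openshift-machine-api \
        -o jsonpath='{.spec.replicas}')
    ready=$(oc get machineset "$name" -n openshift-machine-api \
        -o jsonpath='{.status.readyReplicas}')
    # readyReplicas is empty until the first machine is ready, so treat it as 0.
    [ -n "$desired" ] && [ "${ready:-0}" = "$desired" ]
}

# Poll every 10 seconds and give up after the given number of attempts.
wait_for_machineset() {
    name=$1
    attempts=${2:-60}
    while [ "$attempts" -gt 0 ]; do
        machineset_ready "$name" && return 0
        attempts=$((attempts - 1))
        sleep 10
    done
    return 1
}
```

With the example above, `wait_for_machineset myclus-khb5h-infra-a` returns once the READY column would show 1.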
Further insight on the creation process can be gained with oc describe machine.
It is important to note that the output should include the Events: section; if it does not, there is likely an error in the YAML file.
    oc describe machine myclus-khb5h-infra-a -n openshift-machine-api
    <output omitted>
    Status:
      Addresses:
        Address:     10.0.64.2
        Type:        InternalIP
        Address:     myclus-khb5h-infra-a-69j7n.us-east1-b.c.marek-ocp4-blog.internal
        Type:        InternalDNS
        Address:     myclus-khb5h-infra-a-69j7n.c.marek-ocp4-blog.internal
        Type:        InternalDNS
      Last Updated:  2020-04-28T14:46:43Z
      Phase:         Provisioned
      Provider Status:
        Conditions:
          Last Probe Time:       2020-04-28T14:46:23Z
          Last Transition Time:  2020-04-28T14:46:23Z
          Message:               machine successfully created
          Reason:                MachineCreationSucceeded
          Status:                True
          Type:                  MachineCreated
        Instance Id:             myclus-khb5h-infra-a-69j7n
        Instance State:          RUNNING
        Metadata:
          Creation Timestamp:  <nil>
    Events:
      Type     Reason        Age                From           Message
      ----     ------        ----               ----           -------
      Warning  FailedCreate  14m                gcpcontroller  requeue in: 20s
      Warning  FailedUpdate  14m (x4 over 14m)  gcpcontroller  requeue in: 20s
      Normal   Update        14m (x3 over 14m)  gcpcontroller  Updated Machine myclus-khb5h-infra-a-69j7n
Now, the GCP console shows the new instance.
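Besides the GCP console, you can confirm from the cluster side that the new node registered with the infra role label. A small sketch; the function name is made up, and the node may need a few minutes after the instance starts before it shows up.

```shell
#!/bin/sh
# Print the names of all nodes that carry the infra role label.
infra_nodes() {
    oc get nodes -l node-role.kubernetes.io/infra -o name | sed 's|^node/||'
}
```

An empty result means no node with the label `node-role.kubernetes.io/infra` has joined the cluster yet.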
Moving the Container Image Registry
To free resources from the worker node, let’s move the container image registry to the newly created infrastructure node.
Since the image registry resource already exists, we will edit the existing config/cluster object and add the nodeSelector to move the registry to our new infrastructure node.
    oc edit config/cluster
    # Add these two lines to the spec: section
    nodeSelector:
      node-role.kubernetes.io/infra: ""
    # Save and exit the file
    config.imageregistry.operator.openshift.io/cluster edited
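If you prefer a non-interactive change, for example in a script, the same nodeSelector can be set with a merge patch. This is a sketch rather than the method used above; the function names are hypothetical, and the fully qualified resource name is taken from the "edited" message in the output.

```shell
#!/bin/sh
# Merge patch that sets the infra node selector in the registry config.
infra_selector_patch() {
    printf '%s' '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
}

# Apply the patch to the image registry operator configuration.
move_registry_to_infra() {
    oc patch configs.imageregistry.operator.openshift.io/cluster \
        --type=merge --patch "$(infra_selector_patch)"
}
```

Calling `move_registry_to_infra` has the same effect as adding the two lines in the editor.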
Watch the resources being moved:
    watch -n 1 'oc get pods -n openshift-image-registry -o wide'
    # Following output is edited for brevity
    NAME                                            READY   STATUS    AGE   IP            NODE
    cluster-image-registry-operator-9754995-rg2n5   2/2     Running   21h   10.128.0.28   myclus-khb5h-m-2.c.marek-ocp4-blog.internal
    image-registry-75b4bd664f-rvrn5                 0/1     Pending   28s   <none>        <none>
    image-registry-dd874db66-29hzp                  1/1     Running   21h   10.128.2.4    myclus-khb5h-w-c-d94s4.c.marek-ocp4-blog.internal
After a few moments, the original image-registry pod will be removed.
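To confirm where the registry ended up, you can list only the registry pods together with their nodes. A minimal sketch, assuming the registry pods keep the `image-registry-` name prefix seen in the output above; the function name is invented for this example.

```shell
#!/bin/sh
# Print "pod node" pairs for the registry pods, skipping the operator pod
# and any other pods in the namespace.
registry_pod_nodes() {
    oc get pods -n openshift-image-registry --no-headers \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
        awk '$1 ~ /^image-registry-/ { print $1, $2 }'
}
```

Once the move is complete, the node name in the second column should be one of your infrastructure nodes.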
As you can see, after Infrastructure MachineSets have been created, existing OCP infrastructure components can be moved easily to the dedicated infrastructure nodes.
Next, try to move the cluster monitoring service, cluster aggregated logging, or the default router to your new infrastructure nodes.
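As a starting point for the default router, the sketch below patches the default IngressController so that its pods are placed on infra nodes. It assumes the `nodePlacement` field of the IngressController resource; the function names are hypothetical, and you should check the behavior against the documentation for your OCP version before relying on it.

```shell
#!/bin/sh
# Merge patch that pins the router pods to nodes with the infra role label.
router_placement_patch() {
    printf '%s' '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'
}

# Apply the patch to the default IngressController.
move_router_to_infra() {
    oc patch ingresscontroller/default -n openshift-ingress-operator \
        --type=merge --patch "$(router_placement_patch)"
}
```

The monitoring and logging components are configured differently and are not covered by this sketch.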
You are now ready to apply this technique in your new and existing OCP4 clusters.