
OpenShift Cheatsheet

A practical OpenShift knowledge base — cheat sheets, commands, troubleshooting tips, and admin notes for real-world cluster operations. The sidebar mirrors the folder structure of the openshift-cheatsheet repo.

OpenShift architecture overview

https://docs.openshift.com/container-platform/4.14/cli_reference/openshift_cli/developer-cli-commands.html


Login and Configuration


oc client download

Terminal window
export OCP_VERSION=latest-4.16
curl -k https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$OCP_VERSION/openshift-client-linux.tar.gz -o oc.tar.gz
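The download above leaves the archive unextracted. A minimal sketch of the remaining steps, assuming the tarball contains the `oc` and `kubectl` binaries and that `$HOME/bin` is on your PATH. `oc_client_url` and the `INSTALL_OC` guard are hypothetical names used only for this illustration:

```shell
# Hypothetical helper: build the client tarball URL for a release directory
# such as "latest-4.16".
oc_client_url() {
  printf 'https://mirror.openshift.com/pub/openshift-v4/clients/ocp/%s/openshift-client-linux.tar.gz\n' "$1"
}

# Download, unpack, and verify. Guarded by INSTALL_OC=1 (an assumption of this
# sketch) so that sourcing the snippet never touches the network by accident.
if [ "${INSTALL_OC:-0}" = "1" ]; then
  curl -fL "$(oc_client_url "${OCP_VERSION:-latest-4.16}")" -o oc.tar.gz
  mkdir -p "$HOME/bin"
  tar -xzf oc.tar.gz -C "$HOME/bin" oc kubectl
  "$HOME/bin/oc" version --client
fi
```

Unlike the command above, this sketch omits `-k`, so curl verifies the mirror's TLS certificate.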

oc Autocompletion

Terminal window
oc completion bash > /etc/bash_completion.d/oc_completion
echo 'source <(oc completion bash)' >> ~/.bashrc
source ~/.bashrc

Login with a user

Terminal window
oc login https://api.crc.testing:6443 -u developer -p developer

Login as system admin

Terminal window
oc login -u system:admin

User Information

Terminal window
oc whoami
oc whoami --show-console
oc whoami --show-server
oc cluster-info
oc cluster-info dump

View your configuration

Terminal window
oc config view

View your vSphere Credentials [https://access.redhat.com/solutions/6677901]

Terminal window
oc get secret vsphere-creds -o yaml -n kube-system
oc get cm cloud-provider-config -o yaml -n openshift-config
oc get infrastructures.config.openshift.io -o yaml

Fix vSphere Credentials [https://access.redhat.com/solutions/6677901]

Terminal window
oc get secret vsphere-creds -o yaml -n kube-system
oc patch kubecontrollermanager cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge

Update the current context to have users login to the desired namespace

Terminal window
oc config set-context `oc config current-context` --namespace=<project_name>

List OAuth Access Tokens

Terminal window
oc get useroauthaccesstokens

Useful Commands


List all Projects

Terminal window
oc get projects

Switch to a Project

Terminal window
oc project myproject

Get Resources in a Project

List all resources in the current project:

Terminal window
oc get all

List pods with extended output (node and pod IP):

Terminal window
oc get pods -o wide

Apply Configuration from a File

Terminal window
oc apply -f config.yaml

Create Objects Using Bash Here Documents

Create a ConfigMap directly using a here document:

Terminal window
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config
  namespace: myproject
data:
  key: value
EOF

Export Resources to a File

Terminal window
oc get deployment my-deployment -o yaml > deployment.yaml

Delete a Resource

Terminal window
oc delete pod my-pod

Debug a Pod

Start a debug session for a pod:

Terminal window
oc debug pod/my-pod

Check Cluster Status

Terminal window
oc status

View Cluster Nodes

Terminal window
oc get nodes

Describe a Node

Terminal window
oc describe node <node-name>

List nodes CPU/RAM

Terminal window
{
echo -e "NAME\tROLES\tCPU\tMEMORY"
paste \
<(oc get nodes --no-headers | awk '{print $1 "\t" $3}') \
<(oc get nodes --no-headers -o custom-columns=CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory)
} | column -t
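The MEMORY column above is reported in Ki (for example `16416940Ki`), which is awkward to compare at a glance. A small sketch that converts such values to GiB. `ki_to_gib` is a hypothetical helper name, and it assumes the quantity carries a `Ki` suffix:

```shell
# Convert a Kubernetes quantity expressed in Ki (e.g. "33554432Ki")
# to GiB with one decimal place.
ki_to_gib() {
  printf '%s\n' "$1" | awk '{ sub(/Ki$/, "", $1); printf "%.1f\n", $1 / 1048576 }'
}

# Usage against a live cluster (requires oc and a logged-in session):
# oc get nodes --no-headers -o custom-columns=NAME:.metadata.name,MEM:.status.capacity.memory |
#   while read -r name mem; do printf '%s\t%s GiB\n' "$name" "$(ki_to_gib "$mem")"; done
```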

View Nodes allocation

Terminal window
for i in $(oc get nodes --no-headers | awk '{print $1}'); do echo "==== $i ====";oc describe node $i 2> /dev/null | grep -A10 Allocated; echo; done
oc get nodes \
-o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory,EPHEMERAL:.status.capacity.ephemeral-storage,ALLOC_CPU:.status.allocatable.cpu,ALLOC_MEM:.status.allocatable.memory,ALLOC_EPHEMERAL:.status.allocatable.ephemeral-storage
oc get nodes --no-headers | awk '{print $1}' | while read -r n; do
echo "===== $n ====="
oc describe node "$n" | grep -E "^(Name:|Roles:|Capacity:|Allocatable:|Allocated resources:| +(cpu|memory|ephemeral-storage):)"
echo
done

View Nodes Taints

Terminal window
oc get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

View Nodes Rendered MachineConfig

Terminal window
for n in $(oc get nodes -l node-role.kubernetes.io/master -o name); do
echo -n "$n -> "
oc get $n -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig}{" | "}{.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig}{" | "}{.metadata.annotations.machineconfiguration\.openshift\.io/state}{"\n"}'
done

fstrim Nodes to free space

Terminal window
for i in $(oc get node -l '!node-role.kubernetes.io/master' -o name); do oc debug $i -- chroot /host fstrim -av; done

Get Logs for a Pod

Terminal window
oc logs my-pod

Follow Logs for a Pod

Terminal window
oc logs -f my-pod

Port Forward a Pod

Terminal window
oc port-forward my-pod 8080:80

Execute a Command in a Running Pod

Terminal window
oc exec my-pod -- ls /tmp

Scale a Deployment

Terminal window
oc scale deployment my-deployment --replicas=3

Create a New Application

Terminal window
oc new-app my-image-stream

List resource name by selector

Terminal window
oc get gw -A -o json | jq -r '.items[] | select(.spec.selector.istio == "backend-ingressgateway") | .metadata.name'

List nodeSelector per deployment

Terminal window
oc get deployments -A -o json | jq -r '.items[] | "\(.metadata.namespace)/\(.metadata.name): \(.spec.template.spec.nodeSelector)"'

Manage Kubeconfig Files

Switch kubeconfig contexts:

Terminal window
oc config use-context <context-name>

List all contexts:

Terminal window
oc config get-contexts

Set the default namespace for the current context:

Terminal window
oc config set-context --current --namespace=myproject

Merge multiple kubeconfig files:

Terminal window
KUBECONFIG=config1:config2:config3 oc config view --merge --flatten > merged-config
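`KUBECONFIG` takes a colon-separated list of files. A sketch that builds that list from its arguments and skips files that do not exist, so a stale path does not end up in the merge. `kubeconfig_list` is a hypothetical helper name:

```shell
# Join the existing files among the arguments into a colon-separated list.
# The function body runs in a subshell so its variables do not leak.
kubeconfig_list() (
  list=""
  for f in "$@"; do
    [ -f "$f" ] || continue          # skip missing files silently
    list="${list:+$list:}$f"
  done
  printf '%s\n' "$list"
)

# Usage (requires oc):
# KUBECONFIG="$(kubeconfig_list config1 config2 config3)" oc config view --merge --flatten > merged-config
```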

Create a new app from a GitHub Repository

Terminal window
oc new-app https://github.com/sclorg/cakephp-ex

New app from a different branch

Terminal window
oc new-app --name=html-dev nginx:1.10~https://github.com/joe-speedboat/openshift.html.devops.git#mybranch

Create objects from a file

Terminal window
oc create -f myobject.yaml -n myproject

Delete objects contained in a file

Terminal window
oc delete -f myobject.yaml -n myproject

Create or merge objects from a file

Terminal window
oc apply -f myobject.yaml -n myproject

Update existing object

Terminal window
oc patch svc mysvc --type merge --patch '{"spec":{"ports":[{"port": 8080, "targetPort": 5000}]}}'

Monitor Pod status

Terminal window
watch oc get pods

Get a Specific Item (podIP) using a Go template

Terminal window
oc get pod example-pod-2 --template='{{.status.podIP}}'

Gather information on a project’s pod deployment with node information

Terminal window
oc get pods -o wide

Hide inactive Pods

Terminal window
oc get pods --field-selector=status.phase=Running

Display all resources

Terminal window
oc get all,secret,configmap

Get the OpenShift Console Address

Terminal window
oc get -n openshift-console route console

Get the Pod name from the Selector and rsh into it

Terminal window
POD=$(oc get pods -l app=myapp -o name | head -n 1)
oc rsh "$POD"

Execute a single command in a running pod

Terminal window
oc exec "$POD" -- $COMMAND

Create a pod for the container image “fedora” and execute commands with it

Terminal window
oc run fedora-pod --image=fedora --restart=Never --command -- sleep infinity

Copy from local folder byteman-4.0.12 to Pod wildfly-basic-1-mrlt5 under the folder /opt/wildfly

Terminal window
oc cp ./byteman-4.0.12 wildfly-basic-1-mrlt5:/opt/wildfly

Create Infra MachineSets + Move router, registry, monitoring to infra nodes

Terminal window
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  annotations:
    machine.openshift.io/memoryMb: "32768"
    machine.openshift.io/vCPU: "8"
  labels:
    hive.openshift.io/machine-pool: worker
    hive.openshift.io/managed: "true"
    machine.openshift.io/cluster-api-cluster: ocp01-prod-hkhmm
  name: ocp01-prod-hkhmm-infra-0
  namespace: openshift-machine-api
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: ocp01-prod-hkhmm
      machine.openshift.io/cluster-api-machineset: ocp01-prod-hkhmm-infra-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: ocp01-prod-hkhmm
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: ocp01-prod-hkhmm-infra-0
    spec:
      lifecycleHooks: {}
      metadata:
        labels:
          node-role.kubernetes.io/infra: ""
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1beta1
          kind: VSphereMachineProviderSpec
          credentialsSecret:
            name: vsphere-cloud-credentials
          diskGiB: 150
          memoryMiB: 32768
          metadata:
            creationTimestamp: null
          network:
            devices:
            - networkName: 2245-AGOS-LAN-OCP01-PROD
          numCPUs: 8
          numCoresPerSocket: 1
          snapshot: ""
          template: ocp01-prod-hkhmm-rhcos-generated-region-generated-zone
          userDataSecret:
            name: worker-user-data
          workspace:
            datacenter: ACME
            datastore: /ACME/datastore/BT/LUN-BT-OPENSHIFT-250
            folder: /ACME/vm/AGOS_OCP_OCP01_PROD
            resourcePool: /ACME/host/ClusterLNX01/Resources
            server: agsvcs001.acme.it
      taints:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
---
oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge -p '{
"spec":{
"nodePlacement":{
"nodeSelector":{
"matchLabels":{
"node-role.kubernetes.io/infra":""
}
},
"tolerations":[
{
"key":"node-role.kubernetes.io/infra",
"operator":"Exists",
"effect":"NoSchedule"
}
]
}
}
}'
oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge -p '{
"spec":{
"replicas":3
}
}'
oc patch configs.imageregistry.operator.openshift.io/cluster --type=merge -p '{
"spec":{
"nodeSelector":{
"node-role.kubernetes.io/infra":""
},
"tolerations":[
{
"key":"node-role.kubernetes.io/infra",
"operator":"Exists",
"effect":"NoSchedule"
}
]
}
}'
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |+
    alertmanagerMain:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    prometheusK8s:
      retention: 7d
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
      volumeClaimTemplate:
        spec:
          storageClassName: thin
          resources:
            requests:
              storage: 100Gi
    prometheusOperator:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    metricsServer:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    k8sPrometheusAdapter:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    telemeterClient:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    openshiftStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    thanosQuerier:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
    monitoringPlugin:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
oc apply -f cluster-monitoring-configmap.yaml

Deployments

Manual deployment

Terminal window
oc rollout latest ruby-ex

Rollout a Deployment

Terminal window
oc rollout restart deployment/my-deployment

Pause a Deployment

Terminal window
oc rollout pause deployment/my-deployment

Resume a Deployment

Terminal window
oc rollout resume deployment/my-deployment

Scale a Deployment

Terminal window
oc scale deployment/my-deployment --replicas=3

Undo a Deployment Rollout

Terminal window
oc rollout undo deployment/my-deployment

Check Deployment History

Terminal window
oc rollout history deployment/my-deployment

Set Deployment Strategies

spec:
  strategy:
    type: Rolling
    rollingParams:
      intervalSeconds: 1
      updatePeriodSeconds: 1
      timeoutSeconds: 600
      maxUnavailable: 25%
      maxSurge: 25%

Define resource requests and limits in DeploymentConfig

Terminal window
oc set resources dc/nginx --limits=cpu=200m,memory=512Mi --requests=cpu=100m,memory=256Mi

Define livenessProbe and readinessProbe in DeploymentConfig

Terminal window
oc set probe dc/nginx --readiness --get-url=http://:8080/healthz --initial-delay-seconds=10
oc set probe dc/nginx --liveness --get-url=http://:8080/healthz --initial-delay-seconds=10

Scale the number of Pods to 2

Terminal window
oc scale dc/nginx --replicas=2

Define Horizontal Pod Autoscaler (HPA)

Terminal window
oc autoscale dc foo --min=2 --max=4 --cpu-percent=10

List Deployments and replicas per namespace (DR check)

Terminal window
kubectl get deploy,pod -A -o json | jq -r '
.items[]
| select(.metadata.namespace | test("^(openshift-|kube-|default$|registry$|istio|dyna|sentinel|turbo|zabbix|operator|cluster-management)")==false)
| if .kind=="Deployment" then
{
ns: .metadata.namespace,
deploys: 1,
desired: (.spec.replicas // 0),
available: (.status.availableReplicas // 0),
pods: 0,
notready: 0
}
elif .kind=="Pod"
and (.metadata.deletionTimestamp | not)
and (.status.phase == "Running" or .status.phase == "Pending") then
{
ns: .metadata.namespace,
deploys: 0,
desired: 0,
available: 0,
pods: 1,
notready: (
if ([.status.containerStatuses[]? | select(.ready==false)] | length) > 0
then 1 else 0 end
)
}
else
empty
end
' | jq -sr '
group_by(.ns)[]
| [
.[0].ns,
(map(.deploys) | add),
(map(.desired) | add),
(map(.available) | add),
(map(.pods) | add),
(map(.notready) | add)
]
| @tsv
' | (echo -e "NAMESPACE\tN_DEPLOY\tDESIRED_REPLICAS\tAVAILABLE_REPLICAS\tACTIVE_PODS\tPOD_NON_READY"; cat) | column -t -s $'\t'
---
alias k8s-ns-report='kubectl get deploy,pod -A -o json | jq -r "
.items[]
| select(.metadata.namespace | test(\"^(openshift-|kube-|default$|registry$|istio|dyna|sentinel|turbo|zabbix|operator|cluster-management)\")==false)
| if .kind==\"Deployment\" then
{
ns: .metadata.namespace,
deploys: 1,
desired: (.spec.replicas // 0),
available: (.status.availableReplicas // 0),
pods: 0,
notready: 0
}
elif .kind==\"Pod\"
and (.metadata.deletionTimestamp | not)
and (.status.phase == \"Running\" or .status.phase == \"Pending\") then
{
ns: .metadata.namespace,
deploys: 0,
desired: 0,
available: 0,
pods: 1,
notready: (
if ([.status.containerStatuses[]? | select(.ready==false)] | length) > 0
then 1 else 0 end
)
}
else
empty
end
" | jq -sr "
group_by(.ns)[]
| [
.[0].ns,
(map(.deploys) | add),
(map(.desired) | add),
(map(.available) | add),
(map(.pods) | add),
(map(.notready) | add)
]
| @tsv
" | (echo -e "NAMESPACE\tN_DEPLOY\tDESIRED_REPLICAS\tAVAILABLE_REPLICAS\tACTIVE_PODS\tPOD_NON_READY"; cat) | column -t -s $'\''\t'\'''

ConfigMaps

View ConfigMap Data

Terminal window
oc get configmap my-config -o yaml

Update a ConfigMap

Terminal window
oc create configmap my-config --from-literal=key=value --dry-run=client -o yaml | oc apply -f -

Managing Routes

Create a route

Terminal window
oc expose service ruby-ex

Create Route and expose it through a custom Hostname

Terminal window
oc expose service ruby-ex --hostname=<custom-hostname>

Read the Route Host attribute

Terminal window
oc get route my-route -o jsonpath --template="{.spec.host}"

Forward traffic from pod “myphp” from 8080 to local 8080

Terminal window
oc port-forward pod/myphp 8080:8080

Managing Services

Make a service idle. When the service is next accessed it will automatically boot up the pods again

Terminal window
oc idle ruby-ex

Read a Service IP

Terminal window
oc get services rook-ceph-mon-a --template='{{.spec.clusterIP}}'

Resource Usage

List the memory and CPU usage of all pods in the cluster

Terminal window
oc adm top pods -A --sum

List the resource usage of the containers in the pod “mypod” in the “example” namespace

Terminal window
oc adm top pods mypod -n example --containers

Resource consumption for the node

Terminal window
oc adm top node

List all resources, their status, and their types in the “example” namespace

Terminal window
oc get all -n example --show-kind

Displays the resource consumption for each container running on the node (requires “cri-tools”)

Terminal window
crictl stats

Clean up Non Running pods

Terminal window
oc get pods -A -o wide | grep -v 'Runn\|Comp'
oc get pods -A | grep -v 'Runn\|Comp' | grep openshift | awk 'system("oc delete pods "$2" -n "$1" --force --grace-period=0")'

Delete Completed Pods

Terminal window
oc delete pod --field-selector=status.phase==Succeeded --all-namespaces
oc get pods --all-namespaces | awk '{if ($4 == "Completed") system ("oc delete pod " $2 " -n " $1 )}'
read -p "Namespace: " ns; read -p "Status (e.g. Error, Completed): " status; oc get pods -n "$ns" --no-headers | awk -v s="$status" -v ns="$ns" '$3 == s { system("oc delete pod " $1 " -n " ns) }'
oc delete pod --field-selector=status.phase==Failed --all-namespaces
oc delete pod --field-selector=status.phase==Pending --all-namespaces
# Evicted is not a pod phase: evicted pods report status.phase=Failed, so the Failed selector above already removes them
oc get pods --all-namespaces --no-headers | awk '{if ($4 != "Running") system ("oc delete pod " $2 " -n " $1 )}'
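Because these one-liners delete as they go, it can help to separate selection from deletion and look at the list first. A sketch of that split, assuming the default `oc get pods -A --no-headers` column layout (namespace, name, ready, status, ...). `pods_with_status` is a hypothetical helper name:

```shell
# Read `oc get pods -A --no-headers` output on stdin and print "namespace pod"
# for every pod whose STATUS column (4th field) equals the requested status.
pods_with_status() {
  awk -v s="$1" '$4 == s { print $1, $2 }'
}

# Dry run: show what would be deleted.
# oc get pods -A --no-headers | pods_with_status Completed
#
# Delete for real once the list looks right:
# oc get pods -A --no-headers | pods_with_status Completed |
#   while read -r ns pod; do oc delete pod "$pod" -n "$ns"; done
```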

Change the image garbage collection (GC) thresholds

Modify kubelet GC settings:

Terminal window
oc label machineconfigpool worker custom-kubelet=enabled
cat <<EOF | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-config
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: enabled
  kubeletConfig:
    imageGCHighThresholdPercent: 70
    imageGCLowThresholdPercent: 60
EOF

Full cleanup with Podman

Run a full system prune:

Terminal window
sudo podman system prune -a -f

Delete all resources

Terminal window
oc delete all --all

Delete resources for one specific app

Terminal window
oc delete services -l app=ruby-ex
oc delete all -l app=ruby-ex

Prune old images from the internal registry

Keeping up to three tag revisions and resources younger than sixty minutes

Terminal window
oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m

Pruning every image that exceeds defined limits

Terminal window
oc adm prune images --prune-over-size-limit

Jobs

Create a simple Job

Terminal window
kubectl create job hello --image=alpine -- echo "Hello World"

Create a CronJob that prints “Hello World” every minute

Terminal window
kubectl create cronjob hello --image=alpine --schedule="*/1 * * * *" -- echo "Hello World"

Cluster

Make control-plane nodes unschedulable

Terminal window
oc patch schedulers.config.openshift.io/cluster --type merge --patch '{"spec":{"mastersSchedulable": false}}'

This removes the worker label from the masters. OpenShift components will move to worker nodes when rescheduled. Delete the pods to trigger reconciliation.


Set a Default Node Selector

Terminal window
oc patch namespace default -p '{"metadata": {"annotations": {"openshift.io/node-selector": "node-role.kubernetes.io/worker="}}}'

Disable Project-wide Node Selector

Terminal window
oc annotate namespace default openshift.io/node-selector-

Routers

Restart the router deployment

Terminal window
oc rollout -n openshift-ingress restart deployment/router-default

Delete router pods to force reconciliation

Terminal window
oc delete pod -n openshift-ingress -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default

RBAC

List role per groups

Terminal window
oc get rolebindings,clusterrolebindings --all-namespaces -o json | jq -r '
.items[] |
. as $binding |
.subjects[]? |
select(.kind == "Group") |
"NAMESPACE: \($binding.metadata.namespace // "Cluster-wide") KIND: \($binding.kind) NAME: \($binding.metadata.name) ROLE: \($binding.roleRef.name) GROUP: \(.name)"'

List all users/groups with cluster-admin rights

Terminal window
oc get clusterrolebindings -o json | jq -r '.items[] | select(.roleRef.name=="cluster-admin") | .subjects[]? | "\(.kind)/\(.name)"'

List all cluster-role / role

Terminal window
oc get clusterroles -o json | jq '.items[].metadata.name'
oc get roles -o json | jq '.items[].metadata.name'

Add a role to a user

Terminal window
oc adm policy add-role-to-user admin oia -n python

Add a cluster role to a user

Terminal window
oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:monitoring:default

Add a security context constraint (SCC) to a user

Terminal window
oc adm policy add-scc-to-user anyuid -z default

Verify user permission

Terminal window
oc auth can-i <verb> <resource> --as <user_to_impersonate> \
--as-group <group_to_impersonate>
oc auth can-i get pods -A \
--as system:serviceaccount:auth-tls:health-robot
oc auth can-i create project -A \
--as system:serviceaccount:auth-tls:health-robot
oc auth can-i get users -A \
--as admin-backdoor --as-group backdoor-administrators
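When several permissions need checking for the same subject, the `oc auth can-i` calls above can be wrapped in a loop. A sketch under the assumption that `oc auth can-i` prints `yes` or `no` and exits non-zero on `no`. `can_i_report` and the `OC_BIN` override are hypothetical names introduced for this illustration:

```shell
# OC_BIN can be overridden, e.g. to point at kubectl or at a stub for testing.
OC_BIN="${OC_BIN:-oc}"

# can_i_report SUBJECT "VERB RESOURCE"...
# Prints one tab-separated line per check: "<verb resource><TAB><yes|no>".
# The body runs in a subshell so its variables do not leak.
can_i_report() (
  subject=$1
  shift
  for check in "$@"; do
    # $check is left unquoted on purpose so "get pods" splits into verb and resource.
    answer=$("$OC_BIN" auth can-i $check -A --as "$subject" 2>/dev/null) || answer=no
    printf '%s\t%s\n' "$check" "$answer"
  done
)

# Usage (requires oc and permission to impersonate):
# can_i_report system:serviceaccount:auth-tls:health-robot "get pods" "create project"
```

Any failure of the underlying command, including a missing binary, is reported as `no`, which keeps the report conservative.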

Run a command as another user (impersonation)

Terminal window
oc get nodes --as admin

Show SCC and add policy

Terminal window
oc get pods -A -o custom-columns="NAME:.metadata.name,SCC:.metadata.annotations.openshift\.io/scc"
oc get pods -o custom-columns="NAME:.metadata.name,SECURITY_CONTEXT:.spec.securityContext"
oc get deployment <DEPLOY> -n <NAMESPACE> -o yaml | oc adm policy scc-subject-review -f -
oc get pod <POD> -o yaml | oc adm policy scc-subject-review -f -
oc adm policy add-scc-to-user hostmount-anyuid -z default
oc get scc -o custom-columns=Name:.metadata.name,Users:.users,Priority:.priority
oc get scc restricted-v2 -o custom-columns=SECCOMP_PROFILE:.seccompProfiles

Identity Providers

Add an HTPasswd Identity Provider

Create a secret with the htpasswd file:

Terminal window
oc create secret generic htpass-secret --from-file=htpasswd=/path/to/htpasswd -n openshift-config

Patch the OAuth resource to add the htpasswd provider:

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: my_htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret

Apply the configuration:

Terminal window
oc apply -f oauth.yaml

Add a GitHub Identity Provider

Create a secret holding the GitHub OAuth client secret:

Terminal window
oc create secret generic github-secret --from-literal=clientSecret=<your-client-secret> -n openshift-config

Patch the OAuth resource to add the GitHub provider:

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: github
    mappingMethod: claim
    type: GitHub
    github:
      clientID: <your-client-id>
      clientSecret:
        name: github-secret
      organizations:
      - my-org

Apply the configuration:

Terminal window
oc apply -f oauth.yaml

OpenShift Authentication / LDAP

1. Quick status

Terminal window
oc get co authentication console ingress
oc -n openshift-authentication get pods -o wide
oc get oauth cluster -o yaml

2. OAuth logs with LDAP/TLS/timeout errors

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
echo "### $p"
oc -n openshift-authentication logs "$p" --since=30m | \
egrep -i 'AuthenticationError|ldap|x509|tls|invalid credentials|claimed by identity|not found|no such object|timeout'
done

3. Test LDAPS from all OAuth pods (short version)

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
echo -n "$p -> "
oc -n openshift-authentication exec "$p" -- bash -lc '
timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" >/dev/null 2>&1 \
&& echo OK || echo FAIL
'
done
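The connectivity test in the loop above relies on bash's `/dev/tcp` pseudo-device. The same check can be packaged as a reusable function for running on a workstation or inside a debug shell. This is a sketch that assumes `bash` and the coreutils `timeout` command are available where it runs. `tcp_check` is a hypothetical helper name:

```shell
# Print OK if a TCP connection to host:port opens within 5 seconds, FAIL otherwise.
# The connection is attempted by a child bash, so the caller may be any POSIX shell.
tcp_check() {
  if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
    echo OK
  else
    echo FAIL
  fi
}

# Usage:
# tcp_check 10.213.48.178 636
```

A successful result only shows that the port accepts TCP connections. It says nothing about the TLS handshake or the LDAP bind.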

4. Test LDAPS from all OAuth pods (extended version)

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
echo "### $p"
oc -n openshift-authentication exec "$p" -- bash -lc '
echo "HOST=$(hostname)"
getent hosts msad1.cariprpc.it || true
cat /etc/resolv.conf
echo -n "TCP636: "
timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" && echo OK || echo FAIL
'
done

5. Test LDAPS via DNS instead of IP

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
echo -n "$p -> "
oc -n openshift-authentication exec "$p" -- bash -lc '
timeout 5 bash -c "cat < /dev/null > /dev/tcp/msad1.cariprpc.it/636" >/dev/null 2>&1 \
&& echo OK || echo FAIL
'
done

6. Verify the OAuth route certificate

Terminal window
HOST=$(oc -n openshift-authentication get route oauth-openshift -o jsonpath='{.spec.host}')
echo "$HOST"
openssl s_client -connect ${HOST}:443 -servername ${HOST} </dev/null 2>/dev/null | \
openssl x509 -noout -subject -issuer -dates
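The `-dates` output above includes a `notAfter=` line. To turn that into a number of days remaining, something like the following sketch may help. It assumes GNU `date` (for `-d`), and `days_until` is a hypothetical helper name. The current time is passed in explicitly so the arithmetic is easy to verify:

```shell
# days_until "<openssl notAfter date>" <now-in-epoch-seconds>
# Prints the whole number of days between now and the given date.
days_until() (
  end=$(date -u -d "$1" +%s) || exit 1
  echo $(( ($end - $2) / 86400 ))
)

# Usage with the route certificate:
# end=$(openssl s_client -connect "${HOST}:443" -servername "${HOST}" </dev/null 2>/dev/null |
#         openssl x509 -noout -enddate | cut -d= -f2)
# days_until "$end" "$(date +%s)"
```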

7. Verify LDAP reachability from the master nodes (host network)

Terminal window
for n in ocpapp-dr-g5t4w-master-0 ocpapp-dr-g5t4w-master-1 ocpapp-dr-g5t4w-master-2; do
echo "### $n"
oc debug node/$n -- chroot /host bash -lc '
echo -n "NODE=$(hostname) TCP636: "
timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" && echo OK || echo FAIL
echo -n "ROUTE: "
ip route get 10.213.48.178 2>/dev/null || true
'
done

8. NetworkPolicy in the openshift-authentication namespace

Terminal window
oc get netpol -n openshift-authentication -o yaml

9. Targeted restart of a single OAuth pod

Terminal window
oc delete pod -n openshift-authentication <oauth-openshift-pod>
oc -n openshift-authentication get pods -w

Images

List All Images in the Cluster

Terminal window
oc get images

Import an Image from an External Registry

Terminal window
oc import-image myimage:latest --from=docker.io/library/myimage:latest --confirm

Tag an Image for Internal Use

Terminal window
oc tag myimage:latest myproject/myimage:stable

Prune Unused Images

Terminal window
oc adm prune images --confirm

Build an Image from Source Code

Terminal window
oc new-build https://github.com/openshift/ruby-hello-world.git --name=ruby-app

Start a Build

Terminal window
oc start-build ruby-app

Monitor Build Logs

Terminal window
oc logs -f bc/ruby-app

Deploy an Image

Terminal window
oc new-app myimage:stable -n myproject

Image Registry

Restart the image registry deployment

Terminal window
oc rollout -n openshift-image-registry restart deploy/image-registry

Delete image registry pods

Terminal window
oc delete pod -n openshift-image-registry -l docker-registry=default

Monitoring Stack

Restart the monitoring deployments and statefulsets

Terminal window
oc rollout -n openshift-monitoring restart statefulset/alertmanager-main
oc rollout -n openshift-monitoring restart statefulset/prometheus-k8s
oc rollout -n openshift-monitoring restart deployment/grafana
oc rollout -n openshift-monitoring restart deployment/kube-state-metrics
oc rollout -n openshift-monitoring restart deployment/openshift-state-metrics
oc rollout -n openshift-monitoring restart deployment/prometheus-adapter
oc rollout -n openshift-monitoring restart deployment/telemeter-client
oc rollout -n openshift-monitoring restart deployment/thanos-querier

Delete monitoring stack pods to force reconciliation

Terminal window
oc delete pod -n openshift-monitoring -l app=alertmanager
oc delete pod -n openshift-monitoring -l app=prometheus
oc delete pod -n openshift-monitoring -l app=grafana
oc delete pod -n openshift-monitoring -l app.kubernetes.io/name=kube-state-metrics
oc delete pod -n openshift-monitoring -l k8s-app=openshift-state-metrics
oc delete pod -n openshift-monitoring -l name=prometheus-adapter
oc delete pod -n openshift-monitoring -l k8s-app=telemeter-client
oc delete pod -n openshift-monitoring -l app.kubernetes.io/component=query-layer

List All Container Images

List all container images running in a cluster

Terminal window
oc get pods -A -o go-template --template='{{range .items}}{{range .spec.containers}}{{printf "%s\n" .image}}{{end}}{{end}}' | sort -u

List all container images stored in a cluster

Terminal window
for node in $(oc get nodes -o name); do
oc debug ${node} -- chroot /host sh -c 'crictl images -o json' 2>/dev/null | jq -r '.images[].repoTags[]';
done | sort -u
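Either image list can be summarised per registry, which is useful when planning a mirror or checking for unexpected sources. A sketch that treats everything before the first `/` as the registry host and ignores references without a `/`. `images_per_registry` is a hypothetical helper name:

```shell
# Read image references on stdin (one per line) and print "<count> <registry>",
# most frequent registry first. References without a "/" are skipped.
images_per_registry() {
  awk -F/ 'NF > 1 { count[$1]++ } END { for (r in count) print count[r], r }' |
    sort -k1,1nr -k2
}

# Usage (requires oc):
# oc get pods -A -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' |
#   sort -u | images_per_registry
```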

Cluster Upgrade

Terminal window
oc get clusterversion
oc adm upgrade
oc patch clusterversion version --type merge -p '{"spec":{"channel":"stable-4.14"}}'
oc adm upgrade --to=4.14.10
watch oc get clusterversion
oc get co
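The channel in the patch above has to match the minor version of the upgrade target. A small sketch that derives the channel name from a full version string. It assumes the `stable-<major>.<minor>` naming used above, and `channel_for` is a hypothetical helper name:

```shell
# Derive "stable-<major>.<minor>" from a version such as 4.14.10
# by dropping the last dot-separated component.
channel_for() {
  printf 'stable-%s\n' "${1%.*}"
}

# Usage (requires oc):
# target=4.14.10
# oc patch clusterversion version --type merge -p "{\"spec\":{\"channel\":\"$(channel_for "$target")\"}}"
# oc adm upgrade --to="$target"
```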

Switch Cluster Version Channel

Terminal window
oc patch \
--patch='{"spec": {"channel": "prerelease-4.1"}}' \
--type=merge \
clusterversion/version

Unmanage Operators

Retrieve current overrides

Terminal window
oc get -o json clusterversion version | jq .spec.overrides

Add a ComponentOverride to set the network operator unmanaged

  1. Extract the operator definition:

    Terminal window
    head -n5 /tmp/mystuff/0000_07_cluster-network-operator_03_daemonset.yaml

    Example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: network-operator
      namespace: openshift-network-operator
  2. Create the patch YAML file. If no overrides exist:

    - op: add
      path: /spec/overrides
      value:
      - kind: Deployment
        group: apps
        name: network-operator
        namespace: openshift-network-operator
        unmanaged: true

    If overrides already exist, append a single entry:

    - op: add
      path: /spec/overrides/-
      value:
        kind: Deployment
        group: apps
        name: network-operator
        namespace: openshift-network-operator
        unmanaged: true
  3. Apply the patch:

    Terminal window
    oc patch clusterversion version --type json -p "$(cat version-patch.yaml)"
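The two patch variants differ only in the path and in whether the value is a list or a single entry. A sketch that generates the matching JSON Patch directly, so no intermediate file is needed. `override_patch` is a hypothetical helper name, and the first argument says whether `.spec.overrides` already exists:

```shell
# Print a JSON Patch that marks the network-operator Deployment as unmanaged.
# $1 = "true" when .spec.overrides already exists, anything else otherwise.
override_patch() (
  entry='{"kind":"Deployment","group":"apps","name":"network-operator","namespace":"openshift-network-operator","unmanaged":true}'
  if [ "$1" = "true" ]; then
    # Append one entry to the existing list.
    printf '[{"op":"add","path":"/spec/overrides/-","value":%s}]\n' "$entry"
  else
    # Create the list with one entry.
    printf '[{"op":"add","path":"/spec/overrides","value":[%s]}]\n' "$entry"
  fi
)

# Usage (requires oc):
# oc patch clusterversion version --type json -p "$(override_patch false)"
```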

Verify

Terminal window
oc get -o json clusterversion version | jq .spec.overrides

Disabling the Cluster Version Operator

Terminal window
oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator

Machine Config

List all MachineConfig objects

Terminal window
oc get machineconfigs

View details of a specific MachineConfig

Terminal window
oc describe machineconfig <machineconfig-name>

Create a custom MachineConfig

Example YAML:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: custom-config
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/mycustomconfig
        contents:
          source: data:,custom%20content%20here

Apply the configuration:

Terminal window
oc apply -f custom-config.yaml

Update Kubelet Configuration

Create a KubeletConfig:

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: enabled
  kubeletConfig:
    cpuManagerPolicy: "static"
    cpuManagerReconcilePeriod: "5s"

Apply the configuration:

Terminal window
oc apply -f kubelet-config.yaml

Update MCP maxUnavailable

Terminal window
oc patch --type merge machineconfigpool/<machineconfigpool> -p '{"spec":{"maxUnavailable":<value>}}'

Pause/Unpause MCP

Terminal window
oc patch mcp/<mcp_name> --patch '{"spec":{"paused":true}}' --type=merge
oc patch mcp/<mcp_name> --patch '{"spec":{"paused":false}}' --type=merge

Resize Control Plane Machines (CPU/memory) via the ControlPlaneMachineSet

Terminal window
oc patch controlplanemachineset.machine.openshift.io cluster -n openshift-machine-api --type=merge -p '{"spec":{"template":{"machines_v1beta1_machine_openshift_io":{"spec":{"providerSpec":{"value":{"numCPUs":8,"memoryMiB":32768}}}}}}}'

Monitoring

List Monitoring Stack Components

Terminal window
oc get pods -n openshift-monitoring

Restart a Monitoring Component

Terminal window
oc rollout restart deployment/grafana -n openshift-monitoring

Silence Alerts

Create a silence using the Alertmanager UI or CLI. Example CLI:

Terminal window
amtool silence add alertname="TargetDown" instance="example-instance"

Query Prometheus

Access the Prometheus UI or use oc to query:

Terminal window
oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl 'http://localhost:9090/api/v1/query?query=up'

Enable User Workload Monitoring

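Patch the config to enable it. This patch replaces the whole config.yaml key, so merge the setting by hand if the ConfigMap already holds other settings: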

Terminal window
oc patch configmap cluster-monitoring-config -n openshift-monitoring --patch='{"data":{"config.yaml":"enableUserWorkload: true"}}'

Monitor Custom Metrics

Deploy a custom application exposing metrics and configure Prometheus to scrape them by creating a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: custom-app-monitor
  labels:
    team: custom-app
spec:
  selector:
    matchLabels:
      app: custom-app
  endpoints:
    - port: metrics

Apply the configuration:

Terminal window
oc apply -f custom-app-monitor.yaml

OVN

OpenShift OVN-Kubernetes

1. Quick status of the OVN pods on the masters

Terminal window
oc get pods -n openshift-ovn-kubernetes -o wide | \
egrep 'ovnkube-node|ovnkube-control-plane|master-0|master-1|master-2'

2. Container status of the ovnkube-node pods

Terminal window
for p in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do
  echo "=== $p ==="
  oc get pod -n openshift-ovn-kubernetes "$p" -o json | \
    jq -r '.spec.nodeName, (.status.containerStatuses[] | "\(.name)=\(.ready)")'
done
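
With many nodes, the loop output gets long. As a sketch, the filter below reads that output (the `=== pod ===` headers followed by `name=true|false` lines) and prints only the containers that are not ready. `ovn_not_ready` is a hypothetical helper name.

```shell
# Read the "=== pod ===" / "container=ready" lines from stdin and print
# "<pod> <container>" for every container that reports ready=false.
ovn_not_ready() {
  awk '
    /^=== /   { pod = $2; next }                      # header line: remember the pod name
    /=false$/ { sub(/=false$/, ""); print pod, $0 }   # container that is not ready
  '
}

# Usage (not executed here): pipe the loop from step 2 into the filter:
#   for p in ...; do ...; done | ovn_not_ready
```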

3. Recent OVN events

Terminal window
oc get events -n openshift-ovn-kubernetes --sort-by=.lastTimestamp | tail -100

4. Useful ovnkube-node logs on the 3 masters

Terminal window
for p in ovnkube-node-kd568 ovnkube-node-tcr28 ovnkube-node-ms268; do
  echo "### $p : ovn-controller"
  oc logs -n openshift-ovn-kubernetes $p -c ovn-controller --since=2h | \
    egrep -i 'error|warn|timeout|conntrack|openflow|geneve|health|route|gateway|mtu'
  echo "### $p : ovnkube-controller"
  oc logs -n openshift-ovn-kubernetes $p -c ovnkube-controller --since=2h | \
    egrep -i 'error|warn|timeout|egress|route|gateway|management port'
done

5. OVN control plane logs

Terminal window
for p in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-control-plane -o name); do
  echo "### $p"
  oc logs -n openshift-ovn-kubernetes "$p" -c ovnkube-cluster-manager --since=2h | \
    egrep -i 'error|warn|timeout|master|node|egress|route|gateway'
done

6. PodNetworkConnectivityCheck towards the masters

Terminal window
for x in \
  network-check-source-ocpapp-dr-g5t4w-worker-0-t56f7-to-network-check-target-ocpapp-dr-g5t4w-master-0 \
  network-check-source-ocpapp-dr-g5t4w-worker-0-t56f7-to-network-check-target-ocpapp-dr-g5t4w-master-1 \
  network-check-source-ocpapp-dr-g5t4w-worker-0-t56f7-to-network-check-target-ocpapp-dr-g5t4w-master-2
do
  echo "### $x"
  oc get podnetworkconnectivitycheck -n openshift-network-diagnostics "$x" -o yaml | \
    sed -n '/status:/,$p'
done

7. Host-network test from the masters to LDAP

Terminal window
for n in ocpapp-dr-g5t4w-master-0 ocpapp-dr-g5t4w-master-1 ocpapp-dr-g5t4w-master-2; do
  echo "### $n"
  oc debug node/$n -- chroot /host bash -lc '
    echo -n "NODE=$(hostname) TCP636: "
    timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" && echo OK || echo FAIL
    echo -n "ROUTE: "
    ip route get 10.213.48.178 2>/dev/null || true
  '
done

8. Pod-network test from the OAuth pods to LDAP

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
  echo -n "$p -> "
  oc -n openshift-authentication exec "$p" -- bash -lc '
    timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" >/dev/null 2>&1 \
    && echo OK || echo FAIL
  '
done
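
Steps 7 and 8 use the same probe: open a TCP connection through bash's `/dev/tcp` pseudo-device under a timeout. A minimal sketch of that probe as a reusable function follows. `tcp_check` is a hypothetical name, and it assumes `bash` and coreutils `timeout` are available where it runs.

```shell
# Print OK if a TCP connection to host:port succeeds within the timeout, else FAIL.
# Relies on bash's /dev/tcp pseudo-device and on coreutils timeout.
tcp_check() {
  host=$1
  port=$2
  secs=${3:-5}    # timeout in seconds, defaults to 5
  if timeout "$secs" bash -c "cat < /dev/null > /dev/tcp/$host/$port" 2>/dev/null; then
    echo OK
  else
    echo FAIL
  fi
}

# Usage (not executed here), with the LDAP endpoint from the steps above:
#   tcp_check 10.213.48.178 636
```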

9. Targeted restart of ovnkube-node on the problematic masters

Master-0

Terminal window
oc delete pod -n openshift-ovn-kubernetes ovnkube-node-kd568
oc get pod -n openshift-ovn-kubernetes -w | grep ovnkube-node-kd568

Master-1

Terminal window
oc delete pod -n openshift-ovn-kubernetes ovnkube-node-tcr28
oc get pod -n openshift-ovn-kubernetes -w | grep ovnkube-node-tcr28

10. Retest after the OVN restart

Terminal window
for p in $(oc -n openshift-authentication get pod -l app=oauth-openshift -o name); do
  echo -n "$p -> "
  oc -n openshift-authentication exec "$p" -- bash -lc '
    timeout 5 bash -c "cat < /dev/null > /dev/tcp/10.213.48.178/636" >/dev/null 2>&1 \
    && echo OK || echo FAIL
  '
done

Operator-Lifecycle-Manager (OLM)

List Installed Operators

Terminal window
oc get csv -n openshift-operators
oc get csv -A --no-headers -o custom-columns=NAME:.metadata.name,DISPLAY:.spec.displayName,VERSION:.spec.version | sort | uniq

Install an Operator

Create a subscription for the operator:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-operator
  namespace: openshift-operators
spec:
  channel: stable
  name: my-operator
  source: operatorhubio-catalog
  sourceNamespace: openshift-marketplace

Apply the subscription:

Terminal window
oc apply -f subscription.yaml

Check the Status of an Operator

Terminal window
oc get csv -n openshift-operators

Uninstall an Operator

Delete the subscription and CSV:

Terminal window
oc delete subscription my-operator -n openshift-operators
oc delete csv my-operator.v1.0.0 -n openshift-operators

Approve a Manual InstallPlan

Terminal window
oc patch installplan install-xxxxx -n openshift-operators --type merge --patch '{"spec": {"approved": true}}'

View Operator Logs

Find the operator’s pod and view logs:

Terminal window
oc get pods -n openshift-operators
oc logs my-operator-pod -n openshift-operators

Create a Custom Resource for an Operator

Example YAML:

apiVersion: app.example.com/v1
kind: ExampleApp
metadata:
  name: example-app
  namespace: myproject
spec:
  size: 3

Apply the custom resource:

Terminal window
oc apply -f example-app.yaml

Check Operator Conditions

Terminal window
oc get csv my-operator.v1.0.0 -n openshift-operators -o jsonpath='{.status.conditions}'

List Available Operators in the Marketplace

Terminal window
oc get packagemanifests -n openshift-marketplace

Describe a Specific Operator

Terminal window
oc describe packagemanifest my-operator -n openshift-marketplace

Update an Operator Subscription

Terminal window
oc patch subscription my-operator -n openshift-operators --type merge --patch '{"spec": {"channel": "stable"}}'

Routers

Restart a Router

Restart the default router deployment:

Terminal window
oc rollout restart deployment/router-default -n openshift-ingress

List Router Pods

Terminal window
oc get pods -n openshift-ingress -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default

Delete Router Pods to Trigger Reconciliation

Terminal window
oc delete pod -n openshift-ingress -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default

Check Router Logs

Terminal window
oc logs -n openshift-ingress pod/router-default-xxxxx

Expose a Route

Expose a service using a route:

Terminal window
oc expose service my-service --hostname=my.custom.domain

List Routes

Terminal window
oc get routes -A

Storage

List Persistent Volume Claims (PVCs)

Terminal window
oc get pvc -A

Describe a PVC

Terminal window
oc describe pvc my-pvc

Create a PVC

Example YAML for a PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Apply the PVC:

Terminal window
oc apply -f pvc.yaml

List Storage Classes

Terminal window
oc get storageclass

Set Default Storage Class

Terminal window
oc patch storageclass <storage-class-name> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

Delete a PVC

Terminal window
oc delete pvc my-pvc

Create a Persistent Volume (PV)

Example YAML for a PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data

Apply the PV:

Terminal window
oc apply -f pv.yaml

Expand a PVC

Ensure the underlying storage class supports expansion. Then patch the PVC:

Terminal window
oc patch pvc my-pvc -p '{"spec": {"resources": {"requests": {"storage": "20Gi"}}}}'

Pull Secrets

Create a Pull Secret

Create a pull secret to authenticate with an external container registry:

Terminal window
oc create secret docker-registry my-pull-secret \
  --docker-server=<registry-server> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email>

Link the Pull Secret to a ServiceAccount

Terminal window
oc secrets link default my-pull-secret --for=pull

View Linked Secrets for a ServiceAccount

Terminal window
oc get serviceaccount default -o yaml

Update the Global Pull Secret

  1. Edit the pull secret:
    Terminal window
    oc edit secret pull-secret -n openshift-config
  2. Add the credentials for the desired registry in the auths section.
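
Each entry in the `auths` section maps a registry hostname to an `auth` value, which is the base64 encoding of `username:password`. The sketch below generates one such entry. `registry_auth_entry` is a hypothetical helper name and the registry and credentials are placeholders.

```shell
# Print one "auths" entry for a registry; the auth value is base64("user:password").
registry_auth_entry() {
  registry=$1
  user=$2
  password=$3
  auth=$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')
  printf '"%s":{"auth":"%s"}\n' "$registry" "$auth"
}

# Example with placeholder credentials:
#   registry_auth_entry quay.io user pass
```

The printed fragment can be pasted inside the existing `auths` object while editing the secret.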

Add Credentials to a Namespaced Secret

Create a new secret with updated credentials:

Terminal window
oc create secret docker-registry my-namespace-pull-secret \
--docker-server=<registry-server> \
--docker-username=<username> \
--docker-password=<password> \
--docker-email=<email> -n mynamespace

Link the new secret to a service account:

Terminal window
oc secrets link my-serviceaccount my-namespace-pull-secret --for=pull -n mynamespace

View secret in STDOUT:

Terminal window
oc extract secret my-namespace-pull-secret -n mynamespace --to=-

Registries

List Images in the Internal Registry

Terminal window
oc get is -A

Expose the Internal Registry Externally

Terminal window
oc patch configs.imageregistry.operator.openshift.io/cluster \
--type merge \
--patch '{"spec":{"defaultRoute":true}}'

Retrieve the route:

Terminal window
oc get route default-route -n openshift-image-registry

Mirror an External Image to the Internal Registry

Terminal window
oc image mirror docker.io/library/nginx:latest \
image-registry.openshift-image-registry.svc:5000/myproject/nginx:latest

Set Registry Resource Limits

Terminal window
oc patch configs.imageregistry.operator.openshift.io/cluster \
--type merge \
--patch '{"spec":{"resources":{"requests":{"memory":"1Gi"},"limits":{"memory":"2Gi"}}}}'

Prune Old Images

Terminal window
oc adm prune images --confirm

Force Garbage Collection on the Internal Registry

Terminal window
oc patch configs.imageregistry.operator.openshift.io/cluster \
--type merge \
--patch '{"spec":{"managementState":"Managed"}}'

Run garbage collection:

Terminal window
oc exec -n openshift-image-registry -it $(oc get pods -n openshift-image-registry -l docker-registry=default -o jsonpath='{.items[0].metadata.name}') -- registry garbage-collect /config.yml

OpenShift Container Platform Troubleshooting

Inspect all resources in a namespace

Terminal window
oc adm inspect ns/mynamespace

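Run cluster diagnostics (OpenShift 3.x only)

Terminal window
# oc adm diagnostics is not available in OpenShift 4; use oc adm must-gather or oc adm inspect there
oc adm diagnostics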

Collect must-gather

Terminal window
oc adm must-gather

Check status of the current project

Terminal window
oc status

Get events for a project sorted by timestamp

Terminal window
oc get events --sort-by=.metadata.creationTimestamp
oc get events --sort-by='.lastTimestamp'

Get events of type Warning

Terminal window
oc get ev --field-selector type=Warning -o jsonpath='{range .items[*]}{.message}{"\n"}{end}'

Logs management

Get the logs of a specific pod

Terminal window
oc logs myrunning-pod-2-fdthn

Follow the logs of a specific pod

Terminal window
oc logs -f myrunning-pod-2-fdthn

Tail the logs of a specific pod

Terminal window
oc logs myrunning-pod-2-fdthn --tail=50

Check the integrated Docker registry logs

Terminal window
oc logs docker-registry-n-{xxxxx} -n default | less

Start a temporary debug pod on a node

Terminal window
oc debug node/master01

Troubleshooting

Check Cluster Status

Terminal window
oc status

View Cluster Events

Terminal window
oc get events -A --sort-by=.metadata.creationTimestamp

Check Pod Logs

Terminal window
oc logs pod-name

Follow logs for a pod:

Terminal window
oc logs -f pod-name

Debug a Pod

Start a debug session:

Terminal window
oc debug pod/pod-name

Inspect a Node

Terminal window
oc debug node/node-name

Restart a Deployment

Terminal window
oc rollout restart deployment/deployment-name

Check Network Connectivity from a Pod

Use a debug pod to check connectivity:

Terminal window
oc run debug-pod --image=registry.access.redhat.com/ubi8/ubi --restart=Never --command -- sleep infinity
oc exec -it debug-pod -- curl -v http://service-name:port

Diagnose DNS Issues

Check if DNS resolution works:

Terminal window
oc exec -it pod-name -- nslookup service-name

View Resource Usage

View node resource usage:

Terminal window
oc adm top nodes

View pod resource usage:

Terminal window
oc adm top pods -A

Describe Resources

Describe a pod:

Terminal window
oc describe pod pod-name

Describe a node:

Terminal window
oc describe node node-name

Collect Cluster Diagnostics (OpenShift 3.x only)

Terminal window
# oc adm diagnostics is not available in OpenShift 4; use must-gather there
oc adm diagnostics

Use Must-Gather

Collect diagnostics using must-gather:

Terminal window
oc adm must-gather

Check Image Registry Logs

Terminal window
oc logs -n openshift-image-registry deployment/image-registry

Analyze CrashLoopBackOff

Check the previous logs for a pod:

Terminal window
oc logs --previous pod-name

Debug with a Temporary Namespace

Create a temporary debug namespace:

Terminal window
oc new-project debug-namespace

Delete it when done:

Terminal window
oc delete project debug-namespace

Reset a Node

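Drain the node, reboot it, then make it schedulable again:

Terminal window
oc adm drain node-name --ignore-daemonsets --force
oc debug node/node-name -- chroot /host systemctl reboot
oc adm uncordon node-name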

Kubelet Status on the Nodes

Terminal window
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
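
The command prints one `name<TAB>status` line per node. As a sketch, the filter below keeps only the nodes whose Ready condition is not `True`. `nodes_not_ready` is a hypothetical helper name.

```shell
# Read "name<TAB>ReadyStatus" lines from stdin and print the names of
# nodes whose Ready condition is anything other than True.
nodes_not_ready() {
  awk -F '\t' '$2 != "True" { print $1 }'
}

# Usage (not executed here): pipe the jsonpath command above into the filter:
#   oc get nodes -o jsonpath='...' | nodes_not_ready
```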

Retrieve the Cluster Proxy Settings

Terminal window
oc get proxy/cluster -o json | jq -r '
"export HTTP_PROXY=\(.spec.httpProxy)
export HTTPS_PROXY=\(.spec.httpsProxy)
export NO_PROXY=\"\(.spec.noProxy)\"
export http_proxy=\(.spec.httpProxy)
export https_proxy=\(.spec.httpsProxy)
export no_proxy=\"\(.spec.noProxy)\""
'

Set disableCopiedCSVs to true in the OLMConfig

Terminal window
# https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/operators/administrator-tasks#olm-disabling-copied-csvs_olm-config
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  name: cluster
spec:
  features:
    disableCopiedCSVs: true
EOF

ETCD

Check the etcd status:

Terminal window
export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd -o jsonpath='{.items[0].metadata.name}')
export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd --field-selector="status.phase==Running" -o jsonpath="{.items[0].metadata.name}")
oc exec -n openshift-etcd -c etcd ${ETCD_POD_NAME} -- etcdctl member list -w table
oc exec -n openshift-etcd -c etcd ${ETCD_POD_NAME} -- etcdctl endpoint health --cluster -w table
oc exec -n openshift-etcd -c etcd ${ETCD_POD_NAME} -- etcdctl endpoint status --cluster -w table
oc exec -n openshift-etcd -c etcd ${ETCD_POD_NAME} -- etcdctl endpoint status --cluster -w json | jq '.[] | ((.Status.dbSize - .Status.dbSizeInUse)/.Status.dbSize)*100'
oc exec -n openshift-etcd -c etcd $ETCD_POD_NAME -- etcdctl alarm list
oc exec -n openshift-etcd -c etcd $ETCD_POD_NAME -- etcdctl defrag
for i in `oc get pods -n openshift-etcd | egrep -v "NAME|guard|Succeeded" | awk '{ print $1 }'`; do echo "-- $i"; oc logs $i -c etcd -n openshift-etcd 2>&1 | awk -v min=999 'function norm(p){split($0,a,",");gsub("[tok:\"]","",a[p]);if (a[p] ~ ".*[0-9]s")a[p]*=1000; return a[p]*=1} {if (NR==1) start=$1} /took too long/ {b=norm(5); if (tmin==0) tmin=b; if (b<tmin) tmin=b; if (b>tmax) tmax=b; tavg+=b; t++} /context deadline exceeded/ {d++} /finished scheduled compaction/ {b=norm(6); if (b<min) min=b; if (b>max) max=b; avg+=b; c++} ENDFILE{end=$1} END{if (t==0) t--; printf " Log range:\t\t%s - %s\n took too long:\ttotal %d - min %d - max %d - avg %d\n deadline exceeded:\t%d\n compaction times:\ttotal %d - min %d - max %d - avg %d\n",start,end,t,tmin,tmax,tavg/t,d,c,min,max,avg/c}'; done
oc logs -n openshift-etcd -c etcd $ETCD_POD_NAME --tail=500 | egrep -i 'fsync|slow|leader|timeout|alarm'
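
The `jq` expression above reports, per member, the share of the database file that is allocated but not in use: `(dbSize - dbSizeInUse) / dbSize * 100`. The sketch below does the same arithmetic for two byte counts. `etcd_frag_pct` is a hypothetical helper name.

```shell
# Print the fragmentation percentage for a given dbSize and dbSizeInUse (bytes).
etcd_frag_pct() {
  awk -v size="$1" -v used="$2" 'BEGIN {
    if (size <= 0) { print "0.00"; exit }   # avoid dividing by zero
    printf "%.2f\n", (size - used) / size * 100
  }'
}

# Example: a 1000-byte file with 750 bytes in use
#   etcd_frag_pct 1000 750
```

A high percentage is usually read as a hint that `etcdctl defrag` may reclaim space. The right threshold depends on the cluster.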

Diagnostic Steps:

Terminal window
oc get pod -n openshift-etcd
oc logs etcd-XYZ-master-0 -c etcd -n openshift-etcd
oc rsh -n openshift-etcd <etcd pod>
# From inside the container, run the commands below:
etcdctl member list -w table
etcdctl endpoint health --cluster
etcdctl endpoint status -w table
# If the oc command does not work, connect to the node over SSH and run:
crictl logs $(crictl ps -aql --label "io.kubernetes.container.name=etcd-member")
crictl logs --since 48h $(crictl ps -aql --label "io.kubernetes.container.name=etcd-member")

Collect metrics:

Terminal window
mkdir etcd-metrics
for etcd_pod in `oc get pods -l k8s-app=etcd -n openshift-etcd -o jsonpath='{.items[*].metadata.name}'`; do oc exec -it $etcd_pod -n "openshift-etcd" -c "etcdctl" -- sh -c 'curl --cert $ETCDCTL_CERT --key $ETCDCTL_KEY --cacert $ETCDCTL_CACERT https://localhost:2379/metrics' &> etcd-metrics/${etcd_pod}_metrics.txt;done

Check the etcd objects:

Terminal window
export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd --field-selector="status.phase==Running" -o jsonpath="{.items[0].metadata.name}")
oc exec -n openshift-etcd -c etcdctl ${ETCD_POD_NAME} -- sh -c "etcdctl get / --prefix --keys-only | grep -oE '^/[a-z|.]+/[a-z|.|8]*' | sort | uniq -c | sort -rn" | while read KEY; do printf "$KEY\t" && oc exec -n openshift-etcd ${ETCD_POD_NAME} -c etcdctl -- etcdctl get ${KEY##* } --prefix --write-out=json | jq '[.kvs[].value | length] | add ' | numfmt --to=iec ; done | sort -k3 -hr | column -t

Check the number of etcd objects:

Terminal window
oc project openshift-etcd
oc get po
oc rsh etcd-pod-name
sh-5.1# etcdctl get / --prefix --keys-only | sed '/^$/d' | cut -d/ -f3 | sort | uniq -c | sort -rn
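
The pipeline groups keys by the third path segment, which for keys shaped like `/kubernetes.io/pods/...` is the resource type. The sketch below runs the same counting logic on any key list read from stdin, so it can be tried without a cluster. `count_by_resource` is a hypothetical helper name.

```shell
# Count etcd keys per resource type (third "/"-separated field), most frequent first.
count_by_resource() {
  sed '/^$/d' | cut -d/ -f3 | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage inside the etcd container (not executed here):
#   etcdctl get / --prefix --keys-only | count_by_resource
```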

Backup ETCD shell:

# Example crontab entry:
# 0 0 * * * /usr/local/bin/etcd_backup.sh GCP-PRD 172.26.3.13 >> /home/ocp/backup-etcd/etcd_backup.log 2>&1
cat <<EOF > backup_script.sh
#!/bin/bash
# Usage: ./backup_script.sh <Cluster Name> <Master IP>
if [ "\$#" -ne 2 ]; then
  echo "Usage: \$0 <Cluster Name> <Master IP>"
  exit 1
fi
CLUSTER_NAME=\$1
MASTER_IP=\$2
BACKUP_PATH="/root/backup-etcd/\${CLUSTER_NAME}"
/bin/echo [\$(date +"%F %T")] Starting \${CLUSTER_NAME} Backup... &>> /var/log/\${CLUSTER_NAME}-backup.log
/bin/ssh -i /root/.ssh/ocp-acmac core@\${MASTER_IP} '/bin/sudo /usr/local/bin/cluster-backup.sh /home/core/backup && /bin/sudo /bin/find /home/core/backup -mtime +5 -delete && /bin/sudo /bin/chown -vR core:core /home/core/backup'
/bin/rsync -av --delete -e "/bin/ssh -i /root/.ssh/ocp-acmac" core@\${MASTER_IP}:/home/core/backup \${BACKUP_PATH} &>> /var/log/\${CLUSTER_NAME}-backup.log
/bin/echo [\$(date +"%F %T")] Terminated \${CLUSTER_NAME} Backup. &>> /var/log/\${CLUSTER_NAME}-backup.log
EOF

Backup ETCD cronjob:

Terminal window
apiVersion: v1
kind: Namespace
metadata:
  name: ocp-backup-etcd
  labels:
    app: openshift-backup
  annotations:
    openshift.io/node-selector: ''
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: openshift-backup
  namespace: ocp-backup-etcd
  labels:
    app: openshift-backup
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-etcd-backup
  labels:
    app: openshift-backup
rules:
  - apiGroups: [""]
    resources:
      - "nodes"
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "pods/log"
    verbs: ["get", "list", "create", "delete", "watch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openshift-backup
  labels:
    app: openshift-backup
subjects:
  - kind: ServiceAccount
    name: openshift-backup
    namespace: ocp-backup-etcd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-etcd-backup
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openshift-backup-privileged
  namespace: ocp-backup-etcd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
  - kind: ServiceAccount
    name: openshift-backup
    namespace: ocp-backup-etcd
---
kind: CronJob
apiVersion: batch/v1
metadata:
  name: openshift-backup
  namespace: ocp-backup-etcd
  labels:
    app: openshift-backup
spec:
  schedule: "56 23 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  jobTemplate:
    metadata:
      labels:
        app: openshift-backup
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            app: openshift-backup
        spec:
          containers:
            - name: backup
              image: "registry.redhat.io/openshift4/ose-cli:v4.10"
              command:
                - "/bin/bash"
                - "-c"
                - oc get no -l node-role.kubernetes.io/master --no-headers -o name | xargs -I {} -- oc debug {} -- bash -c 'chroot /host sudo -E /usr/local/bin/cluster-backup.sh /home/core/backup/ && chroot /host sudo -E find /home/core/backup/ -type f -ctime +"2" -delete'
          restartPolicy: "Never"
          terminationGracePeriodSeconds: 30
          activeDeadlineSeconds: 600
          dnsPolicy: "ClusterFirst"
          serviceAccountName: "openshift-backup"
          serviceAccount: "openshift-backup"

API Server (correlation with the error budget):

Terminal window
oc -n openshift-kube-apiserver get pods
oc -n openshift-kube-apiserver logs <pod-apiserver> --tail=500 | egrep -i 'slow|etcd|timeout'

Security

Create a secret from the CLI

Terminal window
oc create secret generic oia-secret --from-literal=username=myuser \
--from-literal=password=mypassword

Use secret in deployment env

Terminal window
oc set env deployment/<deployment_name> --from secret/oia-secret

Mount the Secret on a Volume

Terminal window
oc set volumes dc/myapp --add --name=secret-volume --mount-path=/opt/app-root/ \
--secret-name=oia-secret

List Istio Authorization Policies details (extract to csv)

Terminal window
(echo "Namespace,Name,Action,Principals,Namespaces,Paths"; oc get authorizationpolicies.security.istio.io --all-namespaces -o json | jq -r '.items[] | [.metadata.namespace, .metadata.name, .spec.action // "N/A", (.spec.rules[]?.from[]?.source.principals[]? // "N/A"), (if (.spec.rules[]?.from[]?.source.namespaces | type) == "array" then (.spec.rules[]?.from[]?.source.namespaces | join(",")) else .spec.rules[]?.from[]?.source.namespaces end // "N/A"), (if (.spec.rules[]?.to[]?.operation.paths | type) == "array" then (.spec.rules[]?.to[]?.operation.paths | join(",")) else .spec.rules[]?.to[]?.operation.paths end // "N/A")] | @csv') > authorizationpolicies.csv

Certificates

Sign all pending Certificate Signing Requests (CSRs)

Terminal window
oc get csr -o name | xargs oc adm certificate approve

Authenticate users using TLS certificates

  1. Generate a private key and CSR:
    Terminal window
    mkdir ${OCP_USERNAME}
    openssl req -new -nodes -subj "/CN=${OCP_USERNAME}" \
    -keyout ${OCP_USERNAME}/private.key -out ${OCP_USERNAME}/request.csr
  2. Create a CertificateSigningRequest:
    Terminal window
    cat <<EOF | oc apply -f -
    apiVersion: certificates.k8s.io/v1
    kind: CertificateSigningRequest
    metadata:
      name: tls-auth-${OCP_USERNAME}
    spec:
      signerName: "kubernetes.io/kube-apiserver-client"
      request: $(cat ${OCP_USERNAME}/request.csr | base64 | tr -d '\n')
      usages:
        - digital signature
        - key encipherment
        - client auth
    EOF
  3. Approve the CSR:
    Terminal window
    oc adm certificate approve tls-auth-${OCP_USERNAME}

API

API Resources

List all API resources:

Terminal window
oc api-resources

API resources per API group

Terminal window
oc api-resources --api-group config.openshift.io -o name
oc api-resources --api-group machineconfiguration.openshift.io -o name

Explain resources

Explain resource details:

Terminal window
oc explain pods.spec.containers

For a specific API group:

Terminal window
oc explain --api-version=config.openshift.io/v1 scheduler

Miscellaneous Commands

Manage node state

Terminal window
# oc adm manage-node --schedulable=false is the legacy OpenShift 3.x form; on OpenShift 4 use cordon/uncordon
oc adm cordon <node>

Get VSphere config

Terminal window
oc get cm cloud-provider-config -o json -n openshift-config | jq -r .data.config

List installed operators

Terminal window
oc get csv

Export resources to a YAML file

Terminal window
# oc export is not available in OpenShift 4; oc get -o yaml is the replacement
oc get is,bc,dc,svc -o yaml > app.yaml

Show user in prompt

Terminal window
function ps1(){
export PS1='[\u@\h($(oc whoami -c 2>/dev/null|cut -d/ -f3,1)) \W]\$ '
}

Backup OpenShift objects

Terminal window
oc get all --all-namespaces --no-headers=true | awk '{print $1","$2}' | while read obj; do
  NS=$(echo $obj | cut -d, -f1)
  OBJ=$(echo $obj | cut -d, -f2)
  FILE=$(echo $obj | sed 's/\//-/g;s/,/-/g')
  echo $NS $OBJ $FILE
  # oc export is not available in OpenShift 4, so dump each object with oc get
  oc get -n $NS $OBJ -o yaml > $FILE.yml
done

Show machine-config-controller logs

Terminal window
oc logs -n openshift-machine-config-operator $(oc get pod -n openshift-machine-config-operator -o name | grep controller)

Operator stuck in “Unknown Failure” while upgrading in RHOCP 4

Terminal window
oc delete pods -l 'app in (catalog-operator, olm-operator)' -n openshift-operator-lifecycle-manager
oc rollout restart deployment.apps/catalog-operator deployment.apps/olm-operator -n openshift-operator-lifecycle-manager
for sub in $(oc get subs -n openshift-storage -o json | jq '.items[] | select((.metadata.annotations."olm.generated-by" | .!= null) and (.status.installplan==null)) | .metadata.name' -r); do oc patch subs -n openshift-storage $sub --type json -p '[{"op":"remove", "path":"/metadata/annotations/olm.generated-by"}]'; done;
oc delete pod -l olm.catalogSource=redhat-operators -n openshift-marketplace
oc delete pod -l app=catalog-operator -n openshift-operator-lifecycle-manager
oc patch sub ${SUBSCRIPTION} -n ${PROJECT} --subresource=status --type json -p '[{"op":"remove","path":"/status/conditions"}]'

Operator Upgrade Not Progressing [https://access.redhat.com/solutions/7020921]

Terminal window
for OPERATOR in ocs-operator mcg-operator odf-operator odf-csi-addons-operator cephcsi-operator ocs-client-operator odf-prometheus-operator rook-ceph-operator recipe odf-dependencies; do export OPERATOR; oc get job -n openshift-marketplace -o json | jq -r '.items[] | select(.spec.template.spec.containers[].env[].value|contains (env.OPERATOR)) | .metadata.name' >> /tmp/jobs; done
cat /tmp/jobs   # example output below; a real environment may list many more
# 6d97dfcfa4d148a766632d834e1ebbd6fa245631f49e8243eb42ff596722969
# 6f70c8b65e5a693e11613dd966e9a37bb81e3324323c2dfe14badc99e71077e
for i in `cat /tmp/jobs`; do oc delete job $i -n openshift-marketplace; oc delete configmap $i -n openshift-marketplace; done
oc delete installplans -n openshift-storage --all
oc delete subs odf-operator -n openshift-storage
oc get subs -n openshift-storage
for i in $(oc get csv -n openshift-storage -o name | grep rhodf); do oc delete $i -n openshift-storage; done
oc get catalogsource -n openshift-marketplace|grep redhat-operators
oc delete pods -l 'app in (catalog-operator, olm-operator)' -n openshift-operator-lifecycle-manager
cat <<EOF > subscription.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: "stable-4.14" # <-- Adjust the channel to the ODF version being installed and keep it compatible with the OCP version
  installPlanApproval: Automatic
  name: odf-operator
  source: redhat-operators # <-- Change this if the redhat-operators catalog source does not use the default name
  sourceNamespace: openshift-marketplace
EOF
oc apply -f subscription.yaml

Retrieve MachineNetwork, Pod CIDR, Service CIDR

Terminal window
echo -n "Pod CIDR (clusterNetwork): " ; oc get network.config.openshift.io cluster -o jsonpath='{.spec.clusterNetwork[*].cidr}{"\n"}'
echo -n "Service CIDR (serviceNetwork): " ; oc get network.config.openshift.io cluster -o jsonpath='{.spec.serviceNetwork[*]}{"\n"}'
echo -n "Machine Network: " ; oc get infrastructure.config.openshift.io cluster -o jsonpath='{.status.platformStatus.vsphere.machineNetworks}{"\n"}'
# On GCP, list the cluster instances with their network, subnetwork and IP:
gcloud compute instances list \
  --project gcp-prj-ocp-srv-prd-001 \
  --filter="name~'^ocp-prd-f5ckt-'" \
  --format="table(name,zone,networkInterfaces[0].network,networkInterfaces[0].subnetwork,networkInterfaces[0].networkIP)"

ODF


Script to patch CephTools

Terminal window
# Run a single Ceph command through the toolbox deployment
oc exec -n openshift-storage deployment/rook-ceph-tools -- ceph status
# Useful commands to run from inside the toolbox pod
ceph status
ceph osd status
ceph osd pool ls
ceph df
rados df
ceph health detail
ceph versions
ceph config dump
ceph osd df tree
ceph osd pool ls detail
ceph osd dump
ceph pg dump
ceph report
ceph osd pool autoscale-status
ceph osd crush dump
#!/bin/bash
# Script: enable the Ceph toolbox and connect to it, or disable it when called with "off"
if [ "$1" == "off" ]; then
  oc patch OCSInitialization/ocsinit -n openshift-storage \
    --type=merge -p='{"spec":{ "enableCephTools": false}}'
  sleep 3
  echo "removing any existing toolbox pod"
  oc delete pods -n openshift-storage -l app=rook-ceph-tools
else
  oc patch OCSInitialization/ocsinit -n openshift-storage \
    --type=merge -p='{"spec":{ "enableCephTools": true}}'
  TOOLS_POD=""
  echo -n "waiting for ceph tools pod to schedule "
  until [ -n "$TOOLS_POD" ]; do
    echo -n "."
    sleep 5
    TOOLS_POD=$(oc get pod -n openshift-storage -l app=rook-ceph-tools -o name)
  done
  echo "$TOOLS_POD"
  echo "waiting for ceph tools pod to startup"
  oc wait $TOOLS_POD --for=condition=Ready --timeout=300s -n openshift-storage
  echo "connecting to ceph toolbox"
  oc rsh -n openshift-storage $TOOLS_POD
fi

Ceph Status

Terminal window
oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph status -c /var/lib/rook/openshift-storage/openshift-storage.config

Ceph Time Sync

Terminal window
oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph time-sync-status -c /var/lib/rook/openshift-storage/openshift-storage.config

StorageCluster Status

Terminal window
oc get storagecluster -n openshift-storage

NooBaa: check object count and size

Terminal window
radosgw-admin bucket stats | jq -r '
.[] | "\(.bucket) objs=\(.usage["rgw.main"].num_objects) sizeGB=\(.usage["rgw.main"].size_kb/1024/1024|floor)"'

Check bucket status.

Terminal window
oc get ob -o custom-columns=NAME:.metadata.name,BUCKET_NAME:.spec.endpoint.bucketName,STORAGE-CLASS:.spec.storageClassName,PHASE:.status.phase
# https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.12/html-single/managing_hybrid_and_multicloud_resources/index#accessing-the-Multicloud-object-gateway-from-the-mcg-command-line-interface_rhodf
noobaa bucket status {bucket_name}