How to Install IBM Cloud Pak for Data 3.5 on OpenShift on IBM Power Systems
Hybrid Cloud — featuring IBM Power Systems, IBM Cloud Pak for Data, and Red Hat OpenShift Container Platform offerings
Vergie Hadiana, Solution Specialist Hybrid Cloud — Sinergi Wahana Gemilang
Intro
IBM® Cloud Pak® for Data unifies and simplifies the collection, organization, and analysis of data. Enterprises can turn data into insights through an integrated cloud-native architecture. IBM Cloud Pak for Data is extensible and can be customized to a client’s unique data and AI landscape through an integrated catalog of IBM, open source, and third-party microservice add-ons.
This tutorial shows how to perform an online installation of Cloud Pak for Data 3.5 on an IBM Power Systems™ Virtual Machine, along with some of the services needed to use the Cloud Pak for Data industry accelerators.
Prerequisites
1. A Red Hat® OpenShift® Container Platform environment installed on an IBM Power Systems Virtual Machine.
If you don't have an OpenShift environment on IBM Power Systems, see my earlier post.
It is assumed that you already have it installed, have access to it, and have the credentials for kubeadmin (the OpenShift cluster administrator).
2. A local repository created on persistent storage, and a Network File System (NFS) storage class where the NFS export has the no_root_squash property set. See my earlier post on how to set this up.
3. Familiarity with the Linux® command line and at least a basic understanding of Red Hat OpenShift.
4. The wget and oc clients installed and on your PATH.
Estimated time: Expect the installation of IBM Cloud Pak for Data on an IBM Power Systems Virtual Machine to take around 2 to 3 hours.
The lengthy duration is because the software is installed from repositories on the internet.
Pre-installation tasks for IBM Cloud Pak for Data
1. Log in as the root user on the bastion node.
2. Verify that the NFS export has the no_root_squash property set. Restart the NFS server if it was changed.
cat /etc/exports
# /export *(rw,sync,no_root_squash)
systemctl restart nfs-server
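If you want to script this check rather than eyeball it, a minimal sketch follows. It runs against a sample copy of /etc/exports (the /tmp path and sample export line are placeholders); point EXPORTS at the real file on your bastion node:

```shell
# Check that the export line carries no_root_squash; warn if it does not.
# EXPORTS points at a sample copy here - use /etc/exports on the bastion.
EXPORTS=/tmp/exports.sample
printf '/export *(rw,sync,no_root_squash)\n' > "$EXPORTS"
if grep -q 'no_root_squash' "$EXPORTS"; then
  RESULT="ok"
else
  RESULT="no_root_squash missing - fix the export and restart nfs-server"
fi
echo "$RESULT"
```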
3. Verify that the I/O performance of the NFS export meets the requirements. The value from the first dd command (disk latency) should be equal to or better than 2.5 MBps. The value from the second dd command (disk throughput) should be equal to or better than 209 MBps.
BASTION_IP=$(nslookup $(hostname -s) | tail -n2 | head -1 | awk '{print $2}')
NODE=$(oc get nodes | grep Ready | grep worker | head -1 | awk '{print $1}')
cat <<EOF > /tmp/verify_disk.sh
mkdir -p /mnt/export
mount -t nfs ${BASTION_IP}:/export /mnt/export
echo "Verifying disk latency of NFS share - should be equal or better than 2.5 MB/s"
dd if=/dev/zero of=/mnt/export/testfile bs=4096 count=1000 oflag=dsync
echo "Verifying disk throughput of NFS share - should be equal or better than 209 MB/s"
dd if=/dev/zero of=/mnt/export/testfile bs=1G count=1 oflag=dsync
rm /mnt/export/testfile; umount /mnt/export; rm -rf /mnt/export
echo "Cf. https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.5.0/cpd/plan/rhos-reqs.html#rhos-reqs__disk"
echo "Done."
EOF
scp /tmp/verify_disk.sh core@${NODE}:/tmp
ssh core@${NODE} "sudo sh /tmp/verify_disk.sh; rm /tmp/verify_disk.sh"
rm /tmp/verify_disk.sh
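If you prefer a rough self-contained probe over reading dd's own statistics, the sketch below (assuming GNU coreutils) times a fixed 100 MB synchronous write and reports MB/s. The default target under /tmp is only for illustration; point TARGET at a file on the NFS mount to actually test the share:

```shell
#!/bin/sh
# Rough throughput probe: time a 100 MB synchronous write and report MB/s.
# The /tmp default is a placeholder - test the NFS mount, not local disk.
TARGET=${1:-/tmp}/throughput_testfile
START=$(date +%s%N)                      # GNU date: nanosecond timestamp
dd if=/dev/zero of="$TARGET" bs=1M count=100 oflag=dsync 2>/dev/null
END=$(date +%s%N)
rm -f "$TARGET"
# 100 MB written over the elapsed nanoseconds
MBPS=$(( 100 * 1000000000 / (END - START) ))
echo "throughput: ${MBPS} MB/s (Cloud Pak for Data wants >= 209 MB/s)"
```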
4. Log in to your Kubernetes cluster using the kubeadmin login and password.
oc login https://api.<ClusterName>.<Domain>:6443
# Authentication required for https://api.<ClusterName>.<Domain>:6443 (openshift)
# Username: kubeadmin
# Password:
5. Expose the internal OpenShift image registry (if not done earlier).
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge -p '{"spec":{"defaultRoute":true}}'
6. Use the kernel.yaml file to apply kernel tuning parameters.
Note: These settings are for worker nodes with 64 GB of RAM. Refer to the following documentation to understand how to adapt them: https://www.ibm.com/docs/en/cloud-paks/cp-data/3.5.0?topic=tasks-changing-required-node-settings#node-settings__kernel
oc apply -f kernel.yaml
# OR
oc apply -f https://s3.us.cloud-object-storage.appdomain.cloud/developer/default/tutorials/installation-of-cloud-pak-on-ocp-on-powervs/static/kernel.yaml
7. Use the smt_crio_slub.yaml file to make sure that the required OpenShift Container Platform configuration is applied.
Download the smt_crio_slub.yaml file.
oc apply -f smt_crio_slub.yaml
# OR
oc apply -f https://s3.us.cloud-object-storage.appdomain.cloud/developer/default/tutorials/installation-of-cloud-pak-on-ocp-on-powervs/static/smt_crio_slub.yaml
8. Verify that the smt_crio_slub.yaml changes have been applied. Wait until all worker nodes have been updated, that is, until the status of the worker pool shows UPDATED=True, UPDATING=False, and DEGRADED=False. This can take up to 30 minutes as the worker nodes are rebooted.
oc get mcp
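Rather than re-running oc get mcp by hand, you can let the oc client block until the worker pool reports Updated. The command below is only printed so you can review it first; the 45-minute timeout is an assumption sized to the 30-minute estimate above:

```shell
# Generate the wait command; run it (or paste it) once you are happy with it.
WAIT_CMD='oc wait mcp/worker --for=condition=Updated --timeout=45m'
echo "$WAIT_CMD"
```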
Installing IBM Cloud Pak for Data
1. Download the Cloud Pak for Data installation utility from the public IBM GitHub repository.
wget https://github.com/IBM/cpd-cli/releases/download/v3.5.6/cpd-cli-ppc64le-EE-3.5.6.tgz
2. Extract the cpd-cli-ppc64le-EE-3.5.6.tgz package.
tar -xvf cpd-cli-ppc64le-EE-3.5.6.tgz
3. Using your preferred text editor, update the apikey entry in the repo.yaml file that was extracted, using the API key you acquired with your IBM ID from the IBM container library.
---
fileservers:
  -
    url: "https://raw.github.com/IBM/cloud-pak/master/repo/cpd/3.5"
registry:
  -
    url: cp.icr.io/cp/cpd
    name: base-registry
    namespace: ""
    username: cp
    apikey: <enter_api_key>
Replace <enter_api_key> with your entitlement key (apikey).
You can get your entitlement key from https://myibm.ibm.com/products-services/containerlibrary
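If you'd rather patch the file non-interactively, a sed sketch follows. It demonstrates on a sample copy under /tmp (the file path and key value are placeholders); the same sed line can be run against your real repo.yaml:

```shell
# Demonstrate the sed edit on a sample copy of repo.yaml.
SAMPLE=/tmp/repo-sample.yaml
cat > "$SAMPLE" <<'EOF'
registry:
  -
    url: cp.icr.io/cp/cpd
    name: base-registry
    username: cp
    apikey: <enter_api_key>
EOF
APIKEY="example-entitlement-key"   # placeholder - paste your own key here
# Replace whatever follows 'apikey:' with the key, keeping the indentation.
sed -i "s|^\( *apikey:\).*|\1 ${APIKEY}|" "$SAMPLE"
grep 'apikey:' "$SAMPLE"
```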
4. Create a new project called zen1, or any name you like.
⚠️ Make sure to update the commands below if your namespace or project name is different.
oc new-project zen1
5. Install IBM Cloud Pak for Data Control Plane (lite).
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly lite --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a lite --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Installing the IBM Cloud Pak for Data Services
The services supported on Cloud Pak for Data for each cluster architecture (x86, POWER (ppc64le), Z (s390x)), along with the minimum required resources, are listed at:
https://www.ibm.com/docs/en/cloud-paks/cp-data/3.5.0?topic=requirements-system-services
🚨 Each service installation can take some time (depending on your internet speed and your OpenShift cluster environment), so go get some coffee or grab some lunch.
Template services installation commands
./cpd-cli adm --assembly <assembly-services> --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a <assembly-services> --arch ppc64le -c <storage-class-name> -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
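Since steps 6 through 10 below all follow this same adm/install pattern, the command pairs can be generated from a list. This sketch only writes them to a script you can review before running (namespace zen1 and storage class nfs match this tutorial; adjust them to your environment):

```shell
# Generate the adm/install command pairs for each service assembly.
OUT=/tmp/install_services.sh
: > "$OUT"
for assembly in wsl wml spark rstudio runtime-addon-r36; do
  echo "./cpd-cli adm --assembly $assembly --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses" >> "$OUT"
  echo "./cpd-cli install -a $assembly --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses" >> "$OUT"
done
cat "$OUT"   # review first, then run with: sh /tmp/install_services.sh
```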
6. Install Watson® Studio Local (wsl) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly wsl --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a wsl --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Data Refinery and Watson® Knowledge Catalog are installed automatically when you install Watson Studio on POWER.
7. Install Watson® Machine Learning (wml) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly wml --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a wml --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
8. Install Analytics Engine powered by Apache Spark (Spark) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly spark --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a spark --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
9. Install RStudio on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly rstudio --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a rstudio --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
10. Install R 3.6 runtime add-on on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly runtime-addon-r36 --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a runtime-addon-r36 --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Accessing the IBM Cloud Pak for Data Web Console
11. Direct your browser to the Cloud Pak for Data web console running at https://{{ cluster.namespace }}-cpd-{{ cluster.namespace }}.apps.{{ dns.clusterid }}.{{ dns.domain }}
The default admin user ID and password are:
User ID: admin
Password: password
# e.g., in my case the web console URL looks like this:
https://zen1-cpd-zen1.apps.p1214.cecc.ihost.com
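The URL is assembled from your namespace and your cluster's apps domain; a tiny sketch of the pattern (the values below are the examples used in this tutorial, not defaults):

```shell
# Build the console URL from the namespace and the cluster's apps domain.
NAMESPACE=zen1
CLUSTER_DOMAIN=p1214.cecc.ihost.com   # example values from this tutorial
CPD_URL="https://${NAMESPACE}-cpd-${NAMESPACE}.apps.${CLUSTER_DOMAIN}"
echo "$CPD_URL"
```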
Congrats! You have installed IBM Cloud Pak for Data on your OpenShift cluster. Your installation is complete!