How to Install IBM Cloud Pak for Data 3.5 on OpenShift on IBM Power Systems
Hybrid Cloud — featuring IBM Power Systems, IBM Cloud Pak for Data, and Red Hat OpenShift Container Platform offerings
Vergie Hadiana, Solution Specialist Hybrid Cloud — Sinergi Wahana Gemilang
Intro
IBM® Cloud Pak® for Data unifies and simplifies the collection, organization, and analysis of data. Enterprises can turn data into insights through an integrated cloud-native architecture. IBM Cloud Pak for Data is extensible and can be customized to a client’s unique data and AI landscape through an integrated catalog of IBM, open source, and third-party microservice add-ons.
This tutorial shows how to perform an online installation of Cloud Pak for Data 3.5 on an IBM Power Systems™ Virtual Machine, along with some of the services needed to use the Cloud Pak for Data industry accelerators.
Prerequisites
1. A Red Hat® OpenShift® Container Platform environment installed on an IBM Power Systems Virtual Machine.
If you don't have an OpenShift environment on IBM Power Systems, see my earlier post.
It is assumed that you already have it installed, have access to it, and have the credentials for kubeadmin (the OpenShift cluster administrator).
2. A local repository created on persistent storage, and a Network File System (NFS) storage class where the NFS export has the no_root_squash property set. See my earlier post on how to set this up.
3. Familiarity with the Linux® command line and at least a basic understanding of Red Hat OpenShift.
4. The wget and oc clients installed and on your PATH.
Estimated time: Expect the installation of IBM Cloud Pak for Data on an IBM Power Systems Virtual Machine to take around 2 to 3 hours.
The lengthy duration is because the software is installed from repositories on the internet.
Pre-installation tasks for IBM Cloud Pak for Data
1. Log in as the root user on the bastion node.
2. Verify that the NFS export has the no_root_squash property set. Restart the NFS server if it was changed.
cat /etc/exports
# /export *(rw,sync,no_root_squash)
systemctl restart nfs-server
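If you want to script this check rather than eyeball it, a minimal sketch follows. It runs against a sample copy of /etc/exports (the /tmp path and sample export line are placeholders); point EXPORTS at the real file on your bastion node:

```shell
# Check that the export line carries no_root_squash; warn if it does not.
# EXPORTS points at a sample copy here - use /etc/exports on the bastion.
EXPORTS=/tmp/exports.sample
printf '/export *(rw,sync,no_root_squash)\n' > "$EXPORTS"
if grep -q 'no_root_squash' "$EXPORTS"; then
  RESULT="ok"
else
  RESULT="no_root_squash missing - fix the export and restart nfs-server"
fi
echo "$RESULT"
```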
3. Verify that the I/O performance of the NFS export meets the requirements. The value from the first dd command (disk latency) should be equal to or better than 2.5 MBps. The value from the second dd command (disk throughput) should be equal to or better than 209 MBps.
BASTION_IP=$(nslookup $(hostname -s) | tail -n2 | head -1 | awk '{print $2}')
NODE=$(oc get nodes | grep Ready | grep worker | head -1 | awk '{print $1}')
cat <<EOF > /tmp/verify_disk.sh
mkdir -p /mnt/export
mount -t nfs ${BASTION_IP}:/export /mnt/export
echo "Verifying disk latency of NFS share - should be equal or better than 2.5 MB/s"
dd if=/dev/zero of=/mnt/export/testfile bs=4096 count=1000 oflag=dsync
echo "Verifying disk throughput of NFS share - should be equal or better than 209 MB/s"
dd if=/dev/zero of=/mnt/export/testfile bs=1G count=1 oflag=dsync
rm /mnt/export/testfile; umount /mnt/export; rm -rf /mnt/export
echo "Cf. https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.5.0/cpd/plan/rhos-reqs.html#rhos-reqs__disk"
echo "Done."
EOF
scp /tmp/verify_disk.sh core@${NODE}:/tmp
ssh core@${NODE} "sudo sh /tmp/verify_disk.sh; rm /tmp/verify_disk.sh"
rm /tmp/verify_disk.sh
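If you prefer a rough self-contained probe over reading dd's own statistics, the sketch below (assuming GNU coreutils) times a fixed 100 MB synchronous write and reports MB/s. The default target under /tmp is only for illustration; point TARGET at a file on the NFS mount to actually test the share:

```shell
#!/bin/sh
# Rough throughput probe: time a 100 MB synchronous write and report MB/s.
# The /tmp default is a placeholder - test the NFS mount, not local disk.
TARGET=${1:-/tmp}/throughput_testfile
START=$(date +%s%N)                      # GNU date: nanosecond timestamp
dd if=/dev/zero of="$TARGET" bs=1M count=100 oflag=dsync 2>/dev/null
END=$(date +%s%N)
rm -f "$TARGET"
# 100 MB written over the elapsed nanoseconds
MBPS=$(( 100 * 1000000000 / (END - START) ))
echo "throughput: ${MBPS} MB/s (Cloud Pak for Data wants >= 209 MB/s)"
```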
4. Log in to your Kubernetes cluster using the kubeadmin login and password.
oc login https://api.<ClusterName>.<Domain>:6443
# Authentication required for https://api.<ClusterName>.<Domain>:6443 (openshift)
# Username: kubeadmin
# Password:
5. Expose the internal OpenShift image registry (if not done earlier).
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge -p '{"spec":{"defaultRoute":true}}'
6. Use the kernel.yaml file to apply kernel tuning parameters.
Note: These settings are for worker nodes with 64 GB of RAM. Refer to the following documentation to understand how to adapt them: https://www.ibm.com/docs/en/cloud-paks/cp-data/3.5.0?topic=tasks-changing-required-node-settings#node-settings__kernel
oc apply -f kernel.yaml
# OR
oc apply -f https://s3.us.cloud-object-storage.appdomain.cloud/developer/default/tutorials/installation-of-cloud-pak-on-ocp-on-powervs/static/kernel.yaml
7. Use the smt_crio_slub.yaml file to make sure that the required OpenShift Container Platform configuration is applied.
Download the smt_crio_slub.yaml file.
oc apply -f smt_crio_slub.yaml
# OR
oc apply -f https://s3.us.cloud-object-storage.appdomain.cloud/developer/default/tutorials/installation-of-cloud-pak-on-ocp-on-powervs/static/smt_crio_slub.yaml
8. Verify that the smt_crio_slub.yaml changes have been applied. Wait until all worker nodes have been updated, that is, until the status of the worker pool shows UPDATED=True, UPDATING=False, and DEGRADED=False. This can take up to 30 minutes as the worker nodes are rebooted.
oc get mcp
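Rather than re-running oc get mcp by hand, you can let the oc client block until the worker pool reports Updated. The command below is only printed so you can review it first; the 45-minute timeout is an assumption sized to the 30-minute estimate above:

```shell
# Generate the wait command; run it (or paste it) once you are happy with it.
WAIT_CMD='oc wait mcp/worker --for=condition=Updated --timeout=45m'
echo "$WAIT_CMD"
```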
Installing IBM Cloud Pak for Data
1. Download the Cloud Pak for Data installation utility from the public IBM GitHub repository.
wget https://github.com/IBM/cpd-cli/releases/download/v3.5.6/cpd-cli-ppc64le-EE-3.5.6.tgz
2. Extract the cpd-cli-ppc64le-EE-3.5.6.tgz package.
tar -xvf cpd-cli-ppc64le-EE-3.5.6.tgz
3. Using your preferred text editor, update the apikey entry in the repo.yaml file that was extracted, using the API key you acquired with your IBM ID from the IBM container library.
---
fileservers:
  -
    url: "https://raw.github.com/IBM/cloud-pak/master/repo/cpd/3.5"
registry:
  -
    url: cp.icr.io/cp/cpd
    name: base-registry
    namespace: ""
    username: cp
    apikey: <enter_api_key>
Replace <enter_api_key> with your entitlement key (apikey).
You can get your entitlement key from https://myibm.ibm.com/products-services/containerlibrary
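If you'd rather patch the file non-interactively, a sed sketch follows. It demonstrates on a sample copy under /tmp (the file path and key value are placeholders); the same sed line can be run against your real repo.yaml:

```shell
# Demonstrate the sed edit on a sample copy of repo.yaml.
SAMPLE=/tmp/repo-sample.yaml
cat > "$SAMPLE" <<'EOF'
registry:
  -
    url: cp.icr.io/cp/cpd
    name: base-registry
    username: cp
    apikey: <enter_api_key>
EOF
APIKEY="example-entitlement-key"   # placeholder - paste your own key here
# Replace whatever follows 'apikey:' with the key, keeping the indentation.
sed -i "s|^\( *apikey:\).*|\1 ${APIKEY}|" "$SAMPLE"
grep 'apikey:' "$SAMPLE"
```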
4. Create a new project called zen1, or any name you like.
⚠️ Make sure to update the commands below if your namespace or project name is different.
oc new-project zen1
5. Install IBM Cloud Pak for Data Control Plane (lite).
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly lite --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a lite --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Installing the IBM Cloud Pak for Data Services
The services supported on Cloud Pak for Data for each cluster architecture (x86, POWER (ppc64le), Z (s390x)), along with the minimum required resources, are listed at:
https://www.ibm.com/docs/en/cloud-paks/cp-data/3.5.0?topic=requirements-system-services
🚨 Each service installation can take some time (depending on your internet speed and your OpenShift cluster environment), so go get some coffee or grab some lunch.
Template services installation commands
./cpd-cli adm --assembly <assembly-services> --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a <assembly-services> --arch ppc64le -c <storage-class-name> -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
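Since steps 6 through 10 below all follow this same adm/install pattern, the command pairs can be generated from a list. This sketch only writes them to a script you can review before running (namespace zen1 and storage class nfs match this tutorial; adjust them to your environment):

```shell
# Generate the adm/install command pairs for each service assembly.
OUT=/tmp/install_services.sh
: > "$OUT"
for assembly in wsl wml spark rstudio runtime-addon-r36; do
  echo "./cpd-cli adm --assembly $assembly --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses" >> "$OUT"
  echo "./cpd-cli install -a $assembly --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses" >> "$OUT"
done
cat "$OUT"   # review first, then run with: sh /tmp/install_services.sh
```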
6. Install Watson® Studio Local (wsl) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly wsl --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a wsl --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Data Refinery and Watson® Knowledge Catalog are installed automatically when you install Watson Studio on POWER.
7. Install Watson® Machine Learning (wml) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly wml --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a wml --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
8. Install Analytics Engine powered by Apache Spark (Spark) on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly spark --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a spark --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
9. Install RStudio on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly rstudio --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a rstudio --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
10. Install R 3.6 runtime add-on on POWER
In my case, I install into the zen1 namespace using the nfs storage class:
./cpd-cli adm --assembly runtime-addon-r36 --arch ppc64le --namespace zen1 -r repo.yaml --apply --latest-dependency --accept-all-licenses
./cpd-cli install -a runtime-addon-r36 --arch ppc64le -c nfs -n zen1 -r repo.yaml --latest-dependency --accept-all-licenses
Accessing the IBM Cloud Pak for Data Web Console
11. Direct your browser to the Cloud Pak for Data web console running at https://{{ cluster.namespace }}-cpd-{{ cluster.namespace }}.apps.{{ dns.clusterid }}.{{ dns.domain }}
The default admin user ID and password are:
User ID: admin
Password: password
# e.g., in my case the web console URL looks like this:
https://zen1-cpd-zen1.apps.p1214.cecc.ihost.com
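The URL is assembled from your namespace and your cluster's apps domain; a tiny sketch of the pattern (the values below are the examples used in this tutorial, not defaults):

```shell
# Build the console URL from the namespace and the cluster's apps domain.
NAMESPACE=zen1
CLUSTER_DOMAIN=p1214.cecc.ihost.com   # example values from this tutorial
CPD_URL="https://${NAMESPACE}-cpd-${NAMESPACE}.apps.${CLUSTER_DOMAIN}"
echo "$CPD_URL"
```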
Congrats! You have installed IBM Cloud Pak for Data on your OpenShift cluster. Your installation is complete!