vnfs/DAaaS/README.md

# Distributed Analytics Framework

## Pre-requisites
| Required   | Version |
|------------|---------|
| Kubernetes | 1.12.3+ |
| Docker CE  | 18.09+  |
| Helm       | >=2.12.1 and <=2.13.1 |
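
The version constraints in the table can be checked with a small helper before starting. The following is a minimal sketch, not part of the framework: the function names `version_ge` and `helm_version_ok` are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils). Extracting the version strings from each tool's output is left out.

```bash
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# helm_version_ok V: succeeds when V lies in the supported Helm range
# from the table above (>=2.12.1 and <=2.13.1).
helm_version_ok() {
  version_ge "$1" 2.12.1 && version_ge 2.13.1 "$1"
}

# Example usage (version strings typed in by hand):
#   version_ge 1.13.0 1.12.3 && echo "Kubernetes version is new enough"
#   helm_version_ok 2.13.0   && echo "Helm version is in range"
```
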
## Download Framework
```bash
git clone https://github.com/onap/demo.git
DA_WORKING_DIR=$PWD/demo/vnfs/DAaaS/deploy
```

## Install Istio Service Mesh

Istio is installed in two steps:

1. Istio-Operator
2. Istio-config

### Install both Istio charts

```bash
cd $DA_WORKING_DIR/00-init
helm install --name=istio-operator istio-operator --namespace=istio-system
cd $DA_WORKING_DIR/00-init/istio
helm install --name istio istio-instance --namespace istio-system
```

## Install Metallb to act as a Loadbalancer
```bash
cd $DA_WORKING_DIR/00-init
# NOTE: Update the IP address ranges before you install Metallb.
# NOTE: If you are using a single IP, use the <IP>/32 format.
helm install --name metallb metallb --namespace metallb-system
```
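
The single-IP note above can be illustrated with a small helper. This is only a sketch: the function name `to_cidr` is hypothetical, and it assumes that an address without a `/` or `-` is a single IP, while CIDR blocks and start-end ranges are passed through unchanged.

```bash
# to_cidr ADDR: prints ADDR in a form suitable for an address pool.
# A bare IP gets a /32 suffix; CIDR blocks (a.b.c.d/n) and
# start-end ranges (a.b.c.d-e.f.g.h) are printed as given.
to_cidr() {
  case "$1" in
    */*|*-*) printf '%s\n' "$1" ;;
    *)       printf '%s/32\n' "$1" ;;
  esac
}

# Example usage:
#   to_cidr 192.168.1.10    # prints 192.168.1.10/32
#   to_cidr 10.0.0.0/24     # prints 10.0.0.0/24
```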

## Install Rook-Ceph for Persistent Storage
Note: Although unusual, the flexvolume path can differ from the default value. values.yaml has the most common flexvolume path configured. In case of errors related to flexvolume, refer to https://rook.io/docs/rook/v0.9/flexvolume.html#configuring-the-flexvolume-path to find the appropriate flexvolume path and set it in values.yaml.
```bash
cd $DA_WORKING_DIR/00-init/rook-ceph
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```
Check the status of the pods in the rook-ceph-system and rook-ceph namespaces. Once all pods are in the Ready state, move on to the next section.

```bash
$ kubectl get pods -n rook-ceph-system
NAME                                 READY   STATUS    RESTARTS   AGE
rook-ceph-agent-9wszf                1/1     Running   0          121s
rook-ceph-agent-xnbt8                1/1     Running   0          121s
rook-ceph-operator-bc77d6d75-ltwww   1/1     Running   0          158s
rook-discover-bvj65                  1/1     Running   0          133s
rook-discover-nbfrp                  1/1     Running   0          133s
```
```bash
$ kubectl -n rook-ceph get pod
NAME                                   READY   STATUS      RESTARTS   AGE
rook-ceph-mgr-a-d9dcf5748-5s9ft        1/1     Running     0          77s
rook-ceph-mon-a-7d8f675889-nw5pl       1/1     Running     0          105s
rook-ceph-mon-b-856fdd5cb9-5h2qk       1/1     Running     0          94s
rook-ceph-mon-c-57545897fc-j576h       1/1     Running     0          85s
rook-ceph-osd-0-7cbbbf749f-j8fsd       1/1     Running     0          25s
rook-ceph-osd-1-7f67f9646d-44p7v       1/1     Running     0          25s
rook-ceph-osd-2-6cd4b776ff-v4d68       1/1     Running     0          25s
rook-ceph-osd-prepare-vx2rz            0/2     Completed   0          60s
rook-ceph-tools-5bd5cdb949-j68kk       1/1     Running     0          53s
```
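
Scanning the READY column by eye is error-prone on larger clusters. The sketch below filters `kubectl get pods` output down to the pods that are not yet ready; the function name `not_ready_pods` is hypothetical. It assumes the default column layout shown above (NAME, READY, STATUS, ...) and treats pods in the Completed state, such as `rook-ceph-osd-prepare`, as finished rather than not ready.

```bash
# not_ready_pods: reads `kubectl get pods` output on stdin and prints the
# names of pods whose READY count (x/y) is incomplete.
# The header line and pods with STATUS "Completed" are skipped.
not_ready_pods() {
  awk 'NR > 1 && $3 != "Completed" {
         split($2, ready, "/")
         if (ready[1] != ready[2]) print $1
       }'
}

# Example usage (empty output means every pod is ready):
#   kubectl get pods -n rook-ceph-system | not_ready_pods
#   kubectl get pods -n rook-ceph | not_ready_pods
```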

#### Troubleshooting Rook-Ceph installation

If rook was previously installed on your machine, successfully or not, a fresh installation of the rook operator may run into problems. The steps below clean up the leftovers.

* First, check whether any rook CRDs exist:
```
kubectl get crds | grep rook
```
If this returns results like:
```
otc@otconap7 /var/lib/rook $  kubectl get crds | grep rook
cephblockpools.ceph.rook.io         2019-07-19T18:19:05Z
cephclusters.ceph.rook.io           2019-07-19T18:19:05Z
cephfilesystems.ceph.rook.io        2019-07-19T18:19:05Z
cephobjectstores.ceph.rook.io       2019-07-19T18:19:05Z
cephobjectstoreusers.ceph.rook.io   2019-07-19T18:19:05Z
volumes.rook.io                     2019-07-19T18:19:05Z
```
then delete the existing rook CRDs by generating a manifest from the chart (run from `$DA_WORKING_DIR/00-init/rook-ceph`) and deleting the resources it defines:
```
helm template -n rook . -f values.yaml > ~/delete.yaml
kubectl delete -f ~/delete.yaml
```

After this, delete the following directory on all the nodes:
```
sudo rm -rf /var/lib/rook/
```
Now attempt the installation again:
```
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```
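
Before retrying the installation, it can help to confirm that no rook CRDs are left behind. The following is a sketch under the assumption that every rook CRD name ends in `.rook.io`, as in the sample output above; the function name `rook_crds` is hypothetical.

```bash
# rook_crds: reads `kubectl get crds` output on stdin and prints the names
# of CRDs that belong to a rook.io API group.
rook_crds() {
  awk '$1 ~ /\.rook\.io$/ { print $1 }'
}

# Example usage (empty output means the cleanup removed all rook CRDs):
#   kubectl get crds | rook_crds
```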

## Install Operator package
### Build docker images
#### collectd-operator
```bash
cd $DA_WORKING_DIR/../microservices

## Note: The image tag and repository in the Collectd-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/collectd-operator:latest
./build_image.sh collectd-operator $IMAGE_NAME
```
#### visualization-operator
```bash
cd $DA_WORKING_DIR/../microservices

## Note: The image tag and repository in the Visualization-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/visualization-operator:latest
./build_image.sh visualization-operator $IMAGE_NAME
```

### Install the Operator Package
```bash
cd $DA_WORKING_DIR/operator
helm install -n operator . -f values.yaml --namespace=operator
```
Check the status of the pods in the operator namespace and confirm that the Prometheus operator pods are in the Ready state.
```bash
kubectl get pods -n operator
NAME                                                      READY   STATUS    RESTARTS
m3db-operator-0                                           1/1     Running   0
op-etcd-operator-etcd-backup-operator-6cdc577f7d-ltgsr    1/1     Running   0
op-etcd-operator-etcd-operator-79fd99f8b7-fdc7p           1/1     Running   0
op-etcd-operator-etcd-restore-operator-855f7478bf-r7qxp   1/1     Running   0
op-prometheus-operator-operator-5c9b87965b-wjtw5          1/1     Running   1
op-sparkoperator-6cb4db884c-75rcd                         1/1     Running   0
strimzi-cluster-operator-5bffdd7b85-rlrvj                 1/1     Running   0
```
#### Troubleshooting Operator installation
Sometimes deleting a previously installed Operator package fails to remove all operator pods. To troubleshoot this, follow these steps.

1. Make sure that all other deployments or helm releases are deleted (purged). The Operator package is the baseline package for the applications, so deleting it while the applications are still running might leave the cluster in an unwanted state.

2. Delete all the resources and CRDs associated with the Operator package.
```bash
# NOTE: Use the same release name and namespace as in the installation of the Operator package in the previous section
cd $DA_WORKING_DIR/operator
helm template -n operator . -f values.yaml --namespace=operator > ../delete_operator.yaml
cd ../
kubectl delete -f delete_operator.yaml
```
## Install Collection package
Note: collectd.conf is available in the $DA_WORKING_DIR/collection/charts/collectd/resources/config directory. Any valid collectd.conf can be placed here.
```bash
# Default (for custom collectd, skip this section)
# =======
cd $DA_WORKING_DIR/collection
helm install -n cp . -f values.yaml --namespace=edge1

# Custom Collectd
# ===============
# 1. Build the custom collectd image.
# 2. Set COLLECTD_IMAGE_NAME to the appropriate image_repository:tag.
# 3. Push the image to the docker registry using the command
#      docker push ${COLLECTD_IMAGE_NAME}
# 4. Edit values.yaml and change the image repository and tag to match
#    COLLECTD_IMAGE_NAME.
# 5. Place the collectd.conf in
#    $DA_WORKING_DIR/collection/charts/collectd/resources
# 6. Install the package:
#      cd $DA_WORKING_DIR/collection
#      helm install -n cp . -f values.yaml --namespace=edge1
```

#### Verify Collection package
* Check that all pods are up in the edge1 namespace
* Check the Prometheus UI by port-forwarding port 9090 (the default for the prometheus service)
```
$ kubectl get pods -n edge1
NAME                                    READY   STATUS    RESTARTS   AGE
cp-cadvisor-8rk2b                       1/1     Running   0          15s
cp-cadvisor-nsjr6                       1/1     Running   0          15s
cp-collectd-h5krd                       1/1     Running   0          23s
cp-collectd-jc9m2                       1/1     Running   0          23s
cp-prometheus-node-exporter-blc6p       1/1     Running   0          17s
cp-prometheus-node-exporter-qbvdx       1/1     Running   0          17s
prometheus-cp-prometheus-prometheus-0   4/4     Running   1          33s

$ kubectl get svc -n edge1
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
cadvisor                        NodePort    10.43.53.122   <none>        80:30091/TCP
collectd                        ClusterIP   10.43.222.34   <none>        9103/TCP
cp13-prometheus-node-exporter   ClusterIP   10.43.17.242   <none>        9100/TCP
cp13-prometheus-prometheus      NodePort    10.43.26.155   <none>        9090:30090/TCP
prometheus-operated             ClusterIP   None           <none>        9090/TCP
```
#### Configure Collectd Plugins
1. Using the sample [collectdglobal.yaml](microservices/collectd-operator/examples/collectd/collectdglobal.yaml), configure the CollectdGlobal CR.
2. If there are additional types.db files to load, copy them to the resources folder.
3. Create a ConfigMap to load the types.db files and set configMap in the CollectdGlobal CR to the name of the ConfigMap created.
4. Create and configure the required CollectdPlugin CRs. Use these samples as a reference: [cpu_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/cpu_collectdplugin_cr.yaml), [prometheus_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/prometheus_collectdplugin_cr.yaml).
5. Use the same namespace where the collection package was installed.
6. Assuming it is edge1, create the config resources that are applicable. Run the following commands in the order shown.
```bash
# Note:
## 1. Creating the ConfigMap is optional and required only if additional types.db files need to be mounted.
## 2. Add/remove --from-file accordingly. Use the correct file name based on the context.
kubectl create configmap typesdb-configmap -n edge1 --from-file ./resource/[FILE_NAME1] --from-file ./resource/[FILE_NAME2]
kubectl create -n edge1 -f collectdglobal.yaml
kubectl create -n edge1 -f [PLUGIN_NAME1]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME2]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME3]_collectdplugin_cr.yaml
...
```

## Install Visualization package
```bash
# Default (for custom Grafana dashboards, skip this section)
# =======
cd $DA_WORKING_DIR/visualization
helm install -n viz . -f values.yaml -f grafana-values.yaml

# Custom Grafana dashboards
# =========================
# 1. Place the custom dashboard definition in the folder $DA_WORKING_DIR/visualization/charts/grafana/dashboards
#    An example dashboard definition can be found at $DA_WORKING_DIR/visualization/charts/grafana/dashboards/dashboard1.json
# 2. Create a configmap.yaml that imports the dashboard.json file created above as config, and copy that configmap.yaml to $DA_WORKING_DIR/visualization/charts/grafana/templates/
#    An example configmap can be found at $DA_WORKING_DIR/visualization/charts/grafana/templates/configmap-add-dashboard.yaml
# 3. Add the custom dashboard configuration to values.yaml or an overriding values.yaml.
#    An example configuration can be found in the "dashboardProviders" section of grafana-values.yaml
#
# 4. cd $DA_WORKING_DIR/visualization
# 5. For a fresh install of the visualization package, use "helm install"
#      e.g., helm install -n viz . -f values.yaml -f grafana-values.yaml
#    If the custom dashboard is being added to an already running Grafana, use "helm upgrade"
#      e.g., helm upgrade viz . -f values.yaml -f grafana-values.yaml -f ......
```

#### Verify Visualization package
Check that the visualization pod is up:
```
$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
viz-grafana-78dcffd75-sxnjv   1/1     Running   0          52m
```

### Login to Grafana
```
1. Get your 'admin' user password by running:
    kubectl get secret --namespace default viz-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. Get the Grafana URL to visit by running these commands in the same shell:
    export POD_NAME=$(kubectl get pods --namespace default -l "app=grafana,release=viz" -o jsonpath="{.items[0].metadata.name}")
    kubectl --namespace default port-forward $POD_NAME 3000

3. Visit the URL http://localhost:3000 and log in with the username admin and the password from step 1
```

#### Configure Grafana Datasources
Using the sample [prometheus_grafanadatasource_cr.yaml](microservices/visualization-operator/examples/grafana/prometheus_grafanadatasource_cr.yaml), configure the GrafanaDataSource CR by running the commands below:
```bash
kubectl create -f [DATASOURCE_NAME1]_grafanadatasource_cr.yaml
kubectl create -f [DATASOURCE_NAME2]_grafanadatasource_cr.yaml
...
```

## Install Minio Model repository
* Prerequisite: A dynamic storage provisioner needs to be enabled, either rook-ceph ($DA_WORKING_DIR/00-init) or an alternate provisioner.
```bash
cd $DA_WORKING_DIR/minio

# Edit values.yaml to set the credentials used to access the minio UI.
# Default values are:
#   accessKey: "onapdaas"
#   secretKey: "onapsecretdaas"

helm install -n minio . -f values.yaml --namespace=edge1
```

## Install Messaging platform

We currently support the Strimzi-based Kafka operator.
Navigate to the `$DA_WORKING_DIR/messaging/charts/strimzi-kafka-operator` directory and run the command below:
```
helm install . -f values.yaml  --name sko --namespace=test
```

NOTE: Make changes in values.yaml if required.

Once the Strimzi operator is ready, you should see a pod like:

```
strimzi-cluster-operator-5cf7648b8c-zgxv7       1/1     Running   0          53m
```

Once this is done, install the kafka package like any other helm chart.
Navigate to the `$DA_WORKING_DIR/messaging` directory and run:
```
helm install --name kafka-cluster charts/kafka/
```

Once this is done, you should have the following pods up and running:

```
kafka-cluster-entity-operator-b6557fc6c-hlnkm   3/3     Running   0          47m
kafka-cluster-kafka-0                           2/2     Running   0          48m
kafka-cluster-kafka-1                           2/2     Running   0          48m
kafka-cluster-kafka-2                           2/2     Running   0          48m
kafka-cluster-zookeeper-0                       2/2     Running   0          49m
kafka-cluster-zookeeper-1                       2/2     Running   0          49m
kafka-cluster-zookeeper-2                       2/2     Running   0          49m
```

You should see the following services when you run `kubectl get svc`:

```
kafka-cluster-kafka-bootstrap    ClusterIP   10.XX.YY.ZZ   <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-kafka-brokers      ClusterIP   None          <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-zookeeper-client   ClusterIP   10.XX.YY.ZZ   <none>        2181/TCP                     55m
kafka-cluster-zookeeper-nodes    ClusterIP   None          <none>        2181/TCP,2888/TCP,3888/TCP   55m
```
#### Testing messaging

You can test your kafka brokers by creating a simple producer and consumer.

Producer:
```
kubectl run kafka-producer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list kafka-cluster-kafka-bootstrap:9092 --topic my-topic
```
Consumer:
```
kubectl run kafka-consumer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server kafka-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning
```

## Install Training Package

#### Install M3DB (Time series Data lake)
##### Pre-requisites
1.  Kubernetes cluster with at least 3 nodes
2.  Etcd operator, M3DB operator
3.  Nodes labelled with zone and region.

```bash
## Default region is us-west1, default zone labels are us-west1-a, us-west1-b, us-west1-c
## If these are changed, then isolationGroups in training-core/charts/m3db/values.yaml needs to be updated.
NODES=($(kubectl get nodes --output=jsonpath='{.items..metadata.name}'))

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/region=us-west1

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/zone=us-west1-a --overwrite=true
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/zone=us-west1-b --overwrite=true
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/zone=us-west1-c --overwrite=true
```
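
When the region differs from the default, the six label commands have to be edited by hand. The sketch below generates them instead, combining the region and zone labels into one command per node; the function name `zone_label_cmds` is hypothetical, and it only prints the commands so they can be reviewed before running. It assumes the same three zone suffixes (a, b, c) used above and ignores any nodes beyond the third.

```bash
# zone_label_cmds REGION NODE...: prints one `kubectl label` command per node,
# assigning zones REGION-a, REGION-b and REGION-c to the first three nodes.
# Nothing is executed; the output is meant to be reviewed first.
zone_label_cmds() {
  region="$1"
  shift
  for suffix in a b c; do
    [ "$#" -gt 0 ] || break
    echo "kubectl label node/$1 failure-domain.beta.kubernetes.io/region=${region} failure-domain.beta.kubernetes.io/zone=${region}-${suffix} --overwrite=true"
    shift
  done
}

# Example usage (node names are placeholders):
#   zone_label_cmds us-west1 node1 node2 node3
#   zone_label_cmds us-west1 node1 node2 node3 | sh    # run after review
```
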
```bash
cd $DA_WORKING_DIR/training-core/charts/m3db
helm install -n m3db . -f values.yaml --namespace training
```
```
$ kubectl get pods -n training
NAME                   READY   STATUS    RESTARTS   AGE
m3db-cluster-rep0-0    1/1     Running   0          103s
m3db-cluster-rep1-0    1/1     Running   0          83s
m3db-cluster-rep2-0    1/1     Running   0          62s
m3db-etcd-sjhgl4xfgc   1/1     Running   0          83s
m3db-etcd-lfs96hngz6   1/1     Running   0          67s
m3db-etcd-rmgdkkx4bq   1/1     Running   0          51s
```

##### Configure remote write from Prometheus to M3DB
```bash
cd $DA_WORKING_DIR/day2_configs/prometheus/
```
```bash
cat << EOF > add_m3db_remote.yaml
spec:
  remoteWrite:
  - url: "http://m3coordinator-m3db.training.svc.cluster.local:7201/api/v1/prom/remote/write"
    writeRelabelConfigs:
      - targetLabel: metrics_storage
        replacement: m3db_remote
EOF
```
```bash
kubectl patch --namespace=edge1 prometheus cp-prometheus-prometheus -p "$(cat add_m3db_remote.yaml)" --type=merge
```
Check the Prometheus UI to verify that the M3DB remote write is enabled.
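
As an alternative to the UI, the patched resource can also be inspected from the command line. This is a sketch only: the function name `has_m3db_remote_write` is hypothetical, and the resource name and namespace in the usage comment are taken from the patch command above. The helper reads the configured remote-write URLs on stdin, so it can be tried without a cluster.

```bash
# has_m3db_remote_write: reads remote-write URLs on stdin and succeeds when
# the M3DB coordinator endpoint from add_m3db_remote.yaml is among them.
has_m3db_remote_write() {
  grep -q 'm3coordinator-m3db\.training\.svc\.cluster\.local:7201/api/v1/prom/remote/write'
}

# Example usage:
#   kubectl get prometheus cp-prometheus-prometheus --namespace=edge1 \
#     -o jsonpath='{.spec.remoteWrite[*].url}' | has_m3db_remote_write \
#     && echo "m3db remote write is configured"
```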