# Distributed Analytics Framework


## Pre-requisites
| Required   | Version |
|------------|---------|
| Kubernetes | 1.12.3+ |
| Docker CE  | 18.09+  |
| Helm       | >=2.12.1 and <=2.13.1 |
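
The Helm requirement is a closed range, which is easy to misread. The following is a minimal sketch of a range check, assuming bash and GNU `sort -V` are available; the function names `version_le` and `helm_version_supported` are hypothetical helpers and not part of this repository.

```bash
#!/usr/bin/env bash
# version_le A B: succeeds when version A <= version B (relies on GNU `sort -V`).
version_le() {
  [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

# helm_version_supported VERSION: succeeds when 2.12.1 <= VERSION <= 2.13.1.
helm_version_supported() {
  local v="${1#v}"   # accept "v2.13.0" as well as "2.13.0"
  version_le "2.12.1" "$v" && version_le "$v" "2.13.1"
}

# Example: check a version string copied from the `helm version` output.
if helm_version_supported "v2.13.0"; then
  echo "Helm version is within the supported range"
else
  echo "Helm version is outside the supported range"
fi
```

The same `version_le` helper could be reused for the Kubernetes and Docker minimum versions by checking only the lower bound.
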
## Download Framework
```bash
git clone https://github.com/onap/demo.git
DA_WORKING_DIR=$PWD/demo/vnfs/DAaaS/deploy
```

## Install Rook-Ceph for Persistent Storage
Note: The FlexVolume path can occasionally differ from the default value. values.yaml is configured with the most common FlexVolume path. If you see errors related to FlexVolume, refer to https://rook.io/docs/rook/v0.9/flexvolume.html#configuring-the-flexvolume-path to find the appropriate flexvolume path and set it in values.yaml.
```bash
cd $DA_WORKING_DIR/00-init/rook-ceph
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```
Check the status of the pods in the rook-ceph-system and rook-ceph namespaces. Once all pods are in the Ready state, move on to the next section.

```bash
$ kubectl get pods -n rook-ceph-system
NAME                                 READY   STATUS    RESTARTS   AGE
rook-ceph-agent-9wszf                1/1     Running   0          121s
rook-ceph-agent-xnbt8                1/1     Running   0          121s
rook-ceph-operator-bc77d6d75-ltwww   1/1     Running   0          158s
rook-discover-bvj65                  1/1     Running   0          133s
rook-discover-nbfrp                  1/1     Running   0          133s
```
```bash
$ kubectl -n rook-ceph get pod
NAME                                   READY   STATUS      RESTARTS   AGE
rook-ceph-mgr-a-d9dcf5748-5s9ft        1/1     Running     0          77s
rook-ceph-mon-a-7d8f675889-nw5pl       1/1     Running     0          105s
rook-ceph-mon-b-856fdd5cb9-5h2qk       1/1     Running     0          94s
rook-ceph-mon-c-57545897fc-j576h       1/1     Running     0          85s
rook-ceph-osd-0-7cbbbf749f-j8fsd       1/1     Running     0          25s
rook-ceph-osd-1-7f67f9646d-44p7v       1/1     Running     0          25s
rook-ceph-osd-2-6cd4b776ff-v4d68       1/1     Running     0          25s
rook-ceph-osd-prepare-vx2rz            0/2     Completed   0          60s
rook-ceph-tools-5bd5cdb949-j68kk       1/1     Running     0          53s
```
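
Scanning these listings by eye gets tedious on larger clusters. As a rough sketch, the pod table can be filtered for pods that are neither fully Ready nor Completed. The helper name `not_ready_pods` is hypothetical, and the `Pending` row in the sample is invented for illustration; it does not come from a real run.

```bash
# not_ready_pods: reads `kubectl get pods` output on stdin and prints the
# name of every pod that is neither fully Ready nor Completed.
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
         split($2, r, "/")   # READY column, for example 1/1
         if ($3 != "Completed" && r[1] != r[2]) print $1
       }'
}

# Example with captured output; in practice, pipe the live command instead:
#   kubectl get pods -n rook-ceph | not_ready_pods
sample='NAME                               READY   STATUS      RESTARTS   AGE
rook-ceph-mgr-a-d9dcf5748-5s9ft    1/1     Running     0          77s
rook-ceph-osd-0-7cbbbf749f-j8fsd   0/1     Pending     0          25s
rook-ceph-osd-prepare-vx2rz        0/2     Completed   0          60s'

printf '%s\n' "$sample" | not_ready_pods
```

An empty result means every listed pod is either Ready or Completed, which matches the condition for moving on to the next section.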

#### Troubleshooting Rook-Ceph installation

If rook was previously installed on your machine, successfully or not, a fresh installation of the rook operator may run into issues.
The steps below clean up the leftovers of the earlier installation.

* First check whether any rook CRDs already exist:
```
kubectl get crds | grep rook
```
If this returns results like:
```
$ kubectl get crds | grep rook
cephblockpools.ceph.rook.io         2019-07-19T18:19:05Z
cephclusters.ceph.rook.io           2019-07-19T18:19:05Z
cephfilesystems.ceph.rook.io        2019-07-19T18:19:05Z
cephobjectstores.ceph.rook.io       2019-07-19T18:19:05Z
cephobjectstoreusers.ceph.rook.io   2019-07-19T18:19:05Z
volumes.rook.io                     2019-07-19T18:19:05Z
```
then delete these existing rook CRDs by generating a delete manifest file and deleting the resources it describes.
Run these commands from the chart directory used for the installation above:
```
helm template -n rook . -f values.yaml > ~/delete.yaml
kubectl delete -f ~/delete.yaml
```

After this, delete the directory below on all nodes.
```
sudo rm -rf /var/lib/rook/
```
Now attempt the installation again:
```
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```

## Install Operator package
### Build docker images
#### collectd-operator
```bash
cd $DA_WORKING_DIR/../microservices

## Note: The image tag and repository in the Collectd-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/collectd-operator:latest
./build_image.sh collectd-operator $IMAGE_NAME
```
#### visualization-operator
```bash
cd $DA_WORKING_DIR/../microservices

## Note: The image tag and repository in the Visualization-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/visualization-operator:latest
./build_image.sh visualization-operator $IMAGE_NAME
```

### Install the Operator Package
```bash
cd $DA_WORKING_DIR/operator
helm install -n operator . -f values.yaml --namespace=operator
```
Check the status of the pods in the operator namespace and confirm that the Prometheus operator pods are in the Ready state.
```bash
$ kubectl get pods -n operator
NAME                                                      READY   STATUS    RESTARTS
m3db-operator-0                                           1/1     Running   0
op-etcd-operator-etcd-backup-operator-6cdc577f7d-ltgsr    1/1     Running   0
op-etcd-operator-etcd-operator-79fd99f8b7-fdc7p           1/1     Running   0
op-etcd-operator-etcd-restore-operator-855f7478bf-r7qxp   1/1     Running   0
op-prometheus-operator-operator-5c9b87965b-wjtw5          1/1     Running   1
op-sparkoperator-6cb4db884c-75rcd                         1/1     Running   0
strimzi-cluster-operator-5bffdd7b85-rlrvj                 1/1     Running   0
```
#### Troubleshooting Operator installation
Sometimes deleting the previously installed Operator package fails to remove all operator pods. To troubleshoot this, follow these steps.

1. Make sure that all other deployments or helm releases are deleted (purged). The Operator package is a baseline package for the applications, so deleting it while the applications are still running might leave the cluster in an unwanted state.

2. Delete all the resources and CRDs associated with the Operator package.
```bash
#NOTE: Use the same release name and namespace as in installation of operator package in the previous section
cd $DA_WORKING_DIR/operator
helm template -n operator . -f values.yaml --namespace=operator > ../delete_operator.yaml
cd ../
kubectl delete -f delete_operator.yaml
```
## Install Collection package
Note: collectd.conf is available in the $DA_WORKING_DIR/collection/charts/collectd/resources/config directory. Any valid collectd.conf can be placed here.
```bash
Default (For custom collectd skip this section)
=======
cd $DA_WORKING_DIR/collection
helm install -n cp . -f values.yaml --namespace=edge1

Custom Collectd
===============
1. Build the custom collectd image
2. Set COLLECTD_IMAGE_NAME with the appropriate image_repository:tag
3. Push the image to the docker registry using the command
   docker push ${COLLECTD_IMAGE_NAME}
4. Edit the values.yaml and change the image repository and tag using
   COLLECTD_IMAGE_NAME appropriately.
5. Place the collectd.conf in
   $DA_WORKING_DIR/collection/charts/collectd/resources

6. cd $DA_WORKING_DIR/collection
7. helm install -n cp . -f values.yaml --namespace=edge1
```

#### Verify Collection package
* Check if all pods are up in edge1 namespace
* Check the prometheus UI using port-forwarding port 9090 (default for prometheus service)
```
$ kubectl get pods -n edge1
NAME                                    READY   STATUS    RESTARTS   AGE
cp-cadvisor-8rk2b                       1/1     Running   0          15s
cp-cadvisor-nsjr6                       1/1     Running   0          15s
cp-collectd-h5krd                       1/1     Running   0          23s
cp-collectd-jc9m2                       1/1     Running   0          23s
cp-prometheus-node-exporter-blc6p       1/1     Running   0          17s
cp-prometheus-node-exporter-qbvdx       1/1     Running   0          17s
prometheus-cp-prometheus-prometheus-0   4/4     Running   1          33s

$ kubectl get svc -n edge1
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
cadvisor                        NodePort    10.43.53.122   <none>        80:30091/TCP
collectd                        ClusterIP   10.43.222.34   <none>        9103/TCP
cp13-prometheus-node-exporter   ClusterIP   10.43.17.242   <none>        9100/TCP
cp13-prometheus-prometheus      NodePort    10.43.26.155   <none>        9090:30090/TCP
prometheus-operated             ClusterIP   None           <none>        9090/TCP
```
#### Configure Collectd Plugins
1. Using the sample [collectdglobal.yaml](microservices/collectd-operator/examples/collectd/collectdglobal.yaml), configure the CollectdGlobal CR.
2. If there are additional types.db files to update, copy them to the resources folder.
3. Create a ConfigMap to load the types.db and update the configMap with the name of the ConfigMap created.
4. Create and configure the required CollectdPlugin CRs. Use these samples as a reference: [cpu_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/cpu_collectdplugin_cr.yaml), [prometheus_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/prometheus_collectdplugin_cr.yaml).
5. Use the same namespace where the collection package was installed.
6. Assuming it is edge1, create the config resources that are applicable. Apply the following commands in the same order.
```bash
# Note:
## 1. Create Configmap is optional and required only if additional types.db file needs to be mounted.
## 2. Add/Remove --from-file accordingly. Use the correct file name based on the context.
kubectl create configmap typesdb-configmap --from-file ./resource/[FILE_NAME1] --from-file ./resource/[FILE_NAME2]
kubectl create -n edge1 -f collectdglobal.yaml
kubectl create -n edge1 -f [PLUGIN_NAME1]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME2]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME3]_collectdplugin_cr.yaml
...
```

## Install Visualization package
```bash
Default (For custom Grafana dashboards skip this section)
=======
cd $DA_WORKING_DIR/visualization
helm install -n viz . -f values.yaml -f grafana-values.yaml

Custom Grafana dashboards
=========================
1. Place the custom dashboard definition into the folder $DA_WORKING_DIR/visualization/charts/grafana/dashboards
    Example dashboard definition can be found at $DA_WORKING_DIR/visualization/charts/grafana/dashboards/dashboard1.json
2. Create a configmap.yaml that imports the dashboard.json file created above as config and copy that configmap.yaml to $DA_WORKING_DIR/visualization/charts/grafana/templates/
    Example configmap can be found at $DA_WORKING_DIR/visualization/charts/grafana/templates/configmap-add-dashboard.yaml
3. Add custom dashboard configuration to values.yaml or an overriding values.yaml.
    Example configuration can be found in the "dashboardProviders" section of grafana-values.yaml

4. cd $DA_WORKING_DIR/visualization
5. For a fresh install of visualization package, do "helm install"
    e.g., helm install -n viz . -f values.yaml -f grafana-values.yaml
   If the custom dashboard is being added to an already running Grafana, do "helm upgrade"
    e.g., helm upgrade viz . -f values.yaml -f grafana-values.yaml -f ......
```

#### Verify Visualization package
Check if the visualization pod is up
```
$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
viz-grafana-78dcffd75-sxnjv   1/1     Running   0          52m
```

### Login to Grafana
```
1. Get your 'admin' user password by running:
    kubectl get secret --namespace default viz-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. Get the Grafana URL to visit by running these commands in the same shell:
    export POD_NAME=$(kubectl get pods --namespace default -l "app=grafana,release=viz" -o jsonpath="{.items[0].metadata.name}")
    kubectl --namespace default port-forward $POD_NAME 3000

3. Visit the URL : http://localhost:3000 and login with the password from step 1 and the username: admin
```
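
The `base64 --decode` stage in step 1 is needed because Kubernetes stores Secret values base64-encoded in the `.data` field. The round trip below is only an illustration of that encoding; `example-password` is a made-up value, not the chart default.

```bash
# Encode a sample password the way a Secret's .data field stores it.
# ("example-password" is a made-up value used only for this illustration.)
encoded=$(printf '%s' 'example-password' | base64)
echo "$encoded"

# Decode it again, as the command in step 1 does with the real secret.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```
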

#### Configure Grafana Datasources
Using the sample [prometheus_grafanadatasource_cr.yaml](microservices/visualization-operator/examples/grafana/prometheus_grafanadatasource_cr.yaml), configure the GrafanaDataSource CR by running the commands below.
```bash
kubectl create -f [DATASOURCE_NAME1]_grafanadatasource_cr.yaml
kubectl create -f [DATASOURCE_NAME2]_grafanadatasource_cr.yaml
...
```

## Install Minio Model repository
* Prerequisite: A dynamic storage provisioner needs to be enabled. Either rook-ceph ($DA_WORKING_DIR/00-init) or an alternative provisioner needs to be enabled.
```bash
cd $DA_WORKING_DIR/minio

Edit the values.yaml to set the credentials to access the minio UI.
Default values are
accessKey: "onapdaas"
secretKey: "onapsecretdaas"

helm install -n minio . -f values.yaml --namespace=edge1
```

## Install Messaging platform

Currently, the strimzi-based Kafka operator is supported.
Navigate to the ```$DA_WORKING_DIR/messaging/charts/strimzi-kafka-operator``` directory.
Use the command below:
```
helm install . -f values.yaml  --name sko --namespace=test
```

NOTE: Make changes in the values.yaml if required.

Once the strimzi operator is ready, you should see a pod like:

```
strimzi-cluster-operator-5cf7648b8c-zgxv7       1/1     Running   0          53m
```

Once this is done, install the kafka package like any other helm chart.
Navigate to the ```$DA_WORKING_DIR/messaging``` directory and use the command:
```
helm install --name kafka-cluster charts/kafka/
```

Once this is done, you should have the following pods up and running.

```
kafka-cluster-entity-operator-b6557fc6c-hlnkm   3/3     Running   0          47m
kafka-cluster-kafka-0                           2/2     Running   0          48m
kafka-cluster-kafka-1                           2/2     Running   0          48m
kafka-cluster-kafka-2                           2/2     Running   0          48m
kafka-cluster-zookeeper-0                       2/2     Running   0          49m
kafka-cluster-zookeeper-1                       2/2     Running   0          49m
kafka-cluster-zookeeper-2                       2/2     Running   0          49m
```

You should see the following services when you run ```kubectl get svc```

```
kafka-cluster-kafka-bootstrap    ClusterIP   10.XX.YY.ZZ   <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-kafka-brokers      ClusterIP   None          <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-zookeeper-client   ClusterIP   10.XX.YY.ZZ   <none>        2181/TCP                     55m
kafka-cluster-zookeeper-nodes    ClusterIP   None          <none>        2181/TCP,2888/TCP,3888/TCP   55m
```
#### Testing messaging

You can test your kafka brokers by creating a simple producer and consumer.

Producer :
```
kubectl run kafka-producer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list kafka-cluster-kafka-bootstrap:9092 --topic my-topic
```
Consumer :
```
kubectl run kafka-consumer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server kafka-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning
```

## Install Training Package

#### Install M3DB (Time series Data lake)
##### Pre-requisites
1.  Kubernetes cluster with at least 3 nodes
2.  Etcd operator, M3DB operator
3.  Nodes labelled with zone and region.

```bash
## Default region is us-west1, default zone labels are us-west1-a, us-west1-b, us-west1-c
## If this is changed then isolationGroups in training-core/charts/m3db/values.yaml needs to be updated.
NODES=($(kubectl get nodes --output=jsonpath='{.items..metadata.name}'))

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/region=us-west1

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/zone=us-west1-a --overwrite=true
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/zone=us-west1-b --overwrite=true
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/zone=us-west1-c --overwrite=true
```
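
The labelling pattern above (one region for all nodes, one zone suffix per node) can also be generated in a loop. This is a sketch only: `label_commands` is a hypothetical helper, the node names in the example are placeholders, and it adds `--overwrite=true` to the region labels as well so the output can be re-run. It prints the commands instead of executing them, so they can be reviewed first.

```bash
#!/usr/bin/env bash
# label_commands REGION NODE...: prints the kubectl label commands that assign
# REGION to every node and zones REGION-a, REGION-b, ... in argument order.
label_commands() {
  local region="$1"; shift
  local zones=(a b c d e f)   # zone suffixes; extend for larger clusters
  if [ "$#" -gt "${#zones[@]}" ]; then
    echo "too many nodes for the configured zone suffixes" >&2
    return 1
  fi
  local i=0 node
  for node in "$@"; do
    echo "kubectl label node/${node} failure-domain.beta.kubernetes.io/region=${region} --overwrite=true"
    echo "kubectl label node/${node} failure-domain.beta.kubernetes.io/zone=${region}-${zones[$i]} --overwrite=true"
    i=$((i + 1))
  done
}

# Dry run with placeholder node names; review the output before running it.
label_commands us-west1 node-1 node-2 node-3
```

If a region other than us-west1 is used, the isolationGroups in training-core/charts/m3db/values.yaml still need to be updated to match, as noted above.
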
```bash
cd $DA_WORKING_DIR/training-core/charts/m3db
helm install -n m3db . -f values.yaml --namespace training
```
```
$ kubectl get pods -n training
NAME                   READY   STATUS    RESTARTS   AGE
m3db-cluster-rep0-0    1/1     Running   0          103s
m3db-cluster-rep1-0    1/1     Running   0          83s
m3db-cluster-rep2-0    1/1     Running   0          62s
m3db-etcd-sjhgl4xfgc   1/1     Running   0          83s
m3db-etcd-lfs96hngz6   1/1     Running   0          67s
m3db-etcd-rmgdkkx4bq   1/1     Running   0          51s
```

##### Configure remote write from Prometheus to M3DB
```bash
cd $DA_WORKING_DIR/day2_configs/prometheus/
```
```bash
cat << EOF > add_m3db_remote.yaml
spec:
  remoteWrite:
  - url: "http://m3coordinator-m3db.training.svc.cluster.local:7201/api/v1/prom/remote/write"
    writeRelabelConfigs:
      - targetLabel: metrics_storage
        replacement: m3db_remote
EOF
```
```bash
kubectl patch --namespace=edge1 prometheus cp-prometheus-prometheus -p "$(cat add_m3db_remote.yaml)" --type=merge
```
Check the Prometheus GUI to verify that the m3db remote write is enabled.