# Distributed Analytics Framework

| Component  | Version |
|------------|---------|
| Kubernetes | 1.12.3+ |
| Helm       | >=2.12.1 and <=2.13.1 |

Clone the repository and set the working directory used throughout this guide:

    git clone https://github.com/onap/demo.git
    DA_WORKING_DIR=$PWD/demo/vnfs/DAaaS/deploy
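The Helm window in the prerequisites table (>=2.12.1 and <=2.13.1) is easy to misjudge by eye. The sketch below compares a version string against that range. The function names `version_le` and `helm_version_supported` are illustrative and not part of the repository, and the comparison assumes GNU `sort -V` is available.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the DAaaS repository).

# version_le A B: succeeds when A <= B in version order (assumes GNU sort -V).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# helm_version_supported VERSION: succeeds when VERSION lies in the
# 2.12.1..2.13.1 range from the prerequisites table. A leading "v" is ignored.
helm_version_supported() {
  local v="${1#v}"
  version_le "2.12.1" "$v" && version_le "$v" "2.13.1"
}
```

For example, `helm_version_supported v2.13.0` succeeds, while `helm_version_supported v2.14.0` fails.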
## Install Rook-Ceph for Persistent Storage
Note: Although unusual, the FlexVolume path on your cluster can differ from the default value. values.yaml has the most common FlexVolume path configured. If you see errors related to FlexVolume, refer to https://rook.io/docs/rook/v0.9/flexvolume.html#configuring-the-flexvolume-path to find the appropriate FlexVolume path and set it in values.yaml.

    cd $DA_WORKING_DIR/00-init/rook-ceph
    helm install -n rook . -f values.yaml --namespace=rook-ceph-system

Check the status of the pods in the rook-ceph-system and rook-ceph namespaces. Once all pods are in the Ready state (the osd-prepare pod shows Completed), move on to the next section.

    $ kubectl get pods -n rook-ceph-system
    NAME                                 READY   STATUS    RESTARTS   AGE
    rook-ceph-agent-9wszf                1/1     Running   0          121s
    rook-ceph-agent-xnbt8                1/1     Running   0          121s
    rook-ceph-operator-bc77d6d75-ltwww   1/1     Running   0          158s
    rook-discover-bvj65                  1/1     Running   0          133s
    rook-discover-nbfrp                  1/1     Running   0          133s

    $ kubectl -n rook-ceph get pod
    NAME                               READY   STATUS      RESTARTS   AGE
    rook-ceph-mgr-a-d9dcf5748-5s9ft    1/1     Running     0          77s
    rook-ceph-mon-a-7d8f675889-nw5pl   1/1     Running     0          105s
    rook-ceph-mon-b-856fdd5cb9-5h2qk   1/1     Running     0          94s
    rook-ceph-mon-c-57545897fc-j576h   1/1     Running     0          85s
    rook-ceph-osd-0-7cbbbf749f-j8fsd   1/1     Running     0          25s
    rook-ceph-osd-1-7f67f9646d-44p7v   1/1     Running     0          25s
    rook-ceph-osd-2-6cd4b776ff-v4d68   1/1     Running     0          25s
    rook-ceph-osd-prepare-vx2rz        0/2     Completed   0          60s
    rook-ceph-tools-5bd5cdb949-j68kk   1/1     Running     0          53s
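Checking the READY column by eye gets tedious on larger clusters. As a sketch, the hypothetical helper below (`pods_not_ready` is not part of the repository) reads the output of `kubectl get pods --no-headers` on standard input and prints the pods that are neither Completed nor fully Ready. It assumes the five-column layout shown in the sample output above.

```bash
#!/usr/bin/env bash
# pods_not_ready: hypothetical helper, reads "NAME READY STATUS RESTARTS AGE"
# rows on stdin and prints the names of pods that still need attention.
pods_not_ready() {
  awk '{
    split($2, ready, "/")          # READY column, e.g. "1/1" -> ready[1], ready[2]
    if ($3 == "Completed") next    # finished job pods are fine
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}
```

A call such as `kubectl get pods -n rook-ceph --no-headers | pods_not_ready` prints nothing once every pod is ready.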
#### Troubleshooting Rook-Ceph installation

If Rook was previously installed on your machine, successfully or not, a fresh installation of the Rook operator may run into leftover state. The steps below clean it up.

* First check whether any Rook CRDs already exist:

      kubectl get crds | grep rook

If this returns results like:

    $ kubectl get crds | grep rook
    cephblockpools.ceph.rook.io         2019-07-19T18:19:05Z
    cephclusters.ceph.rook.io           2019-07-19T18:19:05Z
    cephfilesystems.ceph.rook.io        2019-07-19T18:19:05Z
    cephobjectstores.ceph.rook.io       2019-07-19T18:19:05Z
    cephobjectstoreusers.ceph.rook.io   2019-07-19T18:19:05Z
    volumes.rook.io                     2019-07-19T18:19:05Z

then delete the existing Rook CRDs by generating a manifest from the chart and deleting the resources it defines:

    cd $DA_WORKING_DIR/00-init/rook-ceph
    helm template -n rook . -f values.yaml > ~/delete.yaml
    kubectl delete -f ~/delete.yaml

After this, delete the directory below on all the nodes:

    sudo rm -rf /var/lib/rook/

Then retry the installation:

    helm install -n rook . -f values.yaml --namespace=rook-ceph-system
## Install Operator package
### Build docker images
#### collectd-operator

    cd $DA_WORKING_DIR/../microservices

    ## Note: The image tag and repository in the collectd-operator helm chart need to match IMAGE_NAME
    IMAGE_NAME=dcr.cluster.local:32644/collectd-operator:latest
    ./build_image.sh collectd-operator $IMAGE_NAME

#### visualization-operator

    cd $DA_WORKING_DIR/../microservices

    ## Note: The image tag and repository in the visualization-operator helm chart need to match IMAGE_NAME
    IMAGE_NAME=dcr.cluster.local:32644/visualization-operator:latest
    ./build_image.sh visualization-operator $IMAGE_NAME
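Both operator images follow the same build pattern, so the two builds can also be folded into a loop. The sketch below is one way to do that. `REGISTRY`, `image_name`, and `build_operator_images` are illustrative names that do not exist in the repository, and the registry address simply reuses the value from the examples above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; adjust REGISTRY to match your helm chart values.
REGISTRY="${REGISTRY:-dcr.cluster.local:32644}"

# image_name OPERATOR [TAG]: prints REGISTRY/OPERATOR:TAG (TAG defaults to "latest").
image_name() {
  printf '%s/%s:%s\n' "$REGISTRY" "$1" "${2:-latest}"
}

# Builds both operator images; runs in a subshell so the caller's
# working directory is left unchanged.
build_operator_images() (
  cd "$DA_WORKING_DIR/../microservices" || exit 1
  for op in collectd-operator visualization-operator; do
    ./build_image.sh "$op" "$(image_name "$op")" || exit 1
  done
)
```

As in the manual steps, the image names produced here still have to match the repository and tag configured in the operator helm charts.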
### Install the Operator Package

    cd $DA_WORKING_DIR/operator
    helm install -n operator . -f values.yaml --namespace=operator

Check the status of the pods in the operator namespace and confirm that the Prometheus operator pod is in the Ready state.

    $ kubectl get pods -n operator
    NAME                                                      READY   STATUS    RESTARTS
    m3db-operator-0                                           1/1     Running   0
    op-etcd-operator-etcd-backup-operator-6cdc577f7d-ltgsr    1/1     Running   0
    op-etcd-operator-etcd-operator-79fd99f8b7-fdc7p           1/1     Running   0
    op-etcd-operator-etcd-restore-operator-855f7478bf-r7qxp   1/1     Running   0
    op-prometheus-operator-operator-5c9b87965b-wjtw5          1/1     Running   1
    op-sparkoperator-6cb4db884c-75rcd                         1/1     Running   0
    strimzi-cluster-operator-5bffdd7b85-rlrvj                 1/1     Running   0
#### Troubleshooting Operator installation
Deleting a previously installed Operator package sometimes fails to remove all operator pods. If that happens, follow these steps:

1. Make sure that all other deployments and helm releases are deleted (purged). The Operator package is the baseline package for the applications, so deleting it while applications are still running can leave the cluster in an unwanted state.

2. Delete all the resources and CRDs associated with the Operator package.

       # NOTE: Use the same release name and namespace as when installing the Operator package in the previous section
       cd $DA_WORKING_DIR/operator
       helm template -n operator . -f values.yaml --namespace=operator > ../delete_operator.yaml
       kubectl delete -f ../delete_operator.yaml
## Install Collection package
Note: collectd.conf is available in the $DA_WORKING_DIR/collection/charts/collectd/resources/config directory. Any valid collectd.conf can be placed here.

Default (for custom collectd, skip this section)

    cd $DA_WORKING_DIR/collection
    helm install -n cp . -f values.yaml --namespace=edge1

Custom collectd

1. Build the custom collectd image.
2. Set COLLECTD_IMAGE_NAME to the appropriate image_repository:tag.
3. Push the image to the docker registry using the command:

       docker push ${COLLECTD_IMAGE_NAME}

4. Edit values.yaml and change the image repository and tag to match COLLECTD_IMAGE_NAME.
5. Place the collectd.conf in $DA_WORKING_DIR/collection/charts/collectd/resources
6. Install the collection package:

       cd $DA_WORKING_DIR/collection
       helm install -n cp . -f values.yaml --namespace=edge1
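When editing values.yaml for a custom collectd image, the repository and the tag are needed as separate values. As a sketch, the hypothetical helpers below split an image reference of the form repository:tag using shell parameter expansion. They assume the reference always carries a tag, and the image name in the example is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# image_repository REF: everything before the last ":" (a registry port is kept).
image_repository() { printf '%s\n' "${1%:*}"; }

# image_tag REF: everything after the last ":".
image_tag() { printf '%s\n' "${1##*:}"; }

# Example value only; substitute your own image reference.
COLLECTD_IMAGE_NAME="dcr.cluster.local:32644/collectd:custom"
echo "repository: $(image_repository "$COLLECTD_IMAGE_NAME")"
echo "tag: $(image_tag "$COLLECTD_IMAGE_NAME")"
```

The two printed values are what would go into the image repository and tag entries of values.yaml.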
#### Verify Collection package
* Check that all pods are up in the edge1 namespace.
* Check the Prometheus UI by port-forwarding port 9090 (the default for the Prometheus service).

Example output:

    $ kubectl get pods -n edge1
    NAME                                    READY   STATUS    RESTARTS   AGE
    cp-cadvisor-8rk2b                       1/1     Running   0          15s
    cp-cadvisor-nsjr6                       1/1     Running   0          15s
    cp-collectd-h5krd                       1/1     Running   0          23s
    cp-collectd-jc9m2                       1/1     Running   0          23s
    cp-prometheus-node-exporter-blc6p       1/1     Running   0          17s
    cp-prometheus-node-exporter-qbvdx       1/1     Running   0          17s
    prometheus-cp-prometheus-prometheus-0   4/4     Running   1          33s

    $ kubectl get svc -n edge1
    NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
    cadvisor                        NodePort    10.43.53.122   <none>        80:30091/TCP
    collectd                        ClusterIP   10.43.222.34   <none>        9103/TCP
    cp13-prometheus-node-exporter   ClusterIP   10.43.17.242   <none>        9100/TCP
    cp13-prometheus-prometheus      NodePort    10.43.26.155   <none>        9090:30090/TCP
    prometheus-operated             ClusterIP   None           <none>        9090/TCP
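To reach the Prometheus UI mentioned above, one option is to port-forward directly to the Prometheus pod. The sketch below wraps that in a hypothetical function. The default pod name and namespace are taken from the sample output and will likely differ in your cluster.

```bash
#!/usr/bin/env bash
# prometheus_port_forward [NAMESPACE] [POD]
# Hypothetical helper: forwards local port 9090 to port 9090 of the Prometheus pod.
# Defaults come from the sample output in this guide, not from the charts.
prometheus_port_forward() {
  local ns="${1:-edge1}" pod="${2:-prometheus-cp-prometheus-prometheus-0}"
  kubectl port-forward -n "$ns" "pod/$pod" 9090:9090
}
```

While `prometheus_port_forward` is running, the UI should be reachable at http://localhost:9090.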
#### Configure Collectd Plugins
1. Using the sample [collectdglobal.yaml](microservices/collectd-operator/examples/collectd/collectdglobal.yaml), configure the CollectdGlobal CR.
2. If there are additional types.db files to load, copy them to the resources folder.
3. Create a ConfigMap to load the types.db files and update the configMap entry with the name of the ConfigMap created.
4. Create and configure the required CollectdPlugin CRs. Use these samples as a reference: [cpu_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/cpu_collectdplugin_cr.yaml), [prometheus_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/prometheus_collectdplugin_cr.yaml).
5. Use the same namespace where the collection package was installed.
6. Assuming it is edge1, create the config resources that are applicable. Run the following commands in the order shown.

       ## 1. Creating the ConfigMap is optional; it is required only if an additional types.db file needs to be mounted.
       ## 2. Add/remove --from-file accordingly. Use the correct file name based on the context.
       kubectl create configmap typesdb-configmap -n edge1 --from-file ./resource/[FILE_NAME1] --from-file ./resource/[FILE_NAME2]
       kubectl create -n edge1 -f collectdglobal.yaml
       kubectl create -n edge1 -f [PLUGIN_NAME1]_collectdplugin_cr.yaml
       kubectl create -n edge1 -f [PLUGIN_NAME2]_collectdplugin_cr.yaml
       kubectl create -n edge1 -f [PLUGIN_NAME3]_collectdplugin_cr.yaml
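Because the resources have to be created in a fixed order, it can help to script the sequence. A minimal sketch, assuming the CR files sit in the current directory: `create_in_order` is a hypothetical helper, and the file names in the usage note are the sample CRs referenced above, standing in for your own.

```bash
#!/usr/bin/env bash
# create_in_order NAMESPACE FILE...
# Hypothetical helper: creates each manifest in the given namespace,
# in argument order, and stops at the first failure.
create_in_order() {
  local ns="$1" f
  shift
  for f in "$@"; do
    kubectl create -n "$ns" -f "$f" || return 1
  done
}
```

For example: `create_in_order edge1 collectdglobal.yaml cpu_collectdplugin_cr.yaml prometheus_collectdplugin_cr.yaml`.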
## Install Visualization package

Default (for custom Grafana dashboards, skip this section)

    cd $DA_WORKING_DIR/visualization
    helm install -n viz . -f values.yaml -f grafana-values.yaml

Custom Grafana dashboards
=========================
1. Place the custom dashboard definition in the folder $DA_WORKING_DIR/visualization/charts/grafana/dashboards.
   An example dashboard definition can be found at $DA_WORKING_DIR/visualization/charts/grafana/dashboards/dashboard1.json
2. Create a configmap.yaml that imports the dashboard.json file created above as config, and copy that configmap.yaml to $DA_WORKING_DIR/visualization/charts/grafana/templates/
   An example ConfigMap can be found at $DA_WORKING_DIR/visualization/charts/grafana/templates/configmap-add-dashboard.yaml
3. Add the custom dashboard configuration to values.yaml or to an overriding values.yaml.
   An example configuration can be found in the "dashboardProviders" section of grafana-values.yaml
4. cd $DA_WORKING_DIR/visualization
5. For a fresh install of the visualization package, run "helm install", e.g.,

       helm install -n viz . -f values.yaml -f grafana-values.yaml

   If the custom dashboard is being added to an already running Grafana, run "helm upgrade" with the release name, e.g.,

       helm upgrade viz . -f values.yaml -f grafana-values.yaml -f ......
#### Verify Visualization package
Check if the visualization pod is up:

    NAME                          READY   STATUS    RESTARTS   AGE
    viz-grafana-78dcffd75-sxnjv   1/1     Running   0          52m

1. Get your 'admin' user password by running:

       kubectl get secret --namespace default viz-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. Get the Grafana URL to visit by running these commands in the same shell:

       export POD_NAME=$(kubectl get pods --namespace default -l "app=grafana,release=viz" -o jsonpath="{.items[0].metadata.name}")
       kubectl --namespace default port-forward $POD_NAME 3000

3. Visit the URL http://localhost:3000 and log in with the username admin and the password from step 1.
#### Configure Grafana Datasources
Using the sample [prometheus_grafanadatasource_cr.yaml](microservices/visualization-operator/examples/grafana/prometheus_grafanadatasource_cr.yaml), configure the GrafanaDataSource CR by running the commands below:

    kubectl create -f [DATASOURCE_NAME1]_grafanadatasource_cr.yaml
    kubectl create -f [DATASOURCE_NAME2]_grafanadatasource_cr.yaml
## Install Minio Model repository
* Prerequisite: A dynamic storage provisioner needs to be enabled, either rook-ceph ($DA_WORKING_DIR/00-init) or an alternative provisioner.

Change to the chart directory:

    cd $DA_WORKING_DIR/minio

Edit values.yaml to set the credentials used to access the Minio UI:

    accessKey: "onapdaas"
    secretKey: "onapsecretdaas"

Then install the chart:

    helm install -n minio . -f values.yaml --namespace=edge1
## Install Messaging platform

The Strimzi-based Kafka operator is currently supported.
Navigate to the ```$DA_WORKING_DIR/messaging/charts/strimzi-kafka-operator``` directory and run the command below:

    helm install . -f values.yaml --name sko --namespace=test

NOTE: Make changes in values.yaml if required.

Once the Strimzi operator is ready, you should see a pod like:

    strimzi-cluster-operator-5cf7648b8c-zgxv7   1/1   Running   0   53m

Once this is done, install the Kafka package like any other helm chart.
Navigate to ```$DA_WORKING_DIR/messaging``` and run:

    helm install --name kafka-cluster charts/kafka/

Once this is done, you should have the following pods up and running:

    kafka-cluster-entity-operator-b6557fc6c-hlnkm   3/3   Running   0   47m
    kafka-cluster-kafka-0                           2/2   Running   0   48m
    kafka-cluster-kafka-1                           2/2   Running   0   48m
    kafka-cluster-kafka-2                           2/2   Running   0   48m
    kafka-cluster-zookeeper-0                       2/2   Running   0   49m
    kafka-cluster-zookeeper-1                       2/2   Running   0   49m
    kafka-cluster-zookeeper-2                       2/2   Running   0   49m

You should see the following services when you run ```kubectl get svc```:

    kafka-cluster-kafka-bootstrap    ClusterIP   10.XX.YY.ZZ   <none>   9091/TCP,9092/TCP,9093/TCP   53m
    kafka-cluster-kafka-brokers      ClusterIP   None          <none>   9091/TCP,9092/TCP,9093/TCP   53m
    kafka-cluster-zookeeper-client   ClusterIP   10.XX.YY.ZZ   <none>   2181/TCP                     55m
    kafka-cluster-zookeeper-nodes    ClusterIP   None          <none>   2181/TCP,2888/TCP,3888/TCP   55m
#### Testing messaging

You can test your Kafka brokers by creating a simple producer and consumer.

Producer:

    kubectl run kafka-producer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list kafka-cluster-kafka-bootstrap:9092 --topic my-topic

Consumer:

    kubectl run kafka-consumer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server kafka-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning
## Install Training Package

#### Install M3DB (Time series Data lake)

Prerequisites:

1. Kubernetes cluster with at least 3 nodes
2. Etcd operator, M3DB operator
3. Nodes labelled with zone and region

The nodes can be labelled as follows:

    ## Default region is us-west1, default zone labels are us-west1-a, us-west1-b, us-west1-c
    ## If this is changed then isolationGroups in training-core/charts/m3db/values.yaml needs to be updated.
    NODES=($(kubectl get nodes --output=jsonpath='{.items..metadata.name}'))

    kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/region=us-west1
    kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/region=us-west1
    kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/region=us-west1

    kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/zone=us-west1-a --overwrite=true
    kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/zone=us-west1-b --overwrite=true
    kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/zone=us-west1-c --overwrite=true
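The six label commands above follow a regular pattern, so they can also be written as a loop. The sketch below does that with a hypothetical `label_m3db_nodes` function. It assumes bash (for arrays), reuses the default region and zone suffixes from the comments above, and labels at most three nodes. It could be called as `label_m3db_nodes "${NODES[@]}"`.

```bash
#!/usr/bin/env bash
# label_m3db_nodes NODE...
# Hypothetical helper: applies the default region label and one zone label
# (us-west1-a, -b, -c) to the first three nodes given.
label_m3db_nodes() {
  local region="us-west1" node i=0
  local -a zones=(a b c)
  for node in "$@"; do
    [ "$i" -lt "${#zones[@]}" ] || break   # ignore nodes beyond the third
    kubectl label "node/$node" "failure-domain.beta.kubernetes.io/region=$region"
    kubectl label "node/$node" "failure-domain.beta.kubernetes.io/zone=$region-${zones[$i]}" --overwrite=true
    i=$((i + 1))
  done
}
```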
Install the M3DB chart and check that the pods come up:

    cd $DA_WORKING_DIR/training-core/charts/m3db
    helm install -n m3db . -f values.yaml --namespace training

    $ kubectl get pods -n training
    NAME                   READY   STATUS    RESTARTS   AGE
    m3db-cluster-rep0-0    1/1     Running   0          103s
    m3db-cluster-rep1-0    1/1     Running   0          83s
    m3db-cluster-rep2-0    1/1     Running   0          62s
    m3db-etcd-sjhgl4xfgc   1/1     Running   0          83s
    m3db-etcd-lfs96hngz6   1/1     Running   0          67s
    m3db-etcd-rmgdkkx4bq   1/1     Running   0          51s
##### Configure remote write from Prometheus to M3DB

    cd $DA_WORKING_DIR/day2_configs/prometheus/

    cat << EOF > add_m3db_remote.yaml
    spec:
      remoteWrite:
      - url: "http://m3coordinator-m3db.training.svc.cluster.local:7201/api/v1/prom/remote/write"
        writeRelabelConfigs:
          - targetLabel: metrics_storage
            replacement: m3db_remote
    EOF

    kubectl patch --namespace=edge1 prometheus cp-prometheus-prometheus -p "$(cat add_m3db_remote.yaml)" --type=merge

Check the Prometheus UI to confirm that the M3DB remote write is enabled.