# Distributed Analytics Framework


## Pre-requisites
| Required   | Version |
|------------|---------|
| Kubernetes | 1.12.3+ |
| Docker CE  | 18.09+  |
| Helm       | >=2.12.1 and <=2.13.1 |
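A quick way to confirm the installed tool versions against the table above (a minimal sketch; the exact output format varies by version):
```bash
# Verify the installed versions before proceeding.
kubectl version --short                         # server version should be 1.12.3 or newer
docker version --format '{{.Server.Version}}'   # should be 18.09 or newer
helm version --short                            # client/server should be between 2.12.1 and 2.13.1
```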
## Download Framework
```bash
git clone https://github.com/onap/demo.git
DA_WORKING_DIR=$PWD/demo/vnfs/DAaaS/deploy
```

## Install Rook-Ceph for Persistent Storage
Note: Although unusual, the Flex volume path on your nodes may differ from the default value. values.yaml has the most common flexvolume path configured. In case of errors related to flexvolume, please refer to https://rook.io/docs/rook/v0.9/flexvolume.html#configuring-the-flexvolume-path to find the appropriate flexvolume path and set it in values.yaml.
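As a starting point, a rough way to discover the flexvolume path configured on a node (a sketch; the kubelet flag may instead be set in a systemd unit or kubelet config file):
```bash
# Look for the --volume-plugin-dir flag passed to the kubelet on each node.
ps -ef | grep kubelet | grep -o 'volume-plugin-dir=[^ ]*'

# If nothing is printed, the kubelet is likely using the default path:
# /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
```
With the flexvolume path set correctly in values.yaml, install the chart: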
```bash
cd $DA_WORKING_DIR/00-init/rook-ceph
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```
Check the status of the pods in the rook-ceph-system and rook-ceph namespaces. Once all pods are in the Ready state, move on to the next section.
23
24 ```bash
25 $ kubectl get pods -n rook-ceph-system
26 NAME                                 READY   STATUS    RESTARTS   AGE
27 rook-ceph-agent-9wszf                1/1     Running   0          121s
28 rook-ceph-agent-xnbt8                1/1     Running   0          121s
29 rook-ceph-operator-bc77d6d75-ltwww   1/1     Running   0          158s
30 rook-discover-bvj65                  1/1     Running   0          133s
31 rook-discover-nbfrp                  1/1     Running   0          133s
32 ```
33 ```bash
34 $ kubectl -n rook-ceph get pod
35 NAME                                   READY   STATUS      RESTARTS   AGE
36 rook-ceph-mgr-a-d9dcf5748-5s9ft        1/1     Running     0          77s
37 rook-ceph-mon-a-7d8f675889-nw5pl       1/1     Running     0          105s
38 rook-ceph-mon-b-856fdd5cb9-5h2qk       1/1     Running     0          94s
39 rook-ceph-mon-c-57545897fc-j576h       1/1     Running     0          85s
40 rook-ceph-osd-0-7cbbbf749f-j8fsd       1/1     Running     0          25s
41 rook-ceph-osd-1-7f67f9646d-44p7v       1/1     Running     0          25s
42 rook-ceph-osd-2-6cd4b776ff-v4d68       1/1     Running     0          25s
43 rook-ceph-osd-prepare-vx2rz            0/2     Completed   0          60s
44 rook-ceph-tools-5bd5cdb949-j68kk       1/1     Running     0          53s
45 ```
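Since later sections (for example, the Minio model repository) rely on dynamic provisioning, it is worth confirming that a storage class was created by the rook-ceph chart (a quick check; the exact storage class name depends on the chart values, rook-ceph-block is a common default):
```bash
# List available storage classes; one backed by ceph.rook.io/block should appear.
kubectl get storageclass
```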

#### Troubleshooting Rook-Ceph installation

If rook was previously installed on your machine, whether successfully or not,
and you are attempting a fresh installation of the rook operator, you may face some issues.
Let's walk through fixing that.
* First, check whether any rook CRDs already exist:
```
kubectl get crds | grep rook
```
If this returns results like:
```
otc@otconap7 /var/lib/rook $  kc get crds | grep rook
cephblockpools.ceph.rook.io         2019-07-19T18:19:05Z
cephclusters.ceph.rook.io           2019-07-19T18:19:05Z
cephfilesystems.ceph.rook.io        2019-07-19T18:19:05Z
cephobjectstores.ceph.rook.io       2019-07-19T18:19:05Z
cephobjectstoreusers.ceph.rook.io   2019-07-19T18:19:05Z
volumes.rook.io                     2019-07-19T18:19:05Z
```
then delete these pre-existing rook CRDs by generating a delete
manifest with the command below and then deleting the resources it lists:
```
helm template -n rook . -f values.yaml > ~/delete.yaml
kubectl delete -f ~/delete.yaml
```

After this, delete the directory below on all the nodes.
```
sudo rm -rf /var/lib/rook/
```
Now attempt the installation again:
```
helm install -n rook . -f values.yaml --namespace=rook-ceph-system
```

## Install Operator package
### Build docker images
#### collectd-operator
```bash
cd $DA_WORKING_DIR/../microservices

## Note: The image tag and repository in the collectd-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/collectd-operator:latest
./build_image.sh collectd-operator $IMAGE_NAME
```
#### visualization-operator
```bash
cd $DA_WORKING_DIR/../microservices/visualization-operator

## Note: The image tag and repository in the visualization-operator helm charts need to match the IMAGE_NAME
IMAGE_NAME=dcr.cluster.local:32644/visualization-operator:latest
./build/build_image.sh $IMAGE_NAME
```
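If your cluster pulls images from the registry referenced above, the built images also need to be pushed there. A minimal sketch, assuming the build scripts tag the images with $IMAGE_NAME and that dcr.cluster.local:32644 is reachable from your docker daemon:
```bash
# Push the freshly built operator images to the in-cluster registry.
docker push dcr.cluster.local:32644/collectd-operator:latest
docker push dcr.cluster.local:32644/visualization-operator:latest
```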

### Install the Operator Package
```bash
cd $DA_WORKING_DIR/operator
helm install -n operator . -f values.yaml --namespace=operator
```
Check the status of the pods in the operator namespace and confirm that the Prometheus operator pods are in the Ready state.
```bash
kubectl get pods -n operator
NAME                                                      READY   STATUS    RESTARTS
m3db-operator-0                                           1/1     Running   0
op-etcd-operator-etcd-backup-operator-6cdc577f7d-ltgsr    1/1     Running   0
op-etcd-operator-etcd-operator-79fd99f8b7-fdc7p           1/1     Running   0
op-etcd-operator-etcd-restore-operator-855f7478bf-r7qxp   1/1     Running   0
op-prometheus-operator-operator-5c9b87965b-wjtw5          1/1     Running   1
op-sparkoperator-6cb4db884c-75rcd                         1/1     Running   0
strimzi-cluster-operator-5bffdd7b85-rlrvj                 1/1     Running   0
```
#### Troubleshooting Operator installation
Sometimes deleting a previously installed Operator package fails to remove all operator pods. To troubleshoot this, follow these steps.

1. Make sure that all other deployments and helm releases are deleted (purged) first. The Operator package is a baseline package for the applications, so trying to delete it while the applications are still running can leave the cluster in an unwanted state (see the sketch after this list).

2. Delete all the resources and CRDs associated with the operator package.
```bash
#NOTE: Use the same release name and namespace as in installation of operator package in the previous section
cd $DA_WORKING_DIR/operator
helm template -n operator . -f values.yaml --namespace=operator > ../delete_operator.yaml
cd ../
kubectl delete -f delete_operator.yaml
```
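For step 1 above, a quick way to see what is still installed and purge the application releases before removing the operator package (a sketch using Helm 2 commands; substitute your actual release names, the ones below are examples from this guide):
```bash
# List all releases known to Tiller.
helm list

# Purge application releases (e.g. the collection and minio packages) before the operator package.
helm delete cp --purge
helm delete minio --purge
```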
## Install Collection package
Note: collectd.conf is available in the $DA_WORKING_DIR/collection/charts/collectd/resources/config directory. Any valid collectd.conf can be placed here.

Default (for custom collectd, skip to the next block):
```bash
cd $DA_WORKING_DIR/collection
helm install -n cp . -f values.yaml --namespace=edge1
```

Custom Collectd:
1. Build the custom collectd image.
2. Set COLLECTD_IMAGE_NAME with the appropriate image_repository:tag.
3. Push the image to the docker registry using `docker push ${COLLECTD_IMAGE_NAME}` (see the sketch below).
4. Edit the values.yaml and change the image repository and tag using
   COLLECTD_IMAGE_NAME appropriately.
5. Place the collectd.conf in
   $DA_WORKING_DIR/collection/charts/collectd/resources
6. Install the collection package:
```bash
cd $DA_WORKING_DIR/collection
helm install -n cp . -f values.yaml --namespace=edge1
```
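A minimal sketch of steps 1-3 above, assuming a Dockerfile for the custom collectd lives in a local directory of your choice (the directory and tag below are placeholders):
```bash
# Build and push a custom collectd image (path and tag are illustrative).
COLLECTD_IMAGE_NAME=dcr.cluster.local:32644/custom-collectd:latest
docker build -t ${COLLECTD_IMAGE_NAME} ./custom-collectd/
docker push ${COLLECTD_IMAGE_NAME}
```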

#### Verify Collection package
* Check if all pods are up in the edge1 namespace.
* Check the Prometheus UI by port-forwarding port 9090 (the default for the Prometheus service); see the port-forward sketch after the sample output below.
```
$ kubectl get pods -n edge1
NAME                                      READY   STATUS    RESTARTS   AGE
cp-cadvisor-8rk2b                       1/1     Running   0          15s
cp-cadvisor-nsjr6                       1/1     Running   0          15s
cp-collectd-h5krd                       1/1     Running   0          23s
cp-collectd-jc9m2                       1/1     Running   0          23s
cp-prometheus-node-exporter-blc6p       1/1     Running   0          17s
cp-prometheus-node-exporter-qbvdx       1/1     Running   0          17s
prometheus-cp-prometheus-prometheus-0   4/4     Running   1          33s

$ kubectl get svc -n edge1
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
cadvisor                        NodePort    10.43.53.122   <none>        80:30091/TCP
collectd                        ClusterIP   10.43.222.34   <none>        9103/TCP
cp13-prometheus-node-exporter   ClusterIP   10.43.17.242   <none>        9100/TCP
cp13-prometheus-prometheus      NodePort    10.43.26.155   <none>        9090:30090/TCP
prometheus-operated             ClusterIP   None           <none>        9090/TCP
```
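A quick way to reach the Prometheus UI locally, assuming the release name cp used above (the Prometheus service name may differ slightly, so compare with your own `kubectl get svc -n edge1` output):
```bash
# Forward local port 9090 to the Prometheus service, then open http://localhost:9090
kubectl port-forward -n edge1 svc/cp-prometheus-prometheus 9090:9090
```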
#### Configure Collectd Plugins
1. Using the sample [collectdglobal.yaml](microservices/collectd-operator/examples/collectd/collectdglobal.yaml), configure the CollectdGlobal CR.
2. If there are additional types.db files to update, copy the additional types.db files to the resources folder.
3. Create a ConfigMap to load the types.db and update the CollectdGlobal CR with the name of the ConfigMap created.
4. Create and configure the required CollectdPlugin CRs. Use these samples as a reference: [cpu_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/cpu_collectdplugin_cr.yaml), [prometheus_collectdplugin_cr.yaml](microservices/collectd-operator/examples/collectd/prometheus_collectdplugin_cr.yaml).
5. Use the same namespace where the collection package was installed.
6. Assuming it is edge1, create the config resources that are applicable. Apply the following commands in the same order.
```bash
# Note:
## 1. Creating the ConfigMap is optional and required only if an additional types.db file needs to be mounted.
## 2. Add/Remove --from-file accordingly. Use the correct file name based on the context.
kubectl create configmap typesdb-configmap -n edge1 --from-file ./resources/[FILE_NAME1] --from-file ./resources/[FILE_NAME2]
kubectl create -n edge1 -f collectdglobal.yaml
kubectl create -n edge1 -f [PLUGIN_NAME1]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME2]_collectdplugin_cr.yaml
kubectl create -n edge1 -f [PLUGIN_NAME3]_collectdplugin_cr.yaml
...
```
#### Configure Grafana Datasources
Using the sample [prometheus_grafanadatasource_cr.yaml](microservices/visualization-operator/examples/grafana/prometheus_grafanadatasource_cr.yaml), configure the GrafanaDataSource CR by running the commands below.
```bash
kubectl create -f [DATASOURCE_NAME1]_grafanadatasource_cr.yaml
kubectl create -f [DATASOURCE_NAME2]_grafanadatasource_cr.yaml
...
```

## Install Minio Model repository
* Prerequisite: A dynamic storage provisioner needs to be enabled, either rook-ceph ($DA_WORKING_DIR/00-init) or an alternate provisioner.

Edit the values.yaml to set the credentials used to access the Minio UI. The defaults are `accessKey: "onapdaas"` and `secretKey: "onapsecretdaas"`.
```bash
cd $DA_WORKING_DIR/minio
helm install -n minio . -f values.yaml --namespace=edge1
```
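To reach the Minio UI from your workstation, a port-forward along these lines can be used (a sketch; the exact service name depends on the chart, so check `kubectl get svc -n edge1`, and 9000 is Minio's usual port):
```bash
# Forward local port 9000 to the Minio service, then open http://localhost:9000
# and log in with the accessKey/secretKey configured in values.yaml.
kubectl port-forward -n edge1 svc/minio 9000:9000
```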

## Install Messaging platform

We currently support the Strimzi-based Kafka operator.
Navigate to the ```$DA_WORKING_DIR/messaging/charts/strimzi-kafka-operator``` directory.
Use the command below:
```
helm install . -f values.yaml  --name sko --namespace=test
```

NOTE: Make changes in the values.yaml if required.

Once the Strimzi operator is ready, you should see a pod like:

```
strimzi-cluster-operator-5cf7648b8c-zgxv7       1/1     Running   0          53m
```

Once this is done, install the Kafka package like any other helm chart.
Navigate to the ```$DA_WORKING_DIR/messaging``` directory and use the command:
```
helm install --name kafka-cluster charts/kafka/
```

Once this is done, you should have the following pods up and running.

```
kafka-cluster-entity-operator-b6557fc6c-hlnkm   3/3     Running   0          47m
kafka-cluster-kafka-0                           2/2     Running   0          48m
kafka-cluster-kafka-1                           2/2     Running   0          48m
kafka-cluster-kafka-2                           2/2     Running   0          48m
kafka-cluster-zookeeper-0                       2/2     Running   0          49m
kafka-cluster-zookeeper-1                       2/2     Running   0          49m
kafka-cluster-zookeeper-2                       2/2     Running   0          49m
```

You should see the following services when you do a ```kubectl get svc```:

```
kafka-cluster-kafka-bootstrap    ClusterIP   10.XX.YY.ZZ   <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-kafka-brokers      ClusterIP   None          <none>        9091/TCP,9092/TCP,9093/TCP   53m
kafka-cluster-zookeeper-client   ClusterIP   10.XX.YY.ZZ   <none>        2181/TCP                     55m
kafka-cluster-zookeeper-nodes    ClusterIP   None          <none>        2181/TCP,2888/TCP,3888/TCP   55m
```
#### Testing messaging

You can test your Kafka brokers by creating a simple producer and consumer.

Producer:
```
kubectl run kafka-producer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list kafka-cluster-kafka-bootstrap:9092 --topic my-topic
```
Consumer:
```
kubectl run kafka-consumer -ti --image=strimzi/kafka:0.12.2-kafka-2.2.1 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server kafka-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning
```

## Install Training Package

#### Install M3DB (Time series Data lake)
##### Pre-requisites
1.  A Kubernetes cluster with at least 3 nodes
2.  Etcd operator, M3DB operator
3.  Nodes labelled with zone and region (see below)

```bash
## Default region is us-west1; default zones are us-west1-a, us-west1-b, us-west1-c.
## If this is changed then isolationGroups in training-core/charts/m3db/values.yaml needs to be updated.
NODES=($(kubectl get nodes --output=jsonpath={.items..metadata.name}))

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/region=us-west1
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/region=us-west1

kubectl label node/${NODES[0]} failure-domain.beta.kubernetes.io/zone=us-west1-a --overwrite=true
kubectl label node/${NODES[1]} failure-domain.beta.kubernetes.io/zone=us-west1-b --overwrite=true
kubectl label node/${NODES[2]} failure-domain.beta.kubernetes.io/zone=us-west1-c --overwrite=true
```
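To confirm the labels were applied before installing M3DB (a quick check):
```bash
# Show the region/zone labels on each node.
kubectl get nodes -L failure-domain.beta.kubernetes.io/region -L failure-domain.beta.kubernetes.io/zone
```
Then install M3DB: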
```bash
cd $DA_WORKING_DIR/training-core/charts/m3db
helm install -n m3db . -f values.yaml --namespace training
```
```
$ kubectl get pods -n training
NAME                   READY   STATUS    RESTARTS   AGE
m3db-cluster-rep0-0    1/1     Running   0          103s
m3db-cluster-rep1-0    1/1     Running   0          83s
m3db-cluster-rep2-0    1/1     Running   0          62s
m3db-etcd-sjhgl4xfgc   1/1     Running   0          83s
m3db-etcd-lfs96hngz6   1/1     Running   0          67s
m3db-etcd-rmgdkkx4bq   1/1     Running   0          51s
```

##### Configure remote write from Prometheus to M3DB
```bash
cd $DA_WORKING_DIR/day2_configs/prometheus/
```
```bash
cat << EOF > add_m3db_remote.yaml
spec:
  remoteWrite:
  - url: "http://m3coordinator-m3db.training.svc.cluster.local:7201/api/v1/prom/remote/write"
    writeRelabelConfigs:
      - targetLabel: metrics_storage
        replacement: m3db_remote
EOF
```
```bash
kubectl patch --namespace=edge1 prometheus cp-prometheus-prometheus -p "$(cat add_m3db_remote.yaml)" --type=merge
```
Verify in the Prometheus GUI that the M3DB remote write is enabled (see the sketch below).
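One way to check, assuming the same port-forward as in the collection section: open the Prometheus UI and look at Status -> Configuration, where a `remote_write` entry pointing at the M3DB coordinator URL above should now appear. A rough command-line equivalent:
```bash
# Port-forward Prometheus, then confirm remote_write appears in the running configuration.
kubectl port-forward -n edge1 svc/cp-prometheus-prometheus 9090:9090 &
curl -s http://localhost:9090/api/v1/status/config | grep remote_write
```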