1 .. This work is licensed under a Creative Commons Attribution 4.0
2 .. International License.
3 .. http://creativecommons.org/licenses/by/4.0
4 .. Copyright (C) 2022 Nordix Foundation
7 .. _Curated applications for Kubernetes: https://github.com/kubernetes/charts
8 .. _Services: https://kubernetes.io/docs/concepts/services-networking/service/
9 .. _ReplicaSet: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
10 .. _StatefulSet: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
11 .. _Helm Documentation: https://docs.helm.sh/helm/
12 .. _Helm: https://docs.helm.sh/
13 .. _Kubernetes: https://Kubernetes.io/
14 .. _Kubernetes LoadBalancer: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
**THIS PAGE NEEDS TO BE REWRITTEN OR UPDATED, AS SOME INFORMATION IS NO LONGER RELEVANT**
The ONAP Operations Manager (OOM) provides the ability to manage the entire
29 life-cycle of an ONAP installation, from the initial deployment to final
30 decommissioning. This guide provides instructions for users of ONAP to
31 use the Kubernetes_/Helm_ system as a complete ONAP management system.
This guide provides many examples of Helm command line operations. For a
complete description of these commands, please refer to the `Helm
Documentation`_.
37 .. figure:: ../../resources/images/oom_logo/oomLogoV2-medium.png
40 The following sections describe the life-cycle operations:
42 - Deploy_ - with built-in component dependency management
43 - Configure_ - unified configuration across all ONAP components
44 - Monitor_ - real-time health monitoring feeding to a Consul UI and Kubernetes
- Heal_ - failed ONAP containers are recreated automatically
46 - Scale_ - cluster ONAP services to enable seamless scaling
47 - Upgrade_ - change-out containers or configuration with little or no service impact
48 - Delete_ - cleanup individual containers or entire deployments
50 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Deploy.png
The OOM team, with assistance from the ONAP project teams, has built a
comprehensive set of Helm charts (YAML files very similar to TOSCA files) that
58 describe the composition of each of the ONAP components and the relationship
59 within and between components. Using this model Helm is able to deploy all of
60 ONAP with a few simple commands.
Please refer to the :ref:`oom_deploy_guide` for deployment prerequisites and options.
Refer to the :ref:`oom_customize_overrides` section on how to update overrides.yaml and values.yaml.
67 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Configure.png
73 Each project within ONAP has its own configuration data generally consisting
74 of: environment variables, configuration files, and database initial values.
75 Many technologies are used across the projects resulting in significant
76 operational complexity and an inability to apply global parameters across the
77 entire ONAP deployment. OOM solves this problem by introducing a common
78 configuration technology, Helm charts, that provide a hierarchical
79 configuration with the ability to override values with higher
80 level charts or command line options.
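
The override hierarchy described above can be pictured as a layered merge. The
following is a minimal sketch, not Helm's actual implementation: the function
names ``deep_merge`` and ``effective_values`` and the sample keys are
hypothetical and chosen only for illustration. It assumes that later layers
(an environment file, then command line options) override earlier ones key by
key.

.. code-block:: python

    import copy


    def deep_merge(base, override):
        """Return a new dict in which values from override win over base."""
        merged = copy.deepcopy(base)
        for key, value in override.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Nested maps are merged key by key rather than replaced.
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged


    def effective_values(*layers):
        """Merge layers in order; the last layer has the highest precedence."""
        result = {}
        for layer in layers:
            result = deep_merge(result, layer)
        return result


    # Hypothetical sample data, lowest to highest precedence.
    chart_defaults = {"so": {"enabled": True, "replicaCount": 1}}
    environment_file = {"so": {"replicaCount": 3}}
    command_line = {"so": {"enabled": False}}

    values = effective_values(chart_defaults, environment_file, command_line)
    print(values)

In this sketch the command line layer is passed last, so its ``enabled`` value
wins, while ``replicaCount`` comes from the environment file.
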
82 The structure of the configuration of ONAP is shown in the following diagram.
83 Note that key/value pairs of a parent will always take precedence over those
of a child. Also note that values set on the command line have the highest
precedence.
92 oValues [label="values.yaml"]
93 demo [label="onap-demo.yaml"]
94 prod [label="onap-production.yaml"]
95 oReq [label="Chart.yaml"]
96 soValues [label="values.yaml"]
97 soReq [label="Chart.yaml"]
98 mdValues [label="values.yaml"]
101 oResources [label="resources"]
105 oResources -> environments
118 The top level onap/values.yaml file contains the values required to be set
before deploying ONAP. Here are the contents of this file:
121 .. collapse:: Default ONAP values.yaml
123 .. include:: ../../../../kubernetes/onap/values.yaml
129 One may wish to create a value file that is specific to a given deployment such
that it can be differentiated from other deployments. For example, an
131 onap-development.yaml file may create a minimal environment for development
132 while onap-production.yaml might describe a production deployment that operates
133 independently of the developer version.
135 For example, if the production OpenStack instance was different from a
136 developer's instance, the onap-production.yaml file may contain a different
137 value for the vnfDeployment/openstack/oam_network_cidr key as shown below.
143 apps: consul msb mso message-router sdnc vid robot portal policy appc aai
144 sdc dcaegen2 log cli multicloud clamp vnfsdk aaf kube2msb
145 dataRootDir: /dockerdata-nfs
147 # docker repositories
149 onap: nexus3.onap.org:10001
152 filebeat: docker.elastic.co
157 # vnf deployment environment
160 ubuntu_14_image: "Ubuntu_14.04.5_LTS"
161 public_net_id: "e8f51956-00dd-4425-af36-045716781ffc"
162 oam_network_id: "d4769dfb-c9e4-4f72-b3d6-1d18f4ac4ee6"
163 oam_subnet_id: "191f7580-acf6-4c2b-8ec0-ba7d99b3bc4e"
164 oam_network_cidr: "192.168.30.0/24"
168 To deploy ONAP with this environment file, enter::
170 > helm deploy local/onap -n onap -f onap/resources/environments/onap-production.yaml --set global.masterPassword=password
173 .. collapse:: Default ONAP values.yaml
175 .. include:: ../../resources/yaml/environments_onap_demo.yaml
When deploying all of ONAP, the dependencies section of the Chart.yaml file
controls which ONAP components are included and at which version.
182 Here is an excerpt of this file:
191 condition: so.enabled
The ~ operator in the `so` version value indicates that the latest "10.X.X"
version of `so` shall be used, which lets the chart pick up minor
196 upgrades that don't impact the so API; hence, version 10.0.1 will be installed
The onap/resources/environments/dev.yaml file (see the excerpt below) allows
fine-grained control over which components are included as part of this
201 deployment. By changing this `so` line to `enabled: false` the `so` component
202 will not be deployed. If this change is part of an upgrade the existing `so`
203 component will be shut down. Other `so` parameters and even `so` child values
204 can be modified, for example the `so`'s `liveness` probe could be disabled
205 (which is not recommended as this change would disable auto-healing of `so`).
209 #################################################################
210 # Global configuration overrides.
212 # These overrides will affect all helm charts (ie. applications)
213 # that are listed below and are 'enabled'.
214 #################################################################
218 #################################################################
219 # Enable/disable and configure helm charts (ie. applications)
220 # to customize the ONAP deployment.
221 #################################################################
225 so: # Service Orchestrator
231 # necessary to disable liveness probe when setting breakpoints
232 # in debugger so K8s doesn't restart unresponsive container
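
The link between a ``condition: so.enabled`` entry in Chart.yaml and an
``enabled: false`` override can be sketched as a lookup of a dotted path in the
merged values. This is a simplified illustration and not Helm's code; the
helper names ``lookup`` and ``is_enabled`` are hypothetical. It assumes that a
component stays enabled when the path is missing or does not hold a boolean.

.. code-block:: python

    def lookup(values, dotted_path):
        """Follow a dotted path such as 'so.enabled' through nested dicts."""
        node = values
        for key in dotted_path.split("."):
            if not isinstance(node, dict) or key not in node:
                return None
            node = node[key]
        return node


    def is_enabled(values, condition):
        """Evaluate a condition string; the first boolean path found decides."""
        for path in condition.split(","):
            found = lookup(values, path.strip())
            if isinstance(found, bool):
                return found
        # No usable path: in this sketch the component defaults to enabled.
        return True


    overrides = {"so": {"enabled": False}}
    print(is_enabled(overrides, "so.enabled"))

With the sample override above, the sketch reports that `so` would not be
deployed.
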
237 Accessing the ONAP Portal using OOM and a Kubernetes Cluster
238 ------------------------------------------------------------
240 The ONAP deployment created by OOM operates in a private IP network that isn't
241 publicly accessible (i.e. OpenStack VMs with private internal network) which
242 blocks access to the ONAP Portal. To enable direct access to this Portal from a
243 user's own environment (a laptop etc.) the portal application's port 8989 is
244 exposed through a `Kubernetes LoadBalancer`_ object.
246 Typically, to be able to access the Kubernetes nodes publicly a public address
247 is assigned. In OpenStack this is a floating IP address.
249 When the `portal-app` chart is deployed a Kubernetes service is created that
250 instantiates a load balancer. The LB chooses the private interface of one of
251 the nodes as in the example below (10.0.0.4 is private to the K8s cluster only).
252 Then to be able to access the portal on port 8989 from outside the K8s &
253 OpenStack environment, the user needs to assign/get the floating IP address that
254 corresponds to the private IP as follows::
256 > kubectl -n onap get services|grep "portal-app"
257 portal-app LoadBalancer 10.43.142.201 10.0.0.4 8989:30215/TCP,8006:30213/TCP,8010:30214/TCP 1d app=portal-app,release=dev
In this example, use the 10.0.0.4 private address as a key to find the
corresponding public address, which in this example is 10.12.6.155. If you're
using OpenStack you'll do the lookup with the Horizon GUI or the OpenStack CLI
263 for your tenant (openstack server list). That IP is then used in your
264 `/etc/hosts` to map the fixed DNS aliases required by the ONAP Portal as shown
267 10.12.6.155 portal.api.simpledemo.onap.org
268 10.12.6.155 vid.api.simpledemo.onap.org
269 10.12.6.155 sdc.api.fe.simpledemo.onap.org
270 10.12.6.155 sdc.workflow.plugin.simpledemo.onap.org
271 10.12.6.155 sdc.dcae.plugin.simpledemo.onap.org
272 10.12.6.155 portal-sdk.simpledemo.onap.org
273 10.12.6.155 policy.api.simpledemo.onap.org
274 10.12.6.155 aai.api.sparky.simpledemo.onap.org
275 10.12.6.155 cli.api.simpledemo.onap.org
276 10.12.6.155 msb.api.discovery.simpledemo.onap.org
277 10.12.6.155 msb.api.simpledemo.onap.org
278 10.12.6.155 clamp.api.simpledemo.onap.org
279 10.12.6.155 so.api.simpledemo.onap.org
280 10.12.6.155 sdc.workflow.plugin.simpledemo.onap.org
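
When the public address changes, the entries above have to be regenerated. The
following is a minimal sketch of one way to do that; the function name
``hosts_entries`` is hypothetical, and the alias list is shortened to three of
the names shown above. Repeated aliases are emitted only once.

.. code-block:: python

    def hosts_entries(public_ip, aliases):
        """Build '/etc/hosts' style lines, skipping repeated aliases."""
        seen = set()
        lines = []
        for alias in aliases:
            if alias not in seen:
                seen.add(alias)
                lines.append(f"{public_ip} {alias}")
        return lines


    aliases = [
        "portal.api.simpledemo.onap.org",
        "vid.api.simpledemo.onap.org",
        "sdc.api.fe.simpledemo.onap.org",
    ]
    print("\n".join(hosts_entries("10.12.6.155", aliases)))

The printed lines can be reviewed and then appended to `/etc/hosts` by hand.
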
Ensure you've disabled any proxy settings in the browser you are using to access
the portal, and then open the new SSL-encrypted URL:
284 ``https://portal.api.simpledemo.onap.org:30225/ONAPPORTAL/login.htm``
When using the HTTPS-based Portal URL, the browser needs to be configured to
accept insecure credentials.
Additionally, when opening an application inside the Portal, the browser
might block the content; in that case, disable the blocking and reload
the page.
Besides the ONAP Portal, the components can deliver additional user interfaces;
please check the component-specific documentation.
299 | Alternatives Considered:
301 - Kubernetes port forwarding was considered but discarded as it would
302 require the end user to run a script that opens up port forwarding tunnels
303 to each of the pods that provides a portal application widget.
305 - Reverting to a VNC server similar to what was deployed in the Amsterdam
306 release was also considered but there were many issues with resolution,
307 lack of volume mount, /etc/hosts dynamic update, file upload that were
308 a tall order to solve in time for the Beijing release.
312 - If you are not using floating IPs in your Kubernetes deployment and
313 directly attaching a public IP address (i.e. by using your public provider
314 network) to your K8S Node VMs' network interface, then the output of
315 'kubectl -n onap get services | grep "portal-app"'
316 will show your public IP instead of the private network's IP. Therefore,
317 you can grab this public IP directly (as compared to trying to find the
318 floating IP first) and map this IP in /etc/hosts.
320 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Monitor.png
326 All highly available systems include at least one facility to monitor the
327 health of components within the system. Such health monitors are often used as
328 inputs to distributed coordination systems (such as etcd, Zookeeper, or Consul)
329 and monitoring systems (such as Nagios or Zabbix). OOM provides two mechanisms
330 to monitor the real-time health of an ONAP deployment:
- a Consul GUI for a human operator or downstream monitoring systems and
  Kubernetes liveness probes that enable automatic healing of failed
  containers, and
- a set of liveness probes which feed into the Kubernetes manager and
  are described in the Heal section.
Within ONAP, Consul is the monitoring system of choice and is deployed by OOM in
two parts:
341 - a three-way, centralized Consul server cluster is deployed as a highly
342 available monitor of all of the ONAP components, and
343 - a number of Consul agents.
345 The Consul server provides a user interface that allows a user to graphically
346 view the current health status of all of the ONAP components for which agents
347 have been created - a sample from the ONAP Integration labs follows:
349 .. figure:: ../../resources/images/consul/consulHealth.png
352 To see the real-time health of a deployment go to: ``http://<kubernetes IP>:30270/ui/``
353 where a GUI much like the following will be found:
If the Consul GUI is not accessible, you can refer to the
`kubectl port-forward <https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/>`_ method to access an application.
359 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Heal.png
365 The ONAP deployment is defined by Helm charts as mentioned earlier. These Helm
366 charts are also used to implement automatic recoverability of ONAP components
367 when individual components fail. Once ONAP is deployed, a "liveness" probe
368 starts checking the health of the components after a specified startup time.
370 Should a liveness probe indicate a failed container it will be terminated and a
371 replacement will be started in its place - containers are ephemeral. Should the
372 deployment specification indicate that there are one or more dependencies to
373 this container or component (for example a dependency on a database) the
374 dependency will be satisfied before the replacement container/component is
375 started. This mechanism ensures that, after a failure, all of the ONAP
376 components restart successfully.
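
The timing of a liveness-triggered restart can be sketched with a small
simulation. This is a simplified model rather than the kubelet's logic: the
function name ``restart_time`` is hypothetical, its parameters mirror the
Kubernetes probe fields ``initialDelaySeconds``, ``periodSeconds`` and
``failureThreshold``, and the default numbers are illustrative, not the values
used by any ONAP chart.

.. code-block:: python

    def restart_time(probe_results, initial_delay=10, period=10,
                     failure_threshold=3):
        """Return the second at which a restart is triggered, or None.

        probe_results is a list of booleans, one per probe, where True
        means the probe succeeded.
        """
        consecutive_failures = 0
        for index, healthy in enumerate(probe_results):
            now = initial_delay + index * period
            if healthy:
                # Any success resets the failure counter.
                consecutive_failures = 0
            else:
                consecutive_failures += 1
                if consecutive_failures >= failure_threshold:
                    return now
        return None


    # One healthy probe followed by three failures in a row.
    print(restart_time([True, False, False, False]))

In this model a single failed probe does not trigger a restart; only an
unbroken run of failures that reaches the threshold does.
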
378 To test healing, the following command can be used to delete a pod::
380 > kubectl delete pod [pod name] -n [pod namespace]
382 One could then use the following command to monitor the pods and observe the
383 pod being terminated and the service being automatically healed with the
384 creation of a replacement pod::
386 > kubectl get pods --all-namespaces -o=wide
388 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Scale.png
394 Many of the ONAP components are horizontally scalable which allows them to
395 adapt to expected offered load. During the Beijing release scaling is static,
396 that is during deployment or upgrade a cluster size is defined and this cluster
397 will be maintained even in the presence of faults. The parameter that controls
398 the cluster size of a given component is found in the values.yaml file for that
399 component. Here is an excerpt that shows this parameter:
403 # default number of instances
406 In order to change the size of a cluster, an operator could use a helm upgrade
407 (described in detail in the next section) as follows::
409 > helm upgrade [RELEASE] [CHART] [flags]
411 The RELEASE argument can be obtained from the following command::
For example::
418 NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
419 dev 1 Wed Oct 14 13:49:52 2020 DEPLOYED onap-11.0.0 Kohn onap
420 dev-cassandra 5 Thu Oct 15 14:45:34 2020 DEPLOYED cassandra-11.0.0 onap
421 dev-contrib 1 Wed Oct 14 13:52:53 2020 DEPLOYED contrib-11.0.0 onap
422 dev-mariadb-galera 1 Wed Oct 14 13:55:56 2020 DEPLOYED mariadb-galera-11.0.0 onap
Here the NAME column shows the release name. In our case we want to try the
scale operation on cassandra, so the release name is dev-cassandra.
427 Now we need to obtain the chart name for cassandra. Use the below
428 command to get the chart name::
430 > helm search cassandra
For example::
434 > helm search cassandra
435 NAME CHART VERSION APP VERSION DESCRIPTION
436 local/cassandra 11.0.0 ONAP cassandra
437 local/portal-cassandra 11.0.0 Portal cassandra
438 local/aaf-cass 11.0.0 ONAP AAF cassandra
439 local/sdc-cs 11.0.0 ONAP Service Design and Creation Cassandra
Here the NAME column shows the chart name. Since we want to try the scale
operation for cassandra, the corresponding chart name is local/cassandra.
Now that we have both of the command's arguments, we can perform the
scale operation for cassandra as follows::
448 > helm upgrade dev-cassandra local/cassandra --set replicaCount=3
450 Using this command we can scale up or scale down the cassandra db instances.
453 The ONAP components use Kubernetes provided facilities to build clustered,
454 highly available systems including: Services_ with load-balancers, ReplicaSet_,
455 and StatefulSet_. Some of the open-source projects used by the ONAP components
456 directly support clustered configurations, for example ODL and MariaDB Galera.
The Kubernetes Services_ abstraction provides a consistent access point for
each of the ONAP components, independent of the pod or container architecture
of that component. For example, SDN-C uses OpenDaylight clustering with a
default cluster size of three but uses a Kubernetes service to abstract this
cluster from the other ONAP components, such that the cluster could change
size and this change is isolated from the other ONAP components by the
load-balancer implemented in the ODL service abstraction.
467 A ReplicaSet_ is a construct that is used to describe the desired state of the
468 cluster. For example 'replicas: 3' indicates to Kubernetes that a cluster of 3
469 instances is the desired state. Should one of the members of the cluster fail,
470 a new member will be automatically started to replace it.
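
The desired-state idea behind 'replicas: 3' can be sketched as a comparison
between the desired and the running instance counts. This is an illustration
only, not the Kubernetes controller; the function name ``reconcile`` is
hypothetical.

.. code-block:: python

    def reconcile(desired_replicas, running_replicas):
        """Return how many instances to start or stop to reach the desired state."""
        difference = desired_replicas - running_replicas
        return {
            "start": max(difference, 0),
            "stop": max(-difference, 0),
        }


    # One member of a three-member cluster has failed.
    print(reconcile(desired_replicas=3, running_replicas=2))

Running the same comparison repeatedly is what keeps the cluster at its
declared size after a failure.
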
Some of the ONAP components may need a more deterministic deployment; for
473 example to enable intra-cluster communication. For these applications the
474 component can be deployed as a Kubernetes StatefulSet_ which will maintain a
475 persistent identifier for the pods and thus a stable network id for the pods.
For example: the pod names might be web-0, web-1, ..., web-{N-1} for N 'web' pods
477 with corresponding DNS entries such that intra service communication is simple
478 even if the pods are physically distributed across multiple nodes. An example
479 of how these capabilities can be used is described in the Running Consul on
482 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Upgrade.png
488 Helm has built-in capabilities to enable the upgrade of pods without causing a
489 loss of the service being provided by that pod or pods (if configured as a
490 cluster). As described in the OOM Developer's Guide, ONAP components provide
491 an abstracted 'service' end point with the pods or containers providing this
492 service hidden from other ONAP components by a load balancer. This capability
493 is used during upgrades to allow a pod with a new image to be added to the
494 service before removing the pod with the old image. This 'make before break'
495 capability ensures minimal downtime.
Prior to doing an upgrade, determine the status of the deployed charts::
500 NAME REVISION UPDATED STATUS CHART NAMESPACE
501 so 1 Mon Feb 5 10:05:22 2020 DEPLOYED so-11.0.0 onap
503 When upgrading a cluster a parameter controls the minimum size of the cluster
504 during the upgrade while another parameter controls the maximum number of nodes
in the cluster. For example, SDNC configured as a 3-way ODL cluster might
506 require that during the upgrade no fewer than 2 pods are available at all times
507 to provide service while no more than 5 pods are ever deployed across the two
508 versions at any one time to avoid depleting the cluster of resources. In this
509 scenario, the SDNC cluster would start with 3 old pods then Kubernetes may add
510 a new pod (3 old, 1 new), delete one old (2 old, 1 new), add two new pods (2
511 old, 3 new) and finally delete the 2 old pods (3 new). During this sequence
512 the constraints of the minimum of two pods and maximum of five would be
513 maintained while providing service the whole time.
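
The constraints in this scenario can be checked with a small simulation. The
sketch below is not how Kubernetes schedules a rolling update (Kubernetes
Deployments express comparable limits through ``maxUnavailable`` and
``maxSurge``); the function name ``rolling_upgrade`` is hypothetical, it moves
one pod at a time, and it assumes every new pod is ready immediately. The
sequence it produces is one valid ordering and differs from the one described
above.

.. code-block:: python

    def rolling_upgrade(desired, min_available, max_total):
        """Return the (old, new) pod counts after each step of an upgrade."""
        old, new = desired, 0
        states = [(old, new)]
        while old > 0 or new < desired:
            if new < desired and old + new < max_total:
                # Room to surge: add a pod running the new version.
                new += 1
            elif old > 0 and old + new - 1 >= min_available:
                # Enough pods remain available: remove an old pod.
                old -= 1
            else:
                raise ValueError("constraints leave no safe step")
            states.append((old, new))
        return states


    # 3-way cluster, at least 2 pods available, at most 5 pods in total.
    for old, new in rolling_upgrade(desired=3, min_available=2, max_total=5):
        print(f"{old} old, {new} new")

Every intermediate state in the output keeps the total number of pods between
the minimum of two and the maximum of five.
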
515 Initiation of an upgrade is triggered by changes in the Helm charts. For
516 example, if the image specified for one of the pods in the SDNC deployment
517 specification were to change (i.e. point to a new Docker image in the nexus3
518 repository - commonly through the change of a deployment variable), the
519 sequence of events described in the previous paragraph would be initiated.
521 For example, to upgrade a container by changing configuration, specifically an
524 > helm upgrade so onap/so --version 11.0.1 --set enableDebug=true
526 Issuing this command will result in the appropriate container being stopped by
527 Kubernetes and replaced with a new container with the new environment value.
529 To upgrade a component to a new version with a new configuration file enter::
531 > helm upgrade so onap/so --version 11.0.1 -f environments/demo.yaml
533 To fetch release history enter::
536 REVISION UPDATED STATUS CHART DESCRIPTION
537 1 Mon Jul 5 10:05:22 2022 SUPERSEDED so-11.0.0 Install complete
538 2 Mon Jul 5 10:10:55 2022 DEPLOYED so-11.0.1 Upgrade complete
540 Unfortunately, not all upgrades are successful. In recognition of this the
541 lineup of pods within an ONAP deployment is tagged such that an administrator
542 may force the ONAP deployment back to the previously tagged configuration or to
543 a specific configuration, say to jump back two steps if an incompatibility
544 between two ONAP components is discovered after the two individual upgrades
547 This rollback functionality gives the administrator confidence that in the
548 unfortunate circumstance of a failed upgrade the system can be rapidly brought
549 back to a known good state. This process of rolling upgrades while under
550 service is illustrated in this short YouTube video showing a Zero Downtime
551 Upgrade of a web application while under a 10 million transaction per second
For example, to roll back to the previous system revision, enter::
559 REVISION UPDATED STATUS CHART DESCRIPTION
560 1 Mon Jul 5 10:05:22 2022 SUPERSEDED so-11.0.0 Install complete
561 2 Mon Jul 5 10:10:55 2022 SUPERSEDED so-11.0.1 Upgrade complete
562 3 Mon Jul 5 10:14:32 2022 DEPLOYED so-11.0.0 Rollback to 1
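
The history above shows that a rollback does not erase revisions; it records a
new revision that reuses the chart of the target revision. A minimal model of
that bookkeeping follows. It is a sketch, not Helm's storage format: the
function name ``rollback`` and the dictionary layout are hypothetical, and all
earlier revisions are simply marked SUPERSEDED.

.. code-block:: python

    def rollback(history, target_revision):
        """Return a new history with a rollback revision appended."""
        target = next(
            (rev for rev in history if rev["revision"] == target_revision),
            None,
        )
        if target is None:
            raise ValueError(f"revision {target_revision} not found")
        # Earlier revisions are kept but no longer marked as deployed.
        updated = [dict(rev, status="SUPERSEDED") for rev in history]
        updated.append({
            "revision": max(rev["revision"] for rev in history) + 1,
            "status": "DEPLOYED",
            "chart": target["chart"],
            "description": f"Rollback to {target_revision}",
        })
        return updated


    history = [
        {"revision": 1, "status": "SUPERSEDED", "chart": "so-11.0.0",
         "description": "Install complete"},
        {"revision": 2, "status": "DEPLOYED", "chart": "so-11.0.1",
         "description": "Upgrade complete"},
    ]
    for rev in rollback(history, 1):
        print(rev["revision"], rev["status"], rev["chart"], rev["description"])

The printed rows correspond to the three revisions listed in the history
output above.
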
566 The description field can be overridden to document actions taken or include
569 Many of the ONAP components contain their own databases which are used to
570 record configuration or state information. The schemas of these databases may
571 change from version to version in such a way that data stored within the
572 database needs to be migrated between versions. If such a migration script is
573 available it can be invoked during the upgrade (or rollback) by Container
574 Lifecycle Hooks. Two such hooks are available, PostStart and PreStop, which
575 containers can access by registering a handler against one or both. Note that
576 it is the responsibility of the ONAP component owners to implement the hook
577 handlers - which could be a shell script or a call to a specific container HTTP
578 endpoint - following the guidelines listed on the Kubernetes site. Lifecycle
579 hooks are not restricted to database migration or even upgrades but can be used
580 anywhere specific operations need to be taken during lifecycle operations.
OOM uses the Helm K8s package manager to deploy ONAP components. Each component is
arranged in a packaging format called a chart, a collection of files that
describe a set of K8s resources. Helm allows for rolling upgrades of the
deployed ONAP components. To upgrade a component's Helm release you will need an
updated Helm chart. The chart might have modified, deleted or added values,
deployment YAML files, and more. To get the release name use::
591 To easily upgrade the release use::
593 > helm upgrade [RELEASE] [CHART]
595 To roll back to a previous release version use::
597 > helm rollback [flags] [RELEASE] [REVISION]
599 For example, to upgrade the onap-so helm release to the latest SO container
602 - Edit so values.yaml which is part of the chart
603 - Change "so: nexus3.onap.org:10001/openecomp/so:v1.1.1" to
604 "so: nexus3.onap.org:10001/openecomp/so:v1.1.2"
605 - From the chart location run::
    > helm upgrade onap-so .
609 The previous so pod will be terminated and a new so pod with an updated so
610 container will be created.
612 .. figure:: ../../resources/images/oom_logo/oomLogoV2-Delete.png
618 Existing deployments can be partially or fully removed once they are no longer
619 needed. To minimize errors it is recommended that before deleting components
620 from a running deployment the operator perform a 'dry-run' to display exactly
621 what will happen with a given command prior to actually deleting anything.
624 > helm undeploy onap --dry-run
626 will display the outcome of deleting the 'onap' release from the
628 To completely delete a release and remove it from the internal store enter::
Once the undeploy is complete, delete the namespace as well
using the following command::
635 > kubectl delete namespace <name of namespace>
You need to provide the namespace name that you used during deployment;
for example::
641 > kubectl delete namespace onap
643 One can also remove individual components from a deployment by changing the
644 ONAP configuration values. For example, to remove `so` from a running
647 > helm undeploy onap-so
649 will remove `so` as the configuration indicates it's no longer part of the
deployment. This might be useful if one wanted to replace just `so` by
651 installing a custom version.