docs/oom_developer_guide.rst

   1 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
   2 .. http://creativecommons.org/licenses/by/4.0
   3 .. Copyright 2018 Amdocs, Bell Canada
   4
   5 .. Links
   6 .. _Helm: https://docs.helm.sh/
   7 .. _Helm Charts: https://github.com/kubernetes/charts
   8 .. _Kubernetes: https://Kubernetes.io/
   9 .. _Docker: https://www.docker.com/
  10 .. _Nexus: https://nexus.onap.org/#welcome
  11 .. _AWS Elastic Block Store: https://aws.amazon.com/ebs/
  12 .. _Azure File: https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction
  13 .. _GCE Persistent Disk: https://cloud.google.com/compute/docs/disks/
  14 .. _Gluster FS: https://www.gluster.org/
  15 .. _Kubernetes Storage Class: https://Kubernetes.io/docs/concepts/storage/storage-classes/
  16 .. _Assigning Pods to Nodes: https://Kubernetes.io/docs/concepts/configuration/assign-pod-node/
  17
  18
  19 .. _developer-guide-label:
  20
  21 OOM Developer Guide
  22 ###################
  23
  24 .. figure:: oomLogoV2-medium.png
  25    :align: right
  26
  27 ONAP consists of a large number of components, each of which is a substantial
  28 project in itself, which results in a high degree of complexity in
  29 deployment and management. To cope with this complexity, the ONAP Operations
  30 Manager (OOM) uses a Helm_ model of ONAP - Helm being the primary management
  31 system for Kubernetes_ container systems - to drive all user-driven life-cycle
  32 management operations. The Helm model of ONAP is composed of a set of
  33 hierarchical Helm charts that define the structure of the ONAP components and
  34 the configuration of these components.  These charts are fully parameterized
  35 such that a single environment file defines all of the parameters needed to
  36 deploy ONAP.  A user of ONAP may maintain several such environment files to
  37 control the deployment of ONAP in multiple environments such as development,
  38 pre-production, and production.
  39
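A minimal sketch of two such environment files is shown here.  The file names
and the values chosen are illustrative assumptions rather than files shipped
with OOM; the keys are the ones used in the examples later in this guide.

.. code-block:: yaml

  # dev-values.yaml (hypothetical): small footprint for development
  ---
  global:
    pullPolicy: Always        # always fetch the newest development images
  component1:
    enabled: true
  component2:
    enabled: false            # not needed for this development setup

.. code-block:: yaml

  # prod-values.yaml (hypothetical): full deployment for production
  ---
  global:
    pullPolicy: IfNotPresent  # reuse images already present on the node
  component1:
    enabled: true
  component2:
    enabled: true
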
  40 The following sections describe how the ONAP Helm charts are constructed.
  41
  42 .. contents::
  43    :depth: 3
  44    :local:
  45 ..
  46
  47 Container Background
  48 ====================
  49 Linux containers allow for an application and all of its operating system
  50 dependencies to be packaged and deployed as a single unit without including a
  51 guest operating system as done with virtual machines. The most popular
  52 container solution is Docker_ which provides tools for container management
  53 like the Docker Host (dockerd) which can create, run, stop, move, or delete a
  54 container. Docker has a very popular registry of container images that can be
  55 used by any Docker system; however, in the ONAP context, Docker images are
  56 built by the standard CI/CD flow and stored in Nexus_ repositories. OOM uses
  57 the "standard" ONAP docker containers and three new ones specifically created
  58 for OOM.
  59
  60 Containers are isolated from each other primarily via namespaces within the
  61 Linux kernel without the need for multiple guest operating systems. As such,
  62 multiple containers can be deployed with so little overhead that all of ONAP
  63 can be deployed on a single host. With some optimization of the ONAP components
  64 (e.g. elimination of redundant database instances) it may be possible to deploy
  65 ONAP on a single laptop computer.
  66
  67 Helm Charts
  68 ===========
  69 A Helm chart is a collection of files that describe a related set of Kubernetes
  70 resources. A simple chart might be used to deploy something simple, like a
  71 memcached pod, while a complex chart might contain many micro-services arranged
  72 in a hierarchy as found in the `aai` ONAP component.
  73
  74 Charts are created as files laid out in a particular directory tree, then they
  75 can be packaged into versioned archives to be deployed. There is a public
  76 archive of `Helm Charts`_ on GitHub that includes many technologies applicable
  77 to ONAP. Some of these charts have been used in ONAP and all of the ONAP charts
  78 have been created following the guidelines provided.
  79
  80 The top level of the ONAP charts is shown below:
  81
  82 .. code-block:: bash
  83
  84   common
  85   ├── cassandra
  86   │   ├── Chart.yaml
  87   │   ├── requirements.yaml
  88   │   ├── resources
  89   │   │   ├── config
  90   │   │   │   └── docker-entrypoint.sh
  91   │   │   ├── exec.py
  92   │   │   └── restore.sh
  93   │   ├── templates
  94   │   │   ├── backup
  95   │   │   │   ├── configmap.yaml
  96   │   │   │   ├── cronjob.yaml
  97   │   │   │   ├── pv.yaml
  98   │   │   │   └── pvc.yaml
  99   │   │   ├── configmap.yaml
 100   │   │   ├── pv.yaml
 101   │   │   ├── service.yaml
 102   │   │   └── statefulset.yaml
 103   │   └── values.yaml
 104   ├── common
 105   │   ├── Chart.yaml
 106   │   ├── templates
 107   │   │   ├── _createPassword.tpl
 108   │   │   ├── _ingress.tpl
 109   │   │   ├── _labels.tpl
 110   │   │   ├── _mariadb.tpl
 111   │   │   ├── _name.tpl
 112   │   │   ├── _namespace.tpl
 113   │   │   ├── _repository.tpl
 114   │   │   ├── _resources.tpl
 115   │   │   ├── _secret.yaml
 116   │   │   ├── _service.tpl
 117   │   │   ├── _storage.tpl
 118   │   │   └── _tplValue.tpl
 119   │   └── values.yaml
 120   ├── ...
 121   └── postgres-legacy
 122       ├── Chart.yaml
 123       ├── requirements.yaml
 124       ├── charts
 125       └── configs
 126
 127 The common section of charts consists of a set of templates that assist with
 128 parameter substitution (`_name.tpl`, `_namespace.tpl` and others) and a set of charts
 129 for components used throughout ONAP.  When other charts use the common components, they are
 130 either instantiated each time or deployed once as a shared instance for several components.
 131
 132 All of the ONAP components have charts that follow the pattern shown below:
 133
 134 .. code-block:: bash
 135
 136   name-of-my-component
 137   ├── Chart.yaml
 138   ├── requirements.yaml
 139   ├── components
 140   │   └── subcomponent-folder
 141   ├── charts
 142   │   └── subchart-folder
 143   ├── resources
 144   │   ├── folder1
 145   │   │   ├── file1
 146   │   │   └── file2
 147   │   └── folder2
 148   │       ├── file3
 149   │       └── folder3
 150   │           └── file4
 151   ├── templates
 152   │   ├── NOTES.txt
 153   │   ├── configmap.yaml
 154   │   ├── deployment.yaml
 155   │   ├── ingress.yaml
 156   │   ├── job.yaml
 157   │   ├── secrets.yaml
 158   │   └── service.yaml
 159   └── values.yaml
 160
 161 Note that the component charts / components may include a hierarchy of sub
 162 components and in themselves can be quite complex.
 163
 164 You can use either the `charts` or the `components` folder for your subcomponents.
 165 Using the `charts` folder means that the subcomponent will always be deployed.
 166
 167 Using the `components` folder means you can choose whether to deploy the subcomponent.
 168
 169 This choice is made in the root `values.yaml`:
 170
 171 .. code-block:: yaml
 172
 173   ---
 174   global:
 175     key: value
 176
 177   component1:
 178     enabled: true
 179   component2:
 180     enabled: true
 181
 182 Then in `requirements.yaml`, you'll use these values:
 183
 184 .. code-block:: yaml
 185
 186   ---
 187   dependencies:
 188     - name: common
 189       version: ~x.y-0
 190       repository: '@local'
 191     - name: component1
 192       version: ~x.y-0
 193       repository: 'file://components/component1'
 194       condition: component1.enabled
 195     - name: component2
 196       version: ~x.y-0
 197       repository: 'file://components/component2'
 198       condition: component2.enabled
 199
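As a sketch of how these values reach the subcharts (this is standard Helm
behaviour; `replicaCount` is used here only as an example key), values placed
under a key named after a subchart are passed to that subchart, while `global`
values are visible to every chart:

.. code-block:: yaml

  ---
  global:
    key: value        # available as .Values.global.key in all charts

  component1:
    enabled: true     # evaluated by `condition: component1.enabled`
    replicaCount: 3   # seen by component1 as .Values.replicaCount

  component2:
    enabled: false    # the component2 subchart is not deployed
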
 200 Configuration of the components varies somewhat from component to component but
 201 generally follows the pattern of one or more `configmap.yaml` files which can
 202 directly provide configuration to the containers in addition to processing
 203 configuration files stored in the `config` directory.  It is the responsibility
 204 of each ONAP component team to update these configuration files when changes
 205 are made to the project containers that impact configuration.
 206
 207 The following section describes how the hierarchical ONAP configuration system
 208 is key to management of such a large system.
 209
 210 Configuration Management
 211 ========================
 212
 213 ONAP is a large system composed of many components - each of which is a complex
 214 system in itself - that needs to be deployed in a number of different
 215 ways.  For example, within a single operator's network there may be R&D
 216 deployments under active development, pre-production versions undergoing system
 217 testing and production systems that are operating live networks.  Each of these
 218 deployments will differ in significant ways, such as the version of the
 219 software images deployed.  In addition, there may be a number of application
 220 specific configuration differences, such as operating system environment
 221 variables.  The following describes how the Helm configuration management
 222 system is used within the OOM project to manage both ONAP infrastructure
 223 configuration and ONAP component configuration.
 224
 225 One of the artifacts that OOM/Kubernetes uses to deploy ONAP components is the
 226 deployment specification, yet another yaml file.  Within these deployment specs
 227 are a number of parameters as shown in the following example:
 228
 229 .. code-block:: yaml
 230
 231   apiVersion: apps/v1
 232   kind: StatefulSet
 233   metadata:
 234     labels:
 235       app.kubernetes.io/name: zookeeper
 236       helm.sh/chart: zookeeper
 237       app.kubernetes.io/component: server
 238       app.kubernetes.io/managed-by: Tiller
 239       app.kubernetes.io/instance: onap-oof
 240     name: onap-oof-zookeeper
 241     namespace: onap
 242   spec:
 243     <...>
 244     replicas: 3
 245     selector:
 246       matchLabels:
 247         app.kubernetes.io/name: zookeeper
 248         app.kubernetes.io/component: server
 249         app.kubernetes.io/instance: onap-oof
 250     serviceName: onap-oof-zookeeper-headless
 251     template:
 252       metadata:
 253         labels:
 254           app.kubernetes.io/name: zookeeper
 255           helm.sh/chart: zookeeper
 256           app.kubernetes.io/component: server
 257           app.kubernetes.io/managed-by: Tiller
 258           app.kubernetes.io/instance: onap-oof
 259       spec:
 260         <...>
 261         affinity:
 262         containers:
 263         - name: zookeeper
 264           <...>
 265           image: gcr.io/google_samples/k8szk:v3
 266           imagePullPolicy: Always
 267           <...>
 268           ports:
 269           - containerPort: 2181
 270             name: client
 271             protocol: TCP
 272           - containerPort: 3888
 273             name: election
 274             protocol: TCP
 275           - containerPort: 2888
 276             name: server
 277             protocol: TCP
 278           <...>
 279
 280 Note that within the statefulset specification, one of the container arguments
 281 is the key/value pair `image: gcr.io/google_samples/k8szk:v3`, which
 282 specifies the version of the zookeeper software to deploy.  Although
 283 statefulset specifications greatly simplify deployment, maintenance of the
 284 specifications themselves becomes problematic as software versions
 285 change over time or as different versions are required for different
 286 deployments.  For example, if the R&D team needs to deploy a newer version of
 287 zookeeper than what is currently used in the production environment, they would
 288 need to clone the statefulset specification and change this value.  Fortunately,
 289 this problem has been solved with the templating capabilities of Helm.
 290
 291 The following example shows how the statefulset specifications are modified to
 292 incorporate Helm templates such that key/value pairs can be defined outside of
 293 the statefulset specifications and passed during instantiation of the component.
 294
 295 .. code-block:: yaml
 296
 297   apiVersion: apps/v1
 298   kind: StatefulSet
 299   metadata:
 300     name: {{ include "common.fullname" . }}
 301     namespace: {{ include "common.namespace" . }}
 302     labels: {{- include "common.labels" . | nindent 4 }}
 303   spec:
 304     replicas: {{ .Values.replicaCount }}
 305     selector:
 306       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 307     # serviceName is only needed for StatefulSet
 308     # put the postfix part only if you have added a postfix to the service name
 309     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 310     <...>
 311     template:
 312       metadata:
 313         labels: {{- include "common.labels" . | nindent 8 }}
 314         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 315         name: {{ include "common.name" . }}
 316       spec:
 317         <...>
 318         containers:
 319           - name: {{ include "common.name" . }}
 320             image: {{ .Values.image }}
 321             imagePullPolicy: {{ .Values.global.pullPolicy | default .Values.pullPolicy }}
 322             ports:
 323             {{- range $index, $port := .Values.service.ports }}
 324               - containerPort: {{ $port.port }}
 325                 name: {{ $port.name }}
 326             {{- end }}
 327             {{- range $index, $port := .Values.service.headlessPorts }}
 328               - containerPort: {{ $port.port }}
 329                 name: {{ $port.name }}
 330             {{- end }}
 331             <...>
 332
 333 This version of the statefulset specification has gone through the process of
 334 templating values that are likely to change between deployments. Note that the
 335 image is now specified as `image: {{ .Values.image }}` instead of the literal
 336 string used previously.  During the deployment phase, Helm (actually the Helm
 337 sub-component Tiller) substitutes the `{{ .. }}` entries with values defined
 338 in a `values.yaml` file.  The content of this file is as follows:
 339
 340 .. code-block:: yaml
 341
 342   <...>
 343   image: gcr.io/google_samples/k8szk:v3
 344   replicaCount: 3
 345   <...>
 346
 347
 348 Within the `values.yaml` file there is an `image` key with the value
 349 `gcr.io/google_samples/k8szk:v3`, which is the same value used in
 350 the non-templated version.  Once all of the substitutions are complete, the
 351 resulting statefulset specification is ready to be used by Kubernetes.
 352
 353 When creating a template, consider the use of default values if appropriate.
 354 Helm templating has built-in support for default values; here is
 355 an example:
 356
 357 .. code-block:: yaml
 358
 359   imagePullSecrets:
 360   - name: "{{ .Values.nsPrefix | default "onap" }}-docker-registry-key"
 361
 362 The pipeline operator ("|") used here hints at the power of Helm templates:
 363 much like on an operating system command line, the pipeline operator allows
 364 over 60 Helm functions to be embedded directly into the template (note that the
 365 Helm template language is a superset of the Go template language).  These
 366 functions include simple string operations like `upper` and more complex flow
 367 control operations like `if/else`.
 368
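To illustrate, the following sketch chains `default`, `upper` and `quote` in a
pipeline and uses `if/else` for flow control.  The `environment` and `debug`
values and the `-example` name suffix are hypothetical and are not part of the
ONAP charts:

.. code-block:: yaml

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: {{ include "common.fullname" . }}-example
  data:
    # falls back to "dev" when .Values.environment is unset, then upper-cases
    # and quotes the result
    ENVIRONMENT: {{ .Values.environment | default "dev" | upper | quote }}
    # flow control: emit one of two entries depending on .Values.debug
    {{- if .Values.debug }}
    LOG_LEVEL: "DEBUG"
    {{- else }}
    LOG_LEVEL: "INFO"
    {{- end }}
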
 369 OOM consists mainly of Helm templates. In order to have consistent deployment of the
 370 different components of ONAP, some rules must be followed.
 371
 372 Templates are provided in order to create Kubernetes resources (Secrets,
 373 Ingress, Services, ...) or parts of Kubernetes resources (names, labels,
 374 resources requests and limits, ...).
 375
 376 Service template
 377 ----------------
 378
 379 In order to create a Service for a component, you have to create a file (with
 380 `service` in the name).
 381 For a normal service, just put the following line:
 382
 383 .. code-block:: yaml
 384
 385   {{ include "common.service" . }}
 386
 387 For a headless service, put the following line:
 388
 389 .. code-block:: yaml
 390
 391   {{ include "common.headlessService" . }}
 392
 393 The configuration of the service is done in the component `values.yaml`:
 394
 395 .. code-block:: yaml
 396
 397   service:
 398    name: NAME-OF-THE-SERVICE
 399    postfix: MY-POSTFIX
 400    type: NodePort
 401    annotations:
 402      someAnnotationsKey: value
 403    ports:
 404    - name: tcp-MyPort
 405      port: 5432
 406      nodePort: 88
 407    - name: http-api
 408      port: 8080
 409      nodePort: 89
 410    - name: https-api
 411      port: 9443
 412      nodePort: 90
 413
 414 The `annotations` and `postfix` keys are optional.
 415 If `service.type` is `NodePort`, then you have to give a `nodePort` value for your
 416 service ports (which is the end of the computed nodePort, see example).
 417
 418 It would render the following Service Resource (for a component named
 419 `name-of-my-component`, with version `x.y.z`, helm deployment name
 420 `my-deployment` and `global.nodePortPrefix` `302`):
 421
 422 .. code-block:: yaml
 423
 424   apiVersion: v1
 425   kind: Service
 426   metadata:
 427     annotations:
 428       someAnnotationsKey: value
 429     name: NAME-OF-THE-SERVICE-MY-POSTFIX
 430     labels:
 431       app.kubernetes.io/name: name-of-my-component
 432       helm.sh/chart: name-of-my-component-x.y.z
 433       app.kubernetes.io/instance: my-deployment-name-of-my-component
 434       app.kubernetes.io/managed-by: Tiller
 435   spec:
 436     ports:
 437       - port: 5432
 438         targetPort: tcp-MyPort
 439         nodePort: 30288
 440       - port: 8080
 441         targetPort: http-api
 442         nodePort: 30289
 443       - port: 9443
 444         targetPort: https-api
 445         nodePort: 30290
 446     selector:
 447       app.kubernetes.io/name: name-of-my-component
 448       app.kubernetes.io/instance:  my-deployment-name-of-my-component
 449     type: NodePort
 450
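The computed `nodePort` values in the rendered Service above follow from the
example inputs: each one is the `global.nodePortPrefix` followed by the
per-port `nodePort` value.  A sketch of the relevant values, using the same
figures as the example:

.. code-block:: yaml

  global:
    nodePortPrefix: 302

  service:
    type: NodePort
    ports:
      - name: tcp-MyPort
        port: 5432
        nodePort: 88    # rendered as nodePort 30288
      - name: http-api
        port: 8080
        nodePort: 89    # rendered as nodePort 30289
      - name: https-api
        port: 9443
        nodePort: 90    # rendered as nodePort 30290

The resulting numbers have to fall inside the NodePort range of the cluster,
which is 30000-32767 by default in Kubernetes.
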
 451 In the deployment or statefulSet file, you need to set the correct labels in order
 452 for the service to match the pods.
 453
 454 Here is an example that ensures they match (for a statefulSet):
 455
 456 .. code-block:: yaml
 457
 458   apiVersion: apps/v1
 459   kind: StatefulSet
 460   metadata:
 461     name: {{ include "common.fullname" . }}
 462     namespace: {{ include "common.namespace" . }}
 463     labels: {{- include "common.labels" . | nindent 4 }}
 464   spec:
 465     selector:
 466       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 467     # serviceName is only needed for StatefulSet
 468     # put the postfix part only if you have added a postfix to the service name
 469     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 470     <...>
 471     template:
 472       metadata:
 473         labels: {{- include "common.labels" . | nindent 8 }}
 474         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 475         name: {{ include "common.name" . }}
 476       spec:
 477        <...>
 478        containers:
 479          - name: {{ include "common.name" . }}
 480            ports:
 481            {{- range $index, $port := .Values.service.ports }}
 482            - containerPort: {{ $port.port }}
 483              name: {{ $port.name }}
 484            {{- end }}
 485            {{- range $index, $port := .Values.service.headlessPorts }}
 486            - containerPort: {{ $port.port }}
 487              name: {{ $port.name }}
 488            {{- end }}
 489            <...>
 490
 491 For a headless service, the configuration is done in the component `values.yaml`:
 492
 493 .. code-block:: yaml
 494
 495   service:
 496    name: NAME-OF-THE-SERVICE
 497    headless:
 498      postfix: NONE
 499      annotations:
 500        anotherAnnotationsKey : value
 501      publishNotReadyAddresses: true
 502    headlessPorts:
 503    - name: tcp-MyPort
 504      port: 5432
 505    - name: http-api
 506      port: 8080
 507    - name: https-api
 508      port: 9443
 509
 510 The `headless.annotations`, `headless.postfix` and
 511 `headless.publishNotReadyAddresses` keys are optional.
 512
 513 If `headless.postfix` is not set, `-headless` is appended to the
 514 service name.
 515
 516 If it is set to `NONE`, no postfix is added.
 517
 518 If it is set to any other value, for example `something`, then `-something` is
 519 appended to the service name.
 520
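The three postfix cases can be summarised in the following sketch, which
applies the rules above to the example service name; the resulting Service name
is given in each comment:

.. code-block:: yaml

  # 1. postfix not set -> Service name: NAME-OF-THE-SERVICE-headless
  service:
    name: NAME-OF-THE-SERVICE
    headless: {}
  ---
  # 2. postfix set to NONE -> Service name: NAME-OF-THE-SERVICE
  service:
    name: NAME-OF-THE-SERVICE
    headless:
      postfix: NONE
  ---
  # 3. postfix set to another value -> Service name: NAME-OF-THE-SERVICE-something
  service:
    name: NAME-OF-THE-SERVICE
    headless:
      postfix: something
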
 521 It would render the following Service Resource (for a component named
 522 `name-of-my-component`, with version `x.y.z`, helm deployment name
 523 `my-deployment` and `global.nodePortPrefix` `302`):
 524
 525 .. code-block:: yaml
 526
 527   apiVersion: v1
 528   kind: Service
 529   metadata:
 530     annotations:
 531       anotherAnnotationsKey: value
 532     name: NAME-OF-THE-SERVICE
 533     labels:
 534       app.kubernetes.io/name: name-of-my-component
 535       helm.sh/chart: name-of-my-component-x.y.z
 536       app.kubernetes.io/instance: my-deployment-name-of-my-component
 537       app.kubernetes.io/managed-by: Tiller
 538   spec:
 539     clusterIP: None
 540     ports:
 541       - port: 5432
 542         targetPort: tcp-MyPort
 544       - port: 8080
 545         targetPort: http-api
 547       - port: 9443
 548         targetPort: https-api
 550     publishNotReadyAddresses: true
 551     selector:
 552       app.kubernetes.io/name: name-of-my-component
 553       app.kubernetes.io/instance:  my-deployment-name-of-my-component
 554     type: ClusterIP
 555
 556 The previous StatefulSet example would also match (except for the `postfix` part,
 557 obviously).
 558
 559 Creating Deployment or StatefulSet
 560 ----------------------------------
 561
 562 Deployment and StatefulSet should use the `apps/v1` API version (introduced in
 563 Kubernetes v1.9).
 564 As seen in the service section, the following parts are mandatory:
 565
 566 .. code-block:: yaml
 567
 568   apiVersion: apps/v1
 569   kind: StatefulSet
 570   metadata:
 571     name: {{ include "common.fullname" . }}
 572     namespace: {{ include "common.namespace" . }}
 573     labels: {{- include "common.labels" . | nindent 4 }}
 574   spec:
 575     selector:
 576       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 577     # serviceName is only needed for StatefulSet
 578     # put the postfix part only if you have added a postfix to the service name
 579     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 580     <...>
 581     template:
 582       metadata:
 583         labels: {{- include "common.labels" . | nindent 8 }}
 584         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 585         name: {{ include "common.name" . }}
 586       spec:
 587         <...>
 588         containers:
 589           - name: {{ include "common.name" . }}
 590
 591 ONAP Application Configuration
 592 ------------------------------
 593
 594 Dependency Management
 595 ---------------------
 596 These Helm charts describe the desired state
 597 of an ONAP deployment and instruct the Kubernetes container manager as to how
 598 to maintain the deployment in this state.  The dependencies declared in the charts
 599 dictate the order in which the containers are started for the first time, so that
 600 these dependencies are always met without arbitrary sleep times between container
 601 startups.  For example, the SDC back-end container requires the Elastic-Search,
 602 Cassandra and Kibana containers within SDC to be ready and is also dependent on
 603 DMaaP (or the message-router) to be ready - where ready implies the built-in
 604 "readiness" probes succeeded - before becoming fully operational.  When an
 605 initial deployment of ONAP is requested, the current state of the system is NULL,
 606 so ONAP is deployed by the Kubernetes manager as a set of Docker containers on
 607 one or more predetermined hosts.  The hosts could be physical machines or
 608 virtual machines.  When deploying on virtual machines, the resulting system will
 609 be very similar to "Heat" based deployments, i.e. Docker containers running
 610 within a set of VMs, the primary difference being that the allocation of
 611 containers to VMs is done dynamically with OOM and statically with "Heat".
 612 The following SO deployment descriptor file shows SO's dependency on its mariadb
 613 database component:
 614
 615 SO deployment specification excerpt:
 616
 617 .. code-block:: yaml
 618
 619   apiVersion: apps/v1
 620   kind: Deployment
 621   metadata:
 622     name: {{ include "common.fullname" . }}
 623     namespace: {{ include "common.namespace" . }}
 624     labels: {{- include "common.labels" . | nindent 4 }}
 625   spec:
 626     replicas: {{ .Values.replicaCount }}
 627     selector:
 628       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 629     template:
 630       metadata:
 631         labels:
 632           app: {{ include "common.name" . }}
 633           release: {{ .Release.Name }}
 634       spec:
 635         initContainers:
 636         - command:
 637           - /root/ready.py
 638           args:
 639           - --container-name
 640           - so-mariadb
 641           env:
 642   ...
 643
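The readiness check in the init container relies on the "readiness" probe
declared by the container being waited for.  As a generic Kubernetes sketch
(not taken from the SO charts; the port number and timing values are
assumptions), such a probe could be declared as follows:

.. code-block:: yaml

  containers:
    - name: so-mariadb
      ports:
        - containerPort: 3306
      # the container is reported as ready once a TCP connection succeeds
      readinessProbe:
        tcpSocket:
          port: 3306
        initialDelaySeconds: 10   # wait before the first check
        periodSeconds: 10         # interval between checks
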
 644 Kubernetes Container Orchestration
 645 ==================================
 646 The ONAP components are managed by the Kubernetes_ container management system
 647 which maintains the desired state of the container system as described by one
 648 or more deployment descriptors - similar in concept to OpenStack HEAT
 649 Orchestration Templates. The following sections describe the fundamental
 650 objects managed by Kubernetes, the network these components use to communicate
 651 with each other and other entities outside of ONAP and the templates that
 652 describe the configuration and desired state of the ONAP components.
 653
 654 Name Spaces
 655 -----------
 656 Within the namespaces are Kubernetes services that provide external
 657 connectivity to pods that host Docker containers.
 658
 659 ONAP Components to Kubernetes Object Relationships
 660 --------------------------------------------------
 661 Kubernetes deployments consist of multiple objects:
 662
 663 - **nodes** - a worker machine - either physical or virtual - that hosts
 664   multiple containers managed by Kubernetes.
 665 - **services** - an abstraction of a logical set of pods that provide a
 666   micro-service.
 667 - **pods** - one or more (but typically one) container(s) that provide specific
 668   application functionality.
 669 - **persistent volumes** - One or more permanent volumes need to be established
 670   to hold non-ephemeral configuration and state data.
 671
 672 The relationship between these objects is shown in the following figure:
 673
 674 .. .. uml::
 675 ..
 676 ..   @startuml
 677 ..   node PH {
 678 ..      component Service {
 679 ..         component Pod0
 680 ..         component Pod1
 681 ..      }
 682 ..   }
 683 ..
 684 ..   database PV
 685 ..   @enduml
 686
 687 .. figure:: kubernetes_objects.png
 688
 689 OOM uses these Kubernetes objects as described in the following sections.
 690
 691 Nodes
 692 ~~~~~
 693 OOM works with both physical and virtual worker machines.
 694
 695 * Virtual Machine Deployments - If ONAP is to be deployed onto a set of virtual
 696   machines, the creation of the VMs is outside of the scope of OOM and could be
 697   done in many ways, such as
 698
 699   * manually, for example by a user using the OpenStack Horizon dashboard or
 700     AWS EC2, or
 701   * automatically, for example with the use of an OpenStack Heat Orchestration
 702     Template which builds an ONAP stack, Azure ARM template, AWS CloudFormation
 703     Template, or
 704   * orchestrated, for example with Cloudify creating the VMs from a TOSCA
 705     template and controlling their life cycle for the life of the ONAP
 706     deployment.
 707
 708 * Physical Machine Deployments - If ONAP is to be deployed onto physical
 709   machines there are several options but the recommendation is to use Rancher
 710   along with Helm to associate hosts with a Kubernetes cluster.
 711
 712 Pods
 713 ~~~~
 714 A group of containers with shared storage and networking can be grouped
 715 together into a Kubernetes pod.  All of the containers within a pod are
 716 co-located and co-scheduled so they operate as a single unit.  Within the ONAP
 717 Amsterdam release, pods are mapped one-to-one to docker containers although
 718 this may change in the future.  As explained in the Services section below the
 719 use of Pods within each ONAP component is abstracted from other ONAP
 720 components.
 721
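For illustration only, a generic Kubernetes pod with two containers sharing a
volume might be declared as in the following sketch.  All names and images are
hypothetical; as noted above, ONAP pods in the Amsterdam release contain a
single container:

.. code-block:: yaml

  apiVersion: v1
  kind: Pod
  metadata:
    name: example-pod
  spec:
    volumes:
      - name: shared-data       # storage shared by both containers
        emptyDir: {}
    containers:
      - name: app
        image: example/app:1.0
        volumeMounts:
          - name: shared-data
            mountPath: /data
      - name: sidecar
        image: example/sidecar:1.0
        volumeMounts:
          - name: shared-data
            mountPath: /data
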
 722 Services
 723 ~~~~~~~~
 724 OOM uses the Kubernetes service abstraction to provide a consistent access
 725 point for each of the ONAP components independent of the pod or container
 726 architecture of that component.  For example, the SDNC component may introduce
 727 OpenDaylight clustering at some point and change the number of pods in this
 728 component to three or more but this change will be isolated from the other ONAP
 729 components by the service abstraction.  A service can include a load balancer
 730 on its ingress to distribute traffic between the pods and even react to dynamic
 731 changes in the number of pods if they are part of a replica set.
 732
 733 Persistent Volumes
 734 ~~~~~~~~~~~~~~~~~~
 735 To enable ONAP to be deployed into a wide variety of cloud infrastructures a
 736 flexible persistent storage architecture, built on Kubernetes persistent
 737 volumes, provides the ability to define the physical storage in a central
 738 location and have all ONAP components securely store their data.
 739
 740 When deploying ONAP into a public cloud, available storage services such as
 741 `AWS Elastic Block Store`_, `Azure File`_, or `GCE Persistent Disk`_ are
 742 options.  Alternatively, when deploying into a private cloud the storage
 743 architecture might consist of Fiber Channel, `Gluster FS`_, or iSCSI. Many
 744 other storage options exist; refer to the `Kubernetes Storage Class`_
 745 documentation for a full list of the options. The storage architecture may vary
 746 from deployment to deployment but in all cases a reliable, redundant storage
 747 system must be provided to ONAP with which the state information of all ONAP
 748 components will be securely stored. The Storage Class for a given deployment is
 749 a single parameter listed in the ONAP values.yaml file and therefore is easily
 750 customized. Operation of this storage system is outside the scope of OOM.
 751
 752 .. code-block:: yaml
 753
 754   Insert values.yaml code block with storage block here
 755
 756 Once the storage class is selected and the physical storage is provided, the
 757 ONAP deployment step creates a pool of persistent volumes within the given
 758 physical storage that is used by all of the ONAP components. ONAP components
 759 simply make a claim on these persistent volumes (PV), with a persistent volume
 760 claim (PVC), to gain access to their storage.
 761
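As a generic Kubernetes sketch of this PV / PVC pairing (not an excerpt from
the ONAP charts; the names, size, storage class and host path are all
assumptions), a persistent volume and a matching claim could look like this:

.. code-block:: yaml

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: example-pv
  spec:
    capacity:
      storage: 2Gi
    accessModes:
      - ReadWriteOnce
    storageClassName: example-storage-class
    hostPath:
      path: /example/data
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-pvc
  spec:
    accessModes:
      - ReadWriteOnce
    # must match the storage class of the volume to be claimed
    storageClassName: example-storage-class
    resources:
      requests:
        storage: 2Gi
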
 762 The following figure illustrates the relationships between the persistent
 763 volume claims, the persistent volumes, the storage class, and the physical
 764 storage.
 765
 766 .. graphviz::
 767
 768    digraph PV {
 769       label = "Persistent Volume Claim to Physical Storage Mapping"
 770       {
 771          node [shape=cylinder]
 772          D0 [label="Drive0"]
 773          D1 [label="Drive1"]
 774          Dx [label="Drivex"]
 775       }
 776       {
 777          node [shape=Mrecord label="StorageClass:ceph"]
 778          sc
 779       }
 780       {
 781          node [shape=point]
 782          p0 p1 p2
 783          p3 p4 p5
 784       }
 785       subgraph clusterSDC {
 786          label="SDC"
 787          PVC0
 788          PVC1
 789       }
 790       subgraph clusterSDNC {
 791          label="SDNC"
 792          PVC2
 793       }
 794       subgraph clusterSO {
 795          label="SO"
 796          PVCn
 797       }
 798       PV0 -> sc
 799       PV1 -> sc
 800       PV2 -> sc
 801       PVn -> sc
 802
 803       sc -> {D0 D1 Dx}
 804       PVC0 -> PV0
 805       PVC1 -> PV1
 806       PVC2 -> PV2
 807       PVCn -> PVn
 808
 809       # force all of these nodes to the same line in the given order
 810       subgraph {
 811          rank = same; PV0;PV1;PV2;PVn;p0;p1;p2
 812          PV0->PV1->PV2->p0->p1->p2->PVn [style=invis]
 813       }
 814
 815       subgraph {
 816          rank = same; D0;D1;Dx;p3;p4;p5
 817          D0->D1->p3->p4->p5->Dx [style=invis]
 818       }
 819
 820    }
 821
 822 In order for an ONAP component to use a persistent volume it must make a claim
 823 against a specific persistent volume defined in the ONAP common charts.  Note
 824 that there is a one-to-one relationship between a PVC and PV.  The following is
 825 an excerpt from a component chart that defines a PVC:
 826
 827 .. code-block:: yaml
 828
 829   Insert PVC example here
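
     For illustration, a generic Kubernetes persistent volume claim has the
     shape sketched here. The claim name, requested size, and storage class
     name are hypothetical and do not come from an ONAP component chart:

     .. code-block:: yaml

       # Hypothetical PVC; all names and sizes are assumptions for this sketch
       apiVersion: v1
       kind: PersistentVolumeClaim
       metadata:
         name: example-component-data
       spec:
         accessModes:
           - ReadWriteMany
         resources:
           requests:
             storage: 2Gi
         storageClassName: example-onap-storage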
 830
 831 OOM Networking with Kubernetes
 832 ------------------------------
 833
 834 - DNS
 835 - Ports - Flattening the containers also exposes port conflicts between the
 836   containers which need to be resolved.
 837
 838 Node Ports
 839 ~~~~~~~~~~
 840
 841 Pod Placement Rules
 842 -------------------
 843 OOM will use the rich set of Kubernetes node and pod affinity /
 844 anti-affinity rules to minimize the chance of a single failure resulting in a
 845 loss of ONAP service. Node affinity / anti-affinity is used to guide the
 846 Kubernetes orchestrator in the placement of pods on nodes (physical or virtual
 847 machines).  For example:
 848
 849 - if a container uses Intel DPDK technology the pod may state that it has
 850   affinity to an Intel processor based node, or
 851 - geographical based node labels (such as the Kubernetes standard zone or
 852   region labels) may be used to ensure placement of a DCAE complex close to the
 853   VNFs generating high volumes of traffic thus minimizing networking cost.
 854   Specifically, if nodes were pre-assigned labels East and West, the pod
 855   deployment spec to distribute pods to these nodes would be:
 856
 857 .. code-block:: yaml
 858
 859   nodeSelector:
 860     failure-domain.beta.kubernetes.io/region: {{ .Values.location }}
 861
 862 - "location: West" is specified in the `values.yaml` file used to deploy
 863   one DCAE cluster and  "location: East" is specified in a second `values.yaml`
 864   file (see OOM Configuration Management for more information about
 865   configuration files like the `values.yaml` file).
 866
 867 Node affinity can also be used to achieve geographic redundancy if pods are
 868 assigned to multiple failure domains. For more information refer to `Assigning
 869 Pods to Nodes`_.
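
     A minimal sketch of a node affinity rule that restricts a pod to a set of
     failure domains is given here. The pod name and the zone values `zone-a`
     and `zone-b` are assumptions for this example; the rule only limits where
     the pod may run and does not by itself spread replicas across the zones:

     .. code-block:: yaml

       # Hypothetical pod pinned to two zones; zone names are illustrative
       apiVersion: v1
       kind: Pod
       metadata:
         name: example-zone-pinned
       spec:
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: failure-domain.beta.kubernetes.io/zone
                   operator: In
                   values:
                   - zone-a
                   - zone-b
         containers:
         - name: example
           image: gcr.io/google_containers/pause:2.0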
 870
 871 .. note::
 872    One could use Pod to Node assignment to totally constrain Kubernetes when
 873    doing initial container assignment to replicate the Amsterdam release
 874    OpenStack Heat based deployment. Should one wish to do this, each VM would
 875    need a unique node name which would be used to specify a node constraint
 876    for every component.  These assignments could be specified in an environment
 877    specific values.yaml file. Constraining Kubernetes in this way is not
 878    recommended.
 879
 880 Kubernetes has a comprehensive system called Taints and Tolerations that can be
 881 used to force the container orchestrator to repel pods from nodes based on
 882 static events (an administrator assigning a taint to a node) or dynamic events
 883 (such as a node becoming unreachable or running out of disk space). There are
 884 no plans to use taints or tolerations in the ONAP Beijing release.

 885 Pod affinity / anti-affinity is the concept of creating a spatial relationship
 886 between pods when the Kubernetes orchestrator does assignment (both initially
 887 and in operation) to nodes as explained in Inter-pod affinity and anti-affinity.
 888 For example, one might choose to co-locate all of the ONAP SDC containers on a
 889 single node as they are not critical runtime components and co-location
 890 minimizes overhead. On the other hand, one might choose to ensure that all of
 891 the containers in an ODL cluster (SDNC and APPC) are placed on separate nodes
 892 such that a node failure has minimal impact to the operation of the cluster.
 893 An example of pod affinity / anti-affinity is shown below:
 894
 895 Pod Affinity / Anti-Affinity
 896
 897 .. code-block:: yaml
 898
 899   apiVersion: v1
 900   kind: Pod
 901   metadata:
 902     name: with-pod-affinity
 903   spec:
 904     affinity:
 905       podAffinity:
 906         requiredDuringSchedulingIgnoredDuringExecution:
 907         - labelSelector:
 908             matchExpressions:
 909             - key: security
 910               operator: In
 911               values:
 912               - S1
 913           topologyKey: failure-domain.beta.kubernetes.io/zone
 914       podAntiAffinity:
 915         preferredDuringSchedulingIgnoredDuringExecution:
 916         - weight: 100
 917           podAffinityTerm:
 918             labelSelector:
 919               matchExpressions:
 920               - key: security
 921                 operator: In
 922                 values:
 923                 - S2
 924             topologyKey: kubernetes.io/hostname
 925     containers:
 926     - name: with-pod-affinity
 927       image: gcr.io/google_containers/pause:2.0
 928
 929 This example contains both podAffinity and podAntiAffinity rules. The first
 930 rule is a must (requiredDuringSchedulingIgnoredDuringExecution) while the
 931 second will be met pending other considerations
 932 (preferredDuringSchedulingIgnoredDuringExecution).

 933 Preemption is another feature that may assist in achieving a repeatable
 934 deployment in the presence of faults that may have reduced the capacity of the
 935 cloud: containers are assigned priorities such that mission critical components
 936 have the ability to evict less critical components.  Kubernetes provides this
 937 capability with Pod Priority and Preemption.  Prior to having more advanced
 938 production grade features available, the ability to at least be able to
 939 re-deploy ONAP (or a subset of) reliably provides a level of confidence that
 940 should an outage occur the system can be brought back on-line predictably.
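
     As a sketch only, Pod Priority and Preemption is configured by defining a
     priority class and referencing it from a pod. The names `example-critical`
     and `example-critical-pod` and the priority value are assumptions for this
     example, and the `apiVersion` of the priority class depends on the
     Kubernetes release in use:

     .. code-block:: yaml

       # Hypothetical priority class; name and value are illustrative only
       apiVersion: scheduling.k8s.io/v1
       kind: PriorityClass
       metadata:
         name: example-critical
       value: 1000000
       globalDefault: false
       description: "Illustrative class for mission critical components"
       ---
       # Pods that reference the class may preempt lower priority pods
       apiVersion: v1
       kind: Pod
       metadata:
         name: example-critical-pod
       spec:
         priorityClassName: example-critical
         containers:
         - name: example
           image: gcr.io/google_containers/pause:2.0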
 941
 942 Health Checks
 943 -------------
 944
 945 Monitoring of ONAP components is configured in the agents within JSON files and
 946 stored in Gerrit under the consul-agent-config. Here is an example from the AAI
 947 model loader (aai-model-loader-health.json):
 948
 949 .. code-block:: json
 950
 951   {
 952     "service": {
 953       "name": "A&AI Model Loader",
 954       "checks": [
 955         {
 956           "id": "model-loader-process",
 957           "name": "Model Loader Presence",
 958           "script": "/consul/config/scripts/model-loader-script.sh",
 959           "interval": "15s",
 960           "timeout": "1s"
 961         }
 962       ]
 963     }
 964   }
 965
 966 Liveness Probes
 967 ---------------
 968
 969 These liveness probes can simply check that a port is available, that a
 970 built-in health check is reporting good health, or that the Consul health check
 971 is positive.  For example, the following liveness probe used to monitor the
 972 SDNC component can be found in the SDNC DB deployment specification:
 973
 974 .. code-block:: yaml
 975
 976   # sdnc db liveness probe
 977   livenessProbe:
 978     exec:
 979       command: ["mysqladmin", "ping"]
 980     initialDelaySeconds: 30
 981     periodSeconds: 10
 982     timeoutSeconds: 5
 983
 984 'initialDelaySeconds' controls the delay between the container starting and the
 985 first liveness probe. 'periodSeconds' and 'timeoutSeconds' control the actual
 986 operation of the probe.  Note that containers are inherently ephemeral so the
 987 healing action destroys failed containers and any state information within
 988 them.  To avoid a loss of state, a persistent volume should be used to store
 989 all data that needs to be persisted over the re-creation of a container.
 990 Persistent volumes have been created for the database components of each of
 991 the projects and the same technique can be used for all persistent state
 992 information.
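
     The other probe styles mentioned in this section, a port check and a
     built-in health check, might be sketched as shown here. The port `8080`
     and the path `/healthcheck` are hypothetical values for an imagined
     component and are not taken from an ONAP chart:

     .. code-block:: yaml

       # Readiness: the pod receives traffic once the port accepts connections
       readinessProbe:
         tcpSocket:
           port: 8080
         initialDelaySeconds: 10
         periodSeconds: 10
       # Liveness: the container is restarted if the health endpoint fails
       livenessProbe:
         httpGet:
           path: /healthcheck
           port: 8080
         initialDelaySeconds: 30
         periodSeconds: 10
         timeoutSeconds: 5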
 993
 994
 995
 996 Environment Files
 997 ~~~~~~~~~~~~~~~~~
 998
 999 MSB Integration
1000 ===============
1001
1002 The \ `Microservices Bus
1003 Project <https://wiki.onap.org/pages/viewpage.action?pageId=3246982>`__ provides
1004 facilities to integrate micro-services into ONAP and therefore needs to
1005 integrate into OOM - primarily through Consul which is the backend of
1006 MSB service discovery. The following is a brief description of how this
1007 integration will be done:
1008
1009 A registrator pushes the service endpoint info to MSB service discovery as
1010 follows:
1011
1012 -  The needed service endpoint info is put into the Kubernetes YAML file
1013    as an annotation, including service name, protocol, version, visual
1014    range, LB method, IP, port, etc.
1015
1016 -  OOM deploys/starts/restarts/scales in/scales out/upgrades ONAP components
1017
1018 -  Registrator watches the Kubernetes events
1019
1020 -  When an ONAP component instance has been started/destroyed by OOM,
1021    Registrator gets the notification from Kubernetes
1022
1023 -  Registrator parses the service endpoint info from the annotation and
1024    registers/updates/unregisters it with MSB service discovery
1025
1026 -  MSB API Gateway uses the service endpoint info for service routing
1027    and load balancing.
1028
1029 Details of the registration service API can be found at \ `Microservice
1030 Bus API
1031 Documentation <https://wiki.onap.org/display/DW/Microservice+Bus+API+Documentation>`__.
1032
1033 ONAP Component Registration to MSB
1034 ----------------------------------
1035 The charts of all ONAP components intending to register against MSB must have
1036 an annotation in their service(s) template.  An `sdc` example follows:
1037
1038 .. code-block:: yaml
1039
1040   apiVersion: v1
1041   kind: Service
1042   metadata:
1043     labels:
1044       app: sdc-be
1045     name: sdc-be
1046     namespace: "{{ .Values.nsPrefix }}"
1047     annotations:
1048       msb.onap.org/service-info: '[
1049         {
1050             "serviceName": "sdc",
1051             "version": "v1",
1052             "url": "/sdc/v1",
1053             "protocol": "REST",
1054             "port": "8080",
1055             "visualRange":"1"
1056         },
1057         {
1058             "serviceName": "sdc-deprecated",
1059             "version": "v1",
1060             "url": "/sdc/v1",
1061             "protocol": "REST",
1062             "port": "8080",
1063             "visualRange":"1",
1064             "path":"/sdc/v1"
1065         }
1066         ]'
1067   ...
1068
1069
1070 MSB Integration with OOM
1071 ------------------------
1072 A preliminary view of the OOM-MSB integration is as follows:
1073
1074 .. figure:: MSB-OOM-Diagram.png
1075
1076 A message sequence chart of the registration process:
1077
1078 .. uml::
1079
1080   participant "OOM" as oom
1081   participant "ONAP Component" as onap
1082   participant "Service Discovery" as sd
1083   participant "External API Gateway" as eagw
1084   participant "Router (Internal API Gateway)" as iagw
1085
1086   box "MSB" #LightBlue
1087     participant sd
1088     participant eagw
1089     participant iagw
1090   end box
1091
1092   == Deploy Service ==
1093
1094   oom -> onap: Deploy
1095   oom -> sd:   Register service endpoints
1096   sd -> eagw:  Services exposed to external system
1097   sd -> iagw:  Services for internal use
1098
1099   == Component Life-cycle Management ==
1100
1101   oom -> onap: Start/Stop/Scale/Migrate/Upgrade
1102   oom -> sd:   Update service info
1103   sd -> eagw:  Update service info
1104   sd -> iagw:  Update service info
1105
1106   == Service Health Check ==
1107
1108   sd -> onap: Check the health of service
1109   sd -> eagw: Update service status
1110   sd -> iagw: Update service status
1111
1112
1113 MSB Deployment Instructions
1114 ---------------------------
1115 MSB is a Helm installable ONAP component which is often automatically deployed.
1116 To install it individually enter::
1117
1118   > helm install <repo-name>/msb
1119
1120 .. note::
1121   TBD: Validate if the following procedure is still required.
1122
1123 Please note that the Kubernetes authentication token must be set at
1124 *kubernetes/kube2msb/values.yaml* so the kube2msb registrator can get
1125 access to watch the Kubernetes events and get service annotations via the
1126 Kubernetes APIs. The token can be found in the kubectl configuration file
1127 *~/.kube/config*
1128
1129 More details can be found here `MSB installation <http://onap.readthedocs.io/en/latest/submodules/msb/apigateway.git/docs/platform/installation.html>`__.
1130
1131 .. MISC
1132 .. ====
1133 .. Note that although OOM uses Kubernetes facilities to minimize the effort
1134 .. required of the ONAP component owners to implement a successful rolling
1135 .. upgrade strategy there are other considerations that must be taken into
1136 .. consideration.
1137 .. For example, external APIs - both internal and external to ONAP - should be
1138 .. designed to gracefully accept transactions from a peer at a different
1139 .. software version to avoid deadlock situations. Embedded version codes in
1140 .. messages may facilitate such capabilities.
1141 ..
1142 .. Within each of the projects a new configuration repository contains all of
1143 .. the project specific configuration artifacts.  As changes are made within
1144 .. the project, it's the responsibility of the project team to make appropriate
1145 .. changes to the configuration data.