docs/oom_developer_guide.rst

.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. Copyright 2018 Amdocs, Bell Canada

.. Links
.. _Helm: https://docs.helm.sh/
.. _Helm Charts: https://github.com/kubernetes/charts
.. _Kubernetes: https://Kubernetes.io/
.. _Docker: https://www.docker.com/
.. _Nexus: https://nexus.onap.org/#welcome
.. _AWS Elastic Block Store: https://aws.amazon.com/ebs/
.. _Azure File: https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction
.. _GCE Persistent Disk: https://cloud.google.com/compute/docs/disks/
.. _Gluster FS: https://www.gluster.org/
.. _Kubernetes Storage Class: https://Kubernetes.io/docs/concepts/storage/storage-classes/
.. _Assigning Pods to Nodes: https://Kubernetes.io/docs/concepts/configuration/assign-pod-node/


.. _developer-guide-label:

OOM Developer Guide
###################

.. figure:: oomLogoV2-medium.png
   :align: right

ONAP consists of a large number of components, each of which is a substantial
project in itself, which results in a high degree of complexity in
deployment and management. To cope with this complexity the ONAP Operations
Manager (OOM) uses a Helm_ model of ONAP - Helm being the package manager
for Kubernetes_ container systems - to drive all user-driven life-cycle
management operations. The Helm model of ONAP is composed of a set of
hierarchical Helm charts that define the structure of the ONAP components and
the configuration of these components.  These charts are fully parameterized
such that a single environment file defines all of the parameters needed to
deploy ONAP.  A user of ONAP may maintain several such environment files to
control the deployment of ONAP in multiple environments such as development,
pre-production, and production.

The following sections describe how the ONAP Helm charts are constructed.

.. contents::
   :depth: 3
   :local:
..

Container Background
====================
Linux containers allow for an application and all of its operating system
dependencies to be packaged and deployed as a single unit without including a
guest operating system as done with virtual machines. The most popular
container solution is Docker_ which provides tools for container management
like the Docker daemon (dockerd) which can create, run, stop, move, or delete
a container. Docker has a very popular registry of container images that can
be used by any Docker system; however, in the ONAP context, Docker images are
built by the standard CI/CD flow and stored in Nexus_ repositories. OOM uses
the "standard" ONAP docker containers and three new ones specifically created
for OOM.

Containers are isolated from each other primarily via namespaces within the
Linux kernel without the need for multiple guest operating systems. As such,
multiple containers can be deployed with so little overhead that all of ONAP
can be deployed on a single host. With some optimization of the ONAP components
(e.g. elimination of redundant database instances) it may be possible to deploy
ONAP on a single laptop computer.

Helm Charts
===========
A Helm chart is a collection of files that describe a related set of Kubernetes
resources. A simple chart might be used to deploy something simple, like a
memcached pod, while a complex chart might contain many micro-services arranged
in a hierarchy as found in the `aai` ONAP component.

Charts are created as files laid out in a particular directory tree, then they
can be packaged into versioned archives to be deployed. There is a public
archive of `Helm Charts`_ on GitHub that includes many technologies applicable
to ONAP. Some of these charts have been used in ONAP and all of the ONAP charts
have been created following the guidelines provided.

The top level of the ONAP charts is shown below:

.. graphviz::

   digraph onap_top_chart {
      rankdir="LR";
      {
        node      [shape=folder]
        oValues   [label="values.yaml"]
        oChart    [label="Chart.yaml"]
        dev       [label="dev.yaml"]
        prod      [label="prod.yaml"]
        crb       [label="clusterrolebindings.yaml"]
        secrets   [label="secrets.yaml"]
      }
      {
        node      [style=dashed]
        vCom      [label="component"]
      }

      onap         -> oValues
      onap         -> oChart
      onap         -> templates
      onap         -> resources
      oValues      -> vCom
      resources    -> environments
      environments -> dev
      environments -> prod
      templates    -> crb
      templates    -> secrets
   }

Within the `values.yaml` file at the `onap` level, one will find a set of
boolean values that control which of the ONAP components get deployed as shown
below:

.. code-block:: yaml

  aaf: # Application Authorization Framework
    enabled: false
  <...>
  so: # Service Orchestrator
    enabled: true

By setting these flags a custom deployment can be created and used during
deployment by using the `-f` Helm option as follows::

  > helm install local/onap --name development -f dev.yaml

Note that there are one or more example deployment files in the
`onap/resources/environments/` directory. It is best practice to create a
unique deployment file for each environment used to ensure consistent
behaviour.
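
As a minimal sketch of such a deployment file, an environment-specific
override might enable only the components needed in that environment. The
`aaf` and `so` keys come from the excerpt above; which components are enabled
here is purely illustrative and is not taken from the files shipped in
`onap/resources/environments/`:

.. code-block:: yaml

  # dev.yaml (illustrative): values set here override the defaults in the
  # onap-level values.yaml when the file is passed to Helm with -f
  aaf:
    enabled: false
  so:
    enabled: true

Any key that is not listed in the override file keeps the value defined in the
chart's own `values.yaml`.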

To aid in the long term supportability of ONAP, a set of common charts has
been created (and will be expanded in subsequent releases of ONAP) that can be
used by any of the ONAP components by including the common component in its
`requirements.yaml` file. The common components are arranged as follows:

.. graphviz::

   digraph onap_common_chart {
      rankdir="LR";
      {
         node      [shape=folder]
         mValues   [label="values.yaml"]
         ccValues  [label="values.yaml"]
         comValues [label="values.yaml"]
         comChart  [label="Chart.yaml"]
         ccChart   [label="Chart.yaml"]
         mChart    [label="Chart.yaml"]

         mReq      [label="requirements.yaml"]
         mService  [label="service.yaml"]
         mMap      [label="configmap.yaml"]
         ccName    [label="_name.tpl"]
         ccNS      [label="_namespace.tpl"]
      }
      {
         cCom       [label="common"]
         mTemp      [label="templates"]
         ccTemp     [label="templates"]
      }
      {
         more       [label="...",style=dashed]
      }

      common -> comValues
      common -> comChart
      common -> cCom
      common -> mysql
      common -> more

      cCom   -> ccChart
      cCom   -> ccValues
      cCom   -> ccTemp
      ccTemp -> ccName
      ccTemp -> ccNS

      mysql  -> mValues
      mysql  -> mChart
      mysql  -> mReq
      mysql  -> mTemp
      mTemp  -> mService
      mTemp  -> mMap
   }

The common section of charts consists of a set of templates that assist with
parameter substitution (`_name.tpl` and `_namespace.tpl`) and a set of charts
for components used throughout ONAP. Initially `mysql` is in the common area
but this will expand to include other databases like `mariadb-galera`,
`postgres`, and `cassandra`. Other candidates for common components include
`redis` and `kafka`.  When the common components are used by other charts they
are instantiated each time. In subsequent ONAP releases some of the common
components could be set up as services that are used by multiple ONAP
components, thus minimizing the deployment and operational costs.
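
A component chart could declare its dependency on the common charts with a
`requirements.yaml` along the following lines. This is a sketch only: the
version constraint and the `@local` repository alias are assumptions made for
illustration and should be checked against the actual charts:

.. code-block:: yaml

  # requirements.yaml of a hypothetical component chart
  dependencies:
    - name: common
      # assumed version constraint; match the version of the common chart
      version: ~2.0.0
      # assumed alias of a Helm repository that serves the ONAP charts
      repository: '@local'

With a dependency declared this way, the templates of the component chart can
call the helpers defined in `_name.tpl` and `_namespace.tpl`.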

All of the ONAP components have charts that follow the pattern shown below:

.. graphviz::

   digraph onap_component_chart {
      rankdir="LR";
      {
         node      [shape=folder]
         cValues   [label="values.yaml"]
         cChart    [label="Chart.yaml"]
         cService  [label="service.yaml"]
         cMap      [label="configmap.yaml"]
         cFiles    [label="config file(s)"]
      }
      {
         cCharts   [label="charts"]
         cTemp     [label="templates"]
         cRes      [label="resources"]

      }
      {
         sCom       [label="component",style=dashed]
      }

      component -> cValues
      component -> cChart
      component -> cCharts
      component -> cTemp
      component -> cRes
      cTemp     -> cService
      cTemp     -> cMap
      cRes      -> config
      config    -> cFiles
      cCharts   -> sCom
   }

Note that the component charts may include a hierarchy of components and in
themselves can be quite complex.

Configuration of the components varies somewhat from component to component but
generally follows the pattern of one or more `configmap.yaml` files which can
directly provide configuration to the containers in addition to processing
configuration files stored in the `config` directory.  It is the responsibility
of each ONAP component team to update these configuration files when changes
are made to the project containers that impact configuration.
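
One way a `configmap.yaml` template can pick up files from the `config`
directory is Helm's `.Files.Glob` helper, sketched below. The ConfigMap name
suffix and the `resources/config/*` path are assumptions for illustration;
the `common.name` and `common.namespace` helpers are the ones provided by the
common templates:

.. code-block:: yaml

  # templates/configmap.yaml (illustrative)
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: {{ include "common.name" . }}-configmap
    namespace: {{ include "common.namespace" . }}
  data:
  # every file matching the assumed path becomes one entry, keyed by the
  # file name
  {{ (.Files.Glob "resources/config/*").AsConfig | indent 2 }}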

The following section describes how the hierarchical ONAP configuration system
is key to management of such a large system.

Configuration Management
========================

ONAP is a large system composed of many components - each of which is a complex
system in itself - that needs to be deployed in a number of different
ways.  For example, within a single operator's network there may be R&D
deployments under active development, pre-production versions undergoing system
testing and production systems that are operating live networks.  Each of these
deployments will differ in significant ways, such as the version of the
software images deployed.  In addition, there may be a number of application
specific configuration differences, such as operating system environment
variables.  The following describes how the Helm configuration management
system is used within the OOM project to manage both ONAP infrastructure
configuration as well as ONAP components configuration.

One of the artifacts that OOM/Kubernetes uses to deploy ONAP components is the
deployment specification, yet another yaml file.  Within these deployment specs
are a number of parameters as shown in the following mariadb example:

.. code-block:: yaml

  apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: mariadb
  spec:
    <...>
    template:
      <...>
      spec:
        hostname: mariadb
        containers:
        - args:
          image: nexus3.onap.org:10001/mariadb:10.1.11
          name: "mariadb"
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
            - name: MARIADB_MAJOR
              value: "10.1"
          <...>
        imagePullSecrets:
        - name: onap-docker-registry-key

Note that within the deployment specification, one of the container arguments
is the key/value pair `image: nexus3.onap.org:10001/mariadb:10.1.11` which
specifies the version of the mariadb software to deploy.  Although the
deployment specifications greatly simplify deployment, maintenance of the
deployment specifications themselves becomes problematic as software versions
change over time or as different versions are required for different
deployments.  For example, if the R&D team needs to deploy a newer version of
mariadb than what is currently used in the production environment, they would
need to clone the deployment specification and change this value.  Fortunately,
this problem has been solved with the templating capabilities of Helm.

The following example shows how the deployment specifications are modified to
incorporate Helm templates such that key/value pairs can be defined outside of
the deployment specifications and passed during instantiation of the component.

.. code-block:: yaml

  apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: mariadb
    namespace: "{{ .Values.nsPrefix }}-mso"
  spec:
    <...>
    template:
      <...>
      spec:
        hostname: mariadb
        containers:
        - args:
          image: {{ .Values.image.mariadb }}
          imagePullPolicy: {{ .Values.pullPolicy }}
          name: "mariadb"
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
            - name: MARIADB_MAJOR
              value: "10.1"
        <...>
        imagePullSecrets:
        - name: "{{ .Values.nsPrefix }}-docker-registry-key"

This version of the deployment specification has gone through the process of
templating values that are likely to change between deployments. Note that the
image is now specified as `image: {{ .Values.image.mariadb }}` instead of the
literal string used previously.  During the deployment phase, Helm (actually
the Helm sub-component Tiller) replaces the `{{ .. }}` entries with the values
defined in a `values.yaml` file.  The content of this file is as follows:

.. code-block:: yaml

  nsPrefix: onap
  pullPolicy: IfNotPresent
  image:
    readiness: oomk8s/readiness-check:2.0.0
    mso: nexus3.onap.org:10001/openecomp/mso:1.0-STAGING-latest
    mariadb: nexus3.onap.org:10001/mariadb:10.1.11

Within the `values.yaml` file there is an image section with the key/value pair
`mariadb: nexus3.onap.org:10001/mariadb:10.1.11` which is the same value used
in the non-templated version.  Once all of the substitutions are complete, the
resulting deployment specification is ready to be used by Kubernetes.

Also note that in this example, the namespace key/value pair is specified in
the `values.yaml` file.  This key/value pair will be global across the entire
ONAP deployment and is therefore a prime example of where configuration
hierarchy can be very useful.
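
Helm supports this kind of hierarchy through a reserved `global` section:
values placed under `global` in a parent chart are visible to every sub-chart
as `.Values.global`. The sketch below assumes, for illustration only, that
`nsPrefix` is moved under `global`; the example above reads it from
`.Values.nsPrefix` instead:

.. code-block:: yaml

  # values.yaml of the parent chart (illustrative)
  global:
    nsPrefix: onap

A template in any sub-chart could then refer to the shared value:

.. code-block:: yaml

  metadata:
    name: mariadb
    namespace: "{{ .Values.global.nsPrefix }}-mso"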

When creating a deployment template consider the use of default values if
appropriate.  Helm templating has built-in support for default values.  Here is
an example:

.. code-block:: yaml

  imagePullSecrets:
  - name: "{{ .Values.nsPrefix | default "onap" }}-docker-registry-key"

The pipeline operator ("|") used here hints at the power of Helm templates:
much like on an operating system command line, the pipeline operator allows
over 60 Helm functions to be embedded directly into the template (note that the
Helm template language is a superset of the Go template language).  These
functions include simple string operations like upper and more complex flow
control operations like if/else.
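
The fragment below is a small sketch of these features working together. The
`environment` and `debug` values are hypothetical and exist only for this
illustration; `default`, `upper` and `quote` are standard Helm template
functions:

.. code-block:: yaml

  env:
    - name: DEPLOY_ENV
      # default supplies a fallback, upper converts the result to upper
      # case, and quote wraps it in double quotes
      value: {{ .Values.environment | default "dev" | upper | quote }}
    {{- if .Values.debug }}
    - name: LOG_LEVEL
      value: "DEBUG"
    {{- else }}
    - name: LOG_LEVEL
      value: "INFO"
    {{- end }}

The `{{-` form trims the whitespace before the directive so that the control
lines do not leave empty lines in the rendered output.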


ONAP Application Configuration
------------------------------

Dependency Management
---------------------
These Helm charts describe the desired state of an ONAP deployment and instruct
the Kubernetes container manager as to how to maintain the deployment in this
state.  The dependencies between components dictate the order in which the
containers are started for the first time, so that the dependencies are always
met without arbitrary sleep times between container startups.  For example, the
SDC back-end container requires the Elastic-Search, Cassandra and Kibana
containers within SDC to be ready and also depends on DMaaP (or the
message-router) being ready - where ready implies that the built-in "readiness"
probes have succeeded - before becoming fully operational.

When an initial deployment of ONAP is requested the current state of the system
is NULL so ONAP is deployed by the Kubernetes manager as a set of Docker
containers on one or more predetermined hosts.  The hosts could be physical
machines or virtual machines.  When deploying on virtual machines the resulting
system will be very similar to "Heat" based deployments, i.e. Docker containers
running within a set of VMs, the primary difference being that the allocation
of containers to VMs is done dynamically with OOM and statically with "Heat".

The following example from the SO deployment descriptor file shows SO's
dependency on its mariadb database component:

SO deployment specification excerpt:

.. code-block:: yaml

  apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: {{ include "common.name" . }}
    namespace: {{ include "common.namespace" . }}
    labels:
      app: {{ include "common.name" . }}
      chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
      release: {{ .Release.Name }}
      heritage: {{ .Release.Service }}
  spec:
    replicas: {{ .Values.replicaCount }}
    template:
      metadata:
        labels:
          app: {{ include "common.name" . }}
          release: {{ .Release.Name }}
      spec:
        initContainers:
        - command:
          - /root/ready.py
          args:
          - --container-name
          - so-mariadb
          env:
  ...
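
The "ready" state that such an init container waits for is reported by a
readiness probe on the dependency itself. As a sketch only, a database
container might declare a probe like the following; the port number and the
timing values are assumptions chosen for illustration and are not taken from
the ONAP charts:

.. code-block:: yaml

  # container spec fragment for a hypothetical database dependency
  readinessProbe:
    tcpSocket:
      port: 3306              # assumed database port
    initialDelaySeconds: 10   # wait before the first check
    periodSeconds: 10         # interval between checks

Kubernetes marks the container as ready only once this probe succeeds.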

Kubernetes Container Orchestration
==================================
The ONAP components are managed by the Kubernetes_ container management system
which maintains the desired state of the container system as described by one
or more deployment descriptors - similar in concept to OpenStack HEAT
Orchestration Templates. The following sections describe the fundamental
objects managed by Kubernetes, the network these components use to communicate
with each other and other entities outside of ONAP and the templates that
describe the configuration and desired state of the ONAP components.

Name Spaces
-----------
Within the namespaces are Kubernetes services that provide external
connectivity to pods that host Docker containers.

ONAP Components to Kubernetes Object Relationships
--------------------------------------------------
Kubernetes deployments consist of multiple objects:

- **nodes** - a worker machine - either physical or virtual - that hosts
  multiple containers managed by Kubernetes.
- **services** - an abstraction of a logical set of pods that provide a
  micro-service.
- **pods** - one or more (but typically one) container(s) that provide specific
  application functionality.
- **persistent volumes** - One or more permanent volumes need to be established
  to hold non-ephemeral configuration and state data.

The relationship between these objects is shown in the following figure:

.. .. uml::
..
..   @startuml
..   node PH {
..      component Service {
..         component Pod0
..         component Pod1
..      }
..   }
..
..   database PV
..   @enduml

.. figure:: kubernetes_objects.png

OOM uses these Kubernetes objects as described in the following sections.

Nodes
~~~~~
OOM works with both physical and virtual worker machines.

* Virtual Machine Deployments - If ONAP is to be deployed onto a set of virtual
  machines, the creation of the VMs is outside of the scope of OOM and could be
  done in many ways, such as

  * manually, for example by a user using the OpenStack Horizon dashboard or
    AWS EC2, or
  * automatically, for example with the use of an OpenStack Heat Orchestration
    Template which builds an ONAP stack, Azure ARM template, AWS CloudFormation
    Template, or
  * orchestrated, for example with Cloudify creating the VMs from a TOSCA
    template and controlling their life cycle for the life of the ONAP
    deployment.

* Physical Machine Deployments - If ONAP is to be deployed onto physical
  machines there are several options but the recommendation is to use Rancher
  along with Helm to associate hosts with a Kubernetes cluster.

Pods
~~~~
A group of containers with shared storage and networking can be grouped
together into a Kubernetes pod.  All of the containers within a pod are
co-located and co-scheduled so they operate as a single unit.  Within the ONAP
Amsterdam release, pods are mapped one-to-one to docker containers although
this may change in the future.  As explained in the Services section below the
use of Pods within each ONAP component is abstracted from other ONAP
components.

Services
~~~~~~~~
OOM uses the Kubernetes service abstraction to provide a consistent access
point for each of the ONAP components independent of the pod or container
architecture of that component.  For example, the SDNC component may introduce
OpenDaylight clustering at some point and change the number of pods in this
component to three or more but this change will be isolated from the other ONAP
components by the service abstraction.  A service can include a load balancer
on its ingress to distribute traffic between the pods and even react to dynamic
changes in the number of pods if they are part of a replica set.
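
A service template for a component could look roughly like the sketch below.
The port numbers are assumptions for illustration; the `app` and `release`
selector labels mirror the pod labels used in the SO deployment excerpt
earlier in this guide:

.. code-block:: yaml

  # templates/service.yaml (illustrative)
  apiVersion: v1
  kind: Service
  metadata:
    name: {{ include "common.name" . }}
    namespace: {{ include "common.namespace" . }}
  spec:
    type: ClusterIP
    # traffic is distributed across all pods carrying these labels, however
    # many replicas exist
    selector:
      app: {{ include "common.name" . }}
      release: {{ .Release.Name }}
    ports:
    - name: http
      port: 8080          # assumed port exposed by the service
      targetPort: 8080    # assumed port the container listens on

Because other components address only the service name, the number of pods
behind the selector can change without affecting them.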

Persistent Volumes
~~~~~~~~~~~~~~~~~~
To enable ONAP to be deployed into a wide variety of cloud infrastructures a
flexible persistent storage architecture, built on Kubernetes persistent
volumes, provides the ability to define the physical storage in a central
location and have all ONAP components securely store their data.

When deploying ONAP into a public cloud, available storage services such as
`AWS Elastic Block Store`_, `Azure File`_, or `GCE Persistent Disk`_ are
options.  Alternatively, when deploying into a private cloud the storage
architecture might consist of Fibre Channel, `Gluster FS`_, or iSCSI. Many
other storage options exist; refer to the `Kubernetes Storage Class`_
documentation for a full list of the options. The storage architecture may vary
from deployment to deployment but in all cases a reliable, redundant storage
system must be provided to ONAP with which the state information of all ONAP
components will be securely stored. The Storage Class for a given deployment is
a single parameter listed in the ONAP `values.yaml` file and therefore is
easily customized. Operation of this storage system is outside the scope of
OOM.

.. code-block:: yaml

  Insert values.yaml code block with storage block here
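
For orientation, a storage class is itself a small Kubernetes object that names
a provisioner and its parameters. The sketch below targets AWS Elastic Block
Store; the class name is hypothetical and the definition is not taken from the
ONAP charts:

.. code-block:: yaml

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: onap-standard             # hypothetical class name
  provisioner: kubernetes.io/aws-ebs
  parameters:
    type: gp2                       # general purpose SSD volumes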

Once the storage class is selected and the physical storage is provided, the
ONAP deployment step creates a pool of persistent volumes within the given
physical storage that is used by all of the ONAP components. ONAP components
simply make a claim on these persistent volumes (PV), with a persistent volume
claim (PVC), to gain access to their storage.

The following figure illustrates the relationships between the persistent
volume claims, the persistent volumes, the storage class, and the physical
storage.

.. graphviz::

   digraph PV {
      label = "Persistent Volume Claim to Physical Storage Mapping"
      {
         node [shape=cylinder]
         D0 [label="Drive0"]
         D1 [label="Drive1"]
         Dx [label="Drivex"]
      }
      {
         node [shape=Mrecord label="StorageClass:ceph"]
         sc
      }
      {
         node [shape=point]
         p0 p1 p2
         p3 p4 p5
      }
      subgraph clusterSDC {
         label="SDC"
         PVC0
         PVC1
      }
      subgraph clusterSDNC {
         label="SDNC"
         PVC2
      }
      subgraph clusterSO {
         label="SO"
         PVCn
      }
      PV0 -> sc
      PV1 -> sc
      PV2 -> sc
      PVn -> sc

      sc -> {D0 D1 Dx}
      PVC0 -> PV0
      PVC1 -> PV1
      PVC2 -> PV2
      PVCn -> PVn

      # force all of these nodes to the same line in the given order
      subgraph {
         rank = same; PV0;PV1;PV2;PVn;p0;p1;p2
         PV0->PV1->PV2->p0->p1->p2->PVn [style=invis]
      }

      subgraph {
         rank = same; D0;D1;Dx;p3;p4;p5
         D0->D1->p3->p4->p5->Dx [style=invis]
      }

   }

In order for an ONAP component to use a persistent volume it must make a claim
against a specific persistent volume defined in the ONAP common charts.  Note
that there is a one-to-one relationship between a PVC and PV.  The following is
an excerpt from a component chart that defines a PVC:

.. code-block:: yaml

  Insert PVC example here
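
Pending the actual chart excerpt, the general shape of a templated persistent
volume claim can be sketched as follows. The `persistence.size` and
`persistence.storageClass` values are hypothetical names introduced for this
illustration and may not match the keys used in the ONAP charts:

.. code-block:: yaml

  # templates/pvc.yaml (illustrative)
  kind: PersistentVolumeClaim
  apiVersion: v1
  metadata:
    name: {{ include "common.name" . }}
    namespace: {{ include "common.namespace" . }}
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        # hypothetical value, e.g. 2Gi
        storage: {{ .Values.persistence.size }}
    # hypothetical value naming the storage class to claim from
    storageClassName: "{{ .Values.persistence.storageClass }}"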

OOM Networking with Kubernetes
------------------------------

- DNS
- Ports - Flattening the containers also exposes port conflicts between the
  containers which need to be resolved.

Node Ports
~~~~~~~~~~

Pod Placement Rules
-------------------
OOM will use the rich set of Kubernetes node and pod affinity /
anti-affinity rules to minimize the chance of a single failure resulting in a
loss of ONAP service. Node affinity / anti-affinity is used to guide the
Kubernetes orchestrator in the placement of pods on nodes (physical or virtual
machines).  For example:

- if a container uses Intel DPDK technology the pod may state that it has
  affinity to an Intel processor based node, or
- geographical based node labels (such as the Kubernetes standard zone or
  region labels) may be used to ensure placement of a DCAE complex close to the
  VNFs generating high volumes of traffic thus minimizing networking cost.
  Specifically, if nodes were pre-assigned labels East and West, the pod
  deployment spec to distribute pods to these nodes would be:

.. code-block:: yaml

  nodeSelector:
    failure-domain.beta.kubernetes.io/region: {{ .Values.location }}

- "location: West" is specified in the `values.yaml` file used to deploy
  one DCAE cluster and "location: East" is specified in a second `values.yaml`
  file (see OOM Configuration Management for more information about
  configuration files like the `values.yaml` file).

Node affinity can also be used to achieve geographic redundancy if pods are
assigned to multiple failure domains. For more information refer to `Assigning
Pods to Nodes`_.
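
The `nodeSelector` shown above is the simplest form of node constraint. The
same placement can be expressed with the more general node affinity syntax,
sketched here under the assumption that the `location` value is defined as in
the preceding example:

.. code-block:: yaml

  # pod spec fragment (illustrative)
  affinity:
    nodeAffinity:
      # hard requirement: only nodes whose region label matches are eligible
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: failure-domain.beta.kubernetes.io/region
            operator: In
            values:
            - {{ .Values.location }}

Listing several regions under `values` would allow the pods to be spread over
multiple failure domains.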

.. note::
   One could use Pod to Node assignment to totally constrain Kubernetes when
   doing initial container assignment to replicate the Amsterdam release
   OpenStack Heat based deployment. Should one wish to do this, each VM would
   need a unique node name which would be used to specify a node constraint
   for every component.  These assignments could be specified in an environment
   specific values.yaml file. Constraining Kubernetes in this way is not
   recommended.

Kubernetes has a comprehensive system called Taints and Tolerations that can be
used to force the container orchestrator to repel pods from nodes based on
static events (an administrator assigning a taint to a node) or dynamic events
(such as a node becoming unreachable or running out of disk space). There are
no plans to use taints or tolerations in the ONAP Beijing release.

Pod affinity / anti-affinity is the concept of creating a spatial relationship
between pods when the Kubernetes orchestrator does assignment (both initially
and in operation) to nodes as explained in Inter-pod affinity and
anti-affinity.  For example, one might choose to co-locate all of the ONAP SDC
containers on a single node as they are not critical runtime components and
co-location minimizes overhead. On the other hand, one might choose to ensure
that all of the containers in an ODL cluster (SDNC and APPC) are placed on
separate nodes such that a node failure has minimal impact on the operation of
the cluster.  An example of how pod affinity / anti-affinity is specified is
shown below:

Pod Affinity / Anti-Affinity

.. code-block:: yaml

  apiVersion: v1
  kind: Pod
  metadata:
    name: with-pod-affinity
  spec:
    affinity:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S1
          topologyKey: failure-domain.beta.kubernetes.io/zone
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: security
                operator: In
                values:
                - S2
            topologyKey: kubernetes.io/hostname
    containers:
    - name: with-pod-affinity
      image: gcr.io/google_containers/pause:2.0
 737
This example contains both podAffinity and podAntiAffinity rules. The first
rule is a must (requiredDuringSchedulingIgnoredDuringExecution) while the
second will be met pending other considerations
(preferredDuringSchedulingIgnoredDuringExecution).

Preemption

Another feature that may assist in achieving a repeatable deployment in the
presence of faults that may have reduced the capacity of the cloud is assigning
priority to the containers such that mission critical components have the
ability to evict less critical components.  Kubernetes provides this capability
with Pod Priority and Preemption.  Prior to having more advanced production
grade features available, the ability to at least re-deploy ONAP (or a subset
of it) reliably provides a level of confidence that, should an outage occur,
the system can be brought back on-line predictably.
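
As a rough illustration of Pod Priority and Preemption, the sketch below
defines a priority class and a pod that refers to it.  The names
``onap-mission-critical`` and ``critical-component`` and the priority value are
hypothetical and are not taken from the OOM charts.  The ``scheduling.k8s.io/v1``
API version is also an assumption, as older clusters expose this resource under
a beta API version.

.. code-block:: yaml

  # Hypothetical priority class for mission critical components.
  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: onap-mission-critical
  value: 1000000          # higher values are scheduled (and kept) first
  globalDefault: false    # only pods that name this class receive it
  description: "Illustrative class for components that may evict others."
  ---
  # Hypothetical pod that requests the priority class above.
  apiVersion: v1
  kind: Pod
  metadata:
    name: critical-component
  spec:
    priorityClassName: onap-mission-critical
    containers:
    - name: critical-component
      image: gcr.io/google_containers/pause:2.0

When the cluster lacks capacity, the scheduler may evict pods of lower
priority to make room for a pending pod that carries a higher priority class.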
 750
 751 Health Checks
 752 -------------
 753
Monitoring of ONAP components is configured in the agents within JSON files
that are stored in Gerrit under consul-agent-config. Here is an example from
the AAI model loader (aai-model-loader-health.json):
 757
.. code-block:: json

  {
    "service": {
      "name": "A&AI Model Loader",
      "checks": [
        {
          "id": "model-loader-process",
          "name": "Model Loader Presence",
          "script": "/consul/config/scripts/model-loader-script.sh",
          "interval": "15s",
          "timeout": "1s"
        }
      ]
    }
  }
 774
 775 Liveness Probes
 776 ---------------
 777
Liveness probes can simply check that a port is available, that a
built-in health check is reporting good health, or that the Consul health check
is positive.  For example, to monitor the SDNC component, the following
liveness probe can be found in the SDNC DB deployment specification:
 782
.. code-block:: yaml

  # sdnc db liveness probe
  livenessProbe:
    exec:
      command: ["mysqladmin", "ping"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
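
The other probe styles mentioned above, a port check and a built-in health
check, might look roughly like the following sketch.  The container names,
image names, port number and ``/healthcheck`` path are hypothetical
placeholders and are not taken from any ONAP chart.

.. code-block:: yaml

  containers:
  # Port check: the probe succeeds if a TCP connection can be opened.
  - name: port-checked-component        # hypothetical
    image: example/component-a:1.0      # hypothetical
    livenessProbe:
      tcpSocket:
        port: 8080                      # hypothetical port
      initialDelaySeconds: 30
      periodSeconds: 10
  # Built-in health check: the probe succeeds on an HTTP success response.
  - name: http-checked-component        # hypothetical
    image: example/component-b:1.0      # hypothetical
    livenessProbe:
      httpGet:
        path: /healthcheck              # hypothetical endpoint
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5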
 792
'initialDelaySeconds' controls how long Kubernetes waits after the container
has started before running the first liveness probe. 'periodSeconds' and
'timeoutSeconds' control how often the probe runs and how long it may take to
respond.  Note that containers are inherently ephemeral so the healing action
destroys failed containers and any state information within them.  To avoid a
loss of state, a persistent volume should be used to store all data that needs
to be persisted over the re-creation of a container.  Persistent volumes have
been created for the database components of each of the projects and the same
technique can be used for all persistent state information.
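
A minimal sketch of this technique is shown below: a persistent volume claim
and a pod that mounts it, so that the data outlives any individual container.
The names ``example-db-data`` and ``example-db``, the image, the storage size
and the mount path are assumptions chosen for illustration only.  The OOM
charts define their own volumes.

.. code-block:: yaml

  # Hypothetical claim for storage that survives container re-creation.
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-db-data
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
  ---
  # Hypothetical database pod that keeps its data on the claimed volume.
  apiVersion: v1
  kind: Pod
  metadata:
    name: example-db
  spec:
    containers:
    - name: example-db
      image: example/db-image:1.0       # hypothetical image
      volumeMounts:
      - name: db-data
        mountPath: /var/lib/mysql       # assumed data directory
    volumes:
    - name: db-data
      persistentVolumeClaim:
        claimName: example-db-data

If the liveness probe fails and the container is re-created, the new
container mounts the same claim and finds the previously written data.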
 802
 803
 804
 805 Environment Files
 806 ~~~~~~~~~~~~~~~~~
 807
 808 MSB Integration
 809 ===============
 810
The `Microservices Bus
Project <https://wiki.onap.org/pages/viewpage.action?pageId=3246982>`__ provides
facilities to integrate micro-services into ONAP and therefore needs to
integrate into OOM, primarily through Consul, which is the backend of
MSB service discovery. The following is a brief description of how this
integration will be done:
 817
A registrator pushes the service endpoint info to MSB service
discovery, as follows:

-  The needed service endpoint info is put into the Kubernetes YAML file
   as an annotation, including service name, protocol, version, visual
   range, LB method, IP, port, etc.

-  OOM deploys/starts/restarts/scales in/scales out/upgrades ONAP components.

-  Registrator watches the Kubernetes events.

-  When an ONAP component instance has been started/destroyed by OOM,
   Registrator gets the notification from Kubernetes.

-  Registrator parses the service endpoint info from the annotation and
   registers/updates/unregisters it with MSB service discovery.

-  MSB API Gateway uses the service endpoint info for service routing
   and load balancing.
 837
 838 Details of the registration service API can be found at \ `Microservice
 839 Bus API
 840 Documentation <https://wiki.onap.org/display/DW/Microservice+Bus+API+Documentation>`__.
 841
 842 ONAP Component Registration to MSB
 843 ----------------------------------
The charts of all ONAP components intending to register against MSB must have
an annotation in their service template(s).  An `sdc` example follows:
 846
.. code-block:: yaml

  apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: sdc-be
    name: sdc-be
    namespace: "{{ .Values.nsPrefix }}"
    annotations:
      msb.onap.org/service-info: '[
        {
            "serviceName": "sdc",
            "version": "v1",
            "url": "/sdc/v1",
            "protocol": "REST",
            "port": "8080",
            "visualRange":"1"
        },
        {
            "serviceName": "sdc-deprecated",
            "version": "v1",
            "url": "/sdc/v1",
            "protocol": "REST",
            "port": "8080",
            "visualRange":"1",
            "path":"/sdc/v1"
        }
        ]'
  ...
 877
 878
 879 MSB Integration with OOM
 880 ------------------------
 881 A preliminary view of the OOM-MSB integration is as follows:
 882
 883 .. figure:: MSB-OOM-Diagram.png
 884
 885 A message sequence chart of the registration process:
 886
.. uml::

  participant "OOM" as oom
  participant "ONAP Component" as onap
  participant "Service Discovery" as sd
  participant "External API Gateway" as eagw
  participant "Router (Internal API Gateway)" as iagw

  box "MSB" #LightBlue
    participant sd
    participant eagw
    participant iagw
  end box

  == Deploy Service ==

  oom -> onap: Deploy
  oom -> sd:   Register service endpoints
  sd -> eagw:  Services exposed to external system
  sd -> iagw:  Services for internal use

  == Component Life-cycle Management ==

  oom -> onap: Start/Stop/Scale/Migrate/Upgrade
  oom -> sd:   Update service info
  sd -> eagw:  Update service info
  sd -> iagw:  Update service info

  == Service Health Check ==

  sd -> onap: Check the health of service
  sd -> eagw: Update service status
  sd -> iagw: Update service status

 920
 921
 922 MSB Deployment Instructions
 923 ---------------------------
MSB is a Helm-installable ONAP component which is often automatically deployed.
To install it individually enter::

  > helm install <repo-name>/msb
 928
.. note::
  TBD: Validate if the following procedure is still required.

Please note that the Kubernetes authentication token must be set in
*kubernetes/kube2msb/values.yaml* so the kube2msb registrator can get
access to watch the Kubernetes events and get the service annotations through
the Kubernetes APIs. The token can be found in the kubectl configuration file
*~/.kube/config*.
 937
 938 More details can be found here `MSB installation <http://onap.readthedocs.io/en/latest/submodules/msb/apigateway.git/docs/platform/installation.html>`__.
 939
 940 .. MISC
 941 .. ====
 942 .. Note that although OOM uses Kubernetes facilities to minimize the effort
 943 .. required of the ONAP component owners to implement a successful rolling
 944 .. upgrade strategy there are other considerations that must be taken into
 945 .. consideration.
 946 .. For example, external APIs - both internal and external to ONAP - should be
 947 .. designed to gracefully accept transactions from a peer at a different
 948 .. software version to avoid deadlock situations. Embedded version codes in
 949 .. messages may facilitate such capabilities.
 950 ..
 951 .. Within each of the projects a new configuration repository contains all of
 952 .. the project specific configuration artifacts.  As changes are made within
 953 .. the project, it's the responsibility of the project team to make appropriate
 954 .. changes to the configuration data.