docs/oom_developer_guide.rst

   1 .. This work is licensed under a Creative Commons Attribution 4.0
   2 .. International License.
   3 .. http://creativecommons.org/licenses/by/4.0
   4 .. Copyright 2018-2020 Amdocs, Bell Canada, Orange, Samsung
   5
   6 .. Links
   7 .. _Helm: https://docs.helm.sh/
   8 .. _Helm Charts: https://github.com/kubernetes/charts
   9 .. _Kubernetes: https://Kubernetes.io/
  10 .. _Docker: https://www.docker.com/
  11 .. _Nexus: https://nexus.onap.org/
  12 .. _AWS Elastic Block Store: https://aws.amazon.com/ebs/
  13 .. _Azure File: https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction
  14 .. _GCE Persistent Disk: https://cloud.google.com/compute/docs/disks/
  15 .. _Gluster FS: https://www.gluster.org/
  16 .. _Kubernetes Storage Class: https://Kubernetes.io/docs/concepts/storage/storage-classes/
  17 .. _Assigning Pods to Nodes: https://Kubernetes.io/docs/concepts/configuration/assign-pod-node/
  18
  19
  20 .. _developer-guide-label:
  21
  22 OOM Developer Guide
  23 ###################
  24
  25 .. figure:: oomLogoV2-medium.png
  26    :align: right
  27
  28 ONAP consists of a large number of components, each of which is a substantial
  29 project in itself, which results in a high degree of complexity in
  30 deployment and management. To cope with this complexity the ONAP Operations
  31 Manager (OOM) uses a Helm_ model of ONAP - Helm being the package manager
  32 for Kubernetes_ container systems - to drive all user-driven life-cycle
  33 management operations. The Helm model of ONAP is composed of a set of
  34 hierarchical Helm charts that define the structure of the ONAP components and
  35 the configuration of these components.  These charts are fully parameterized
  36 such that a single environment file defines all of the parameters needed to
  37 deploy ONAP.  A user of ONAP may maintain several such environment files to
  38 control the deployment of ONAP in multiple environments such as development,
  39 pre-production, and production.
  40
  41 The following sections describe how the ONAP Helm charts are constructed.
  42
  43 .. contents::
  44    :depth: 3
  45    :local:
  46 ..
  47
  48 Container Background
  49 ====================
  50 Linux containers allow for an application and all of its operating system
  51 dependencies to be packaged and deployed as a single unit without including a
  52 guest operating system as done with virtual machines. The most popular
  53 container solution is Docker_ which provides tools for container management
  54 like the Docker daemon (dockerd) which can create, run, stop, move, or delete a
  55 container. Docker has a very popular registry of container images that can be
  56 used by any Docker system; however, in the ONAP context, Docker images are
  57 built by the standard CI/CD flow and stored in Nexus_ repositories. OOM uses
  58 the "standard" ONAP docker containers and three new ones specifically created
  59 for OOM.
  60
  61 Containers are isolated from each other primarily via namespaces within the
  62 Linux kernel without the need for multiple guest operating systems. As such,
  63 multiple containers can be deployed with little overhead; for example, all of
  64 ONAP can be deployed on a single host. With some optimization of the ONAP
  65 components (e.g. elimination of redundant database instances) it may be
  66 possible to deploy ONAP on a single laptop computer.
  67
  68 Helm Charts
  69 ===========
  70 A Helm chart is a collection of files that describe a related set of Kubernetes
  71 resources. A simple chart might be used to deploy something simple, like a
  72 memcached pod, while a complex chart might contain many micro-services arranged
  73 in a hierarchy as found in the `aai` ONAP component.
  74
  75 Charts are created as files laid out in a particular directory tree; they can
  76 then be packaged into versioned archives to be deployed. There is a public
  77 archive of `Helm Charts`_ on GitHub that includes many technologies applicable
  78 to ONAP. Some of these charts have been used in ONAP and all of the ONAP charts
  79 have been created following the guidelines provided.
  80
  81 The top level of the ONAP charts is shown below:
  82
  83 .. code-block:: bash
  84
  85   common
  86   ├── cassandra
  87   │   ├── Chart.yaml
  88   │   ├── requirements.yaml
  89   │   ├── resources
  90   │   │   ├── config
  91   │   │   │   └── docker-entrypoint.sh
  92   │   │   ├── exec.py
  93   │   │   └── restore.sh
  94   │   ├── templates
  95   │   │   ├── backup
  96   │   │   │   ├── configmap.yaml
  97   │   │   │   ├── cronjob.yaml
  98   │   │   │   ├── pv.yaml
  99   │   │   │   └── pvc.yaml
 100   │   │   ├── configmap.yaml
 101   │   │   ├── pv.yaml
 102   │   │   ├── service.yaml
 103   │   │   └── statefulset.yaml
 104   │   └── values.yaml
 105   ├── common
 106   │   ├── Chart.yaml
 107   │   ├── templates
 108   │   │   ├── _createPassword.tpl
 109   │   │   ├── _ingress.tpl
 110   │   │   ├── _labels.tpl
 111   │   │   ├── _mariadb.tpl
 112   │   │   ├── _name.tpl
 113   │   │   ├── _namespace.tpl
 114   │   │   ├── _repository.tpl
 115   │   │   ├── _resources.tpl
 116   │   │   ├── _secret.yaml
 117   │   │   ├── _service.tpl
 118   │   │   ├── _storage.tpl
 119   │   │   └── _tplValue.tpl
 120   │   └── values.yaml
 121   ├── ...
 122   └── postgres-legacy
 123       ├── Chart.yaml
 124       ├── requirements.yaml
 125       ├── charts
 126       └── configs
 127
 128 The common section of charts consists of a set of templates that assist with
 129 parameter substitution (`_name.tpl`, `_namespace.tpl` and others) and a set of
 130 charts for components used throughout ONAP.  When the common components are used
 131 by other charts, they are either instantiated each time or deployed once as a
 132 shared instance for several components.
 133
 134 All of the ONAP components have charts that follow the pattern shown below:
 135
 136 .. code-block:: bash
 137
 138   name-of-my-component
 139   ├── Chart.yaml
 140   ├── requirements.yaml
 141   ├── components
 142   │   └── subcomponent-folder
 143   ├── charts
 144   │   └── subchart-folder
 145   ├── resources
 146   │   ├── folder1
 147   │   │   ├── file1
 148   │   │   └── file2
 149   │   └── folder2
 150   │       ├── file3
 151   │       └── folder3
 152   │           └── file4
 153   ├── templates
 154   │   ├── NOTES.txt
 155   │   ├── configmap.yaml
 156   │   ├── deployment.yaml
 157   │   ├── ingress.yaml
 158   │   ├── job.yaml
 159   │   ├── secrets.yaml
 160   │   └── service.yaml
 161   └── values.yaml
 162
 163 Note that the component charts / components may include a hierarchy of
 164 subcomponents and can themselves be quite complex.
 165
 166 You can use either the `charts` or the `components` folder for your subcomponents.
 167 The `charts` folder means that the subcomponent will always be deployed.
 168
 169 The `components` folder means that we can choose whether to deploy the
 170 subcomponent.
 171
 172 This choice is made in the root `values.yaml`:
 173
 174 .. code-block:: yaml
 175
 176   ---
 177   global:
 178     key: value
 179
 180   component1:
 181     enabled: true
 182   component2:
 183     enabled: true
 184
 185 Then in `requirements.yaml`, you'll use these values:
 186
 187 .. code-block:: yaml
 188
 189   ---
 190   dependencies:
 191     - name: common
 192       version: ~x.y-0
 193       repository: '@local'
 194     - name: component1
 195       version: ~x.y-0
 196       repository: 'file://components/component1'
 197       condition: component1.enabled
 198     - name: component2
 199       version: ~x.y-0
 200       repository: 'file://components/component2'
 201       condition: component2.enabled
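
As a sketch of how these conditions might be used, a deployment-specific
override file could switch one subcomponent off while leaving the other
enabled. The file name `my-environment.yaml` is hypothetical; the keys are the
ones from the root `values.yaml` shown above.

.. code-block:: yaml

  ---
  # my-environment.yaml (hypothetical override file)
  component1:
    enabled: true
  component2:
    # component2.enabled is the condition checked in requirements.yaml,
    # so this subcomponent would not be deployed
    enabled: false

Such a file would typically be passed to Helm with the `-f` option at install
or upgrade time.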
 202
 203 Configuration of the components varies somewhat from component to component but
 204 generally follows the pattern of one or more `configmap.yaml` files which can
 205 directly provide configuration to the containers in addition to processing
 206 configuration files stored in the `config` directory.  It is the responsibility
 207 of each ONAP component team to update these configuration files when changes
 208 are made to the project containers that impact configuration.
 209
 210 The following section describes how the hierarchical ONAP configuration system
 211 is key to management of such a large system.
 212
 213 Configuration Management
 214 ========================
 215
 216 ONAP is a large system composed of many components - each of which is a complex
 217 system in itself - that needs to be deployed in a number of different
 218 ways.  For example, within a single operator's network there may be R&D
 219 deployments under active development, pre-production versions undergoing system
 220 testing and production systems that are operating live networks.  Each of these
 221 deployments will differ in significant ways, such as the version of the
 222 software images deployed.  In addition, there may be a number of application
 223 specific configuration differences, such as operating system environment
 224 variables.  The following describes how the Helm configuration management
 225 system is used within the OOM project to manage both ONAP infrastructure
 226 configuration and ONAP component configuration.
 227
 228 One of the artifacts that OOM/Kubernetes uses to deploy ONAP components is the
 229 deployment specification, yet another yaml file.  Within these deployment specs
 230 are a number of parameters as shown in the following example:
 231
 232 .. code-block:: yaml
 233
 234   apiVersion: apps/v1
 235   kind: StatefulSet
 236   metadata:
 237     labels:
 238       app.kubernetes.io/name: zookeeper
 239       helm.sh/chart: zookeeper
 240       app.kubernetes.io/component: server
 241       app.kubernetes.io/managed-by: Tiller
 242       app.kubernetes.io/instance: onap-oof
 243     name: onap-oof-zookeeper
 244     namespace: onap
 245   spec:
 246     <...>
 247     replicas: 3
 248     selector:
 249       matchLabels:
 250         app.kubernetes.io/name: zookeeper
 251         app.kubernetes.io/component: server
 252         app.kubernetes.io/instance: onap-oof
 253     serviceName: onap-oof-zookeeper-headless
 254     template:
 255       metadata:
 256         labels:
 257           app.kubernetes.io/name: zookeeper
 258           helm.sh/chart: zookeeper
 259           app.kubernetes.io/component: server
 260           app.kubernetes.io/managed-by: Tiller
 261           app.kubernetes.io/instance: onap-oof
 262       spec:
 263         <...>
 264         affinity:
 265         containers:
 266         - name: zookeeper
 267           <...>
 268           image: gcr.io/google_samples/k8szk:v3
 269           imagePullPolicy: Always
 270           <...>
 271           ports:
 272           - containerPort: 2181
 273             name: client
 274             protocol: TCP
 275           - containerPort: 3888
 276             name: election
 277             protocol: TCP
 278           - containerPort: 2888
 279             name: server
 280             protocol: TCP
 281           <...>
 282
 283 Note that within the statefulset specification, one of the container arguments
 284 is the key/value pair `image: gcr.io/google_samples/k8szk:v3` which
 285 specifies the version of the zookeeper software to deploy.  Although the
 286 statefulset specifications greatly simplify deployment, maintenance of the
 287 statefulset specifications themselves becomes problematic as software versions
 288 change over time or as different versions are required for different
 289 deployments.  For example, if the R&D team needs to deploy a newer version of
 290 zookeeper than what is currently used in the production environment, they would
 291 need to clone the statefulset specification and change this value.  Fortunately,
 292 this problem has been solved with the templating capabilities of Helm.
 293
 294 The following example shows how the statefulset specifications are modified to
 295 incorporate Helm templates such that key/value pairs can be defined outside of
 296 the statefulset specifications and passed during instantiation of the component.
 297
 298 .. code-block:: yaml
 299
 300   apiVersion: apps/v1
 301   kind: StatefulSet
 302   metadata:
 303     name: {{ include "common.fullname" . }}
 304     namespace: {{ include "common.namespace" . }}
 305     labels: {{- include "common.labels" . | nindent 4 }}
 306   spec:
 307     replicas: {{ .Values.replicaCount }}
 308     selector:
 309       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 310     # serviceName is only needed for StatefulSet
 311     # add the postfix part only if you have set a postfix on the service name
 312     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 313     <...>
 314     template:
 315       metadata:
 316         labels: {{- include "common.labels" . | nindent 8 }}
 317         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 318         name: {{ include "common.name" . }}
 319       spec:
 320         <...>
 321         containers:
 322           - name: {{ include "common.name" . }}
 323             image: {{ .Values.image }}
 324             imagePullPolicy: {{ .Values.global.pullPolicy | default .Values.pullPolicy }}
 325             ports:
 326             {{- range $index, $port := .Values.service.ports }}
 327               - containerPort: {{ $port.port }}
 328                 name: {{ $port.name }}
 329             {{- end }}
 330             {{- range $index, $port := .Values.service.headlessPorts }}
 331               - containerPort: {{ $port.port }}
 332                 name: {{ $port.name }}
 333             {{- end }}
 334             <...>
 335
 336 This version of the statefulset specification has gone through the process of
 337 templating values that are likely to change between deployments. Note that the
 338 image is now specified as `image: {{ .Values.image }}` instead of the literal
 339 string used previously.  During the deployment phase, Helm (actually the Helm
 340 sub-component Tiller) substitutes the `{{ .. }}` entries with values defined
 341 in a values.yaml file.  The content of this file is as follows:
 342
 343 .. code-block:: yaml
 344
 345   <...>
 346   image: gcr.io/google_samples/k8szk:v3
 347   replicaCount: 3
 348   <...>
 349
 350
 351 Within the values.yaml file there is an image key with the value
 352 `gcr.io/google_samples/k8szk:v3` which is the same value used in
 353 the non-templated version.  Once all of the substitutions are complete, the
 354 resulting statefulset specification is ready to be used by Kubernetes.
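
To illustrate the R&D scenario described earlier, a minimal sketch of an
override file is shown here. The file name and the image tag are placeholders
rather than real artifacts; only the `image` and `replicaCount` keys come from
the values.yaml above.

.. code-block:: yaml

  # rnd-values.yaml (hypothetical override file)
  # NEWER-TAG is a placeholder, not an actual image tag
  image: gcr.io/google_samples/k8szk:NEWER-TAG
  replicaCount: 1

Values supplied this way take precedence over the chart's own values.yaml when
the file is passed to Helm with the `-f` option, so the statefulset
specification itself does not need to be cloned.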
 355
 356 When creating a template consider the use of default values if appropriate.
 357 Helm templating has built-in support for default values; here is
 358 an example:
 359
 360 .. code-block:: yaml
 361
 362   imagePullSecrets:
 363   - name: "{{ .Values.nsPrefix | default "onap" }}-docker-registry-key"
 364
 365 The pipeline operator ("|") used here hints at the power of Helm templates:
 366 much like an operating system command line, the pipeline operator allows
 367 over 60 Helm functions to be embedded directly into the template (note that the
 368 Helm template language is a superset of the Go template language).  These
 369 functions include simple string operations like upper and more complex flow
 370 control operations like if/else.
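
A minimal sketch combining a function pipeline with if/else is given here.
The `environment` and `debugEnabled` values and the `-example` name suffix are
hypothetical and only serve the illustration; `default`, `upper` and `quote`
are standard Helm template functions.

.. code-block:: yaml

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: {{ include "common.fullname" . }}-example
    namespace: {{ include "common.namespace" . }}
  data:
    # fall back to "dev" when unset, then upper-case and quote the result
    ENVIRONMENT: {{ .Values.environment | default "dev" | upper | quote }}
    # flow control: pick the log level depending on a (hypothetical) flag
    {{- if .Values.debugEnabled }}
    LOG_LEVEL: "DEBUG"
    {{- else }}
    LOG_LEVEL: "INFO"
    {{- end }}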
 371
 372 OOM is mainly Helm templating. In order to have a consistent deployment of the
 373 different components of ONAP, some rules must be followed.
 374
 375 Templates are provided in order to create Kubernetes resources (Secrets,
 376 Ingress, Services, ...) or parts of Kubernetes resources (names, labels,
 377 resource requests and limits, ...).
 378
 379 A full list with a short description of each template is available in
 380 `kubernetes/common/common/documentation.rst`.
 381
 382 Service template
 383 ----------------
 384
 385 In order to create a Service for a component, you have to create a file (with
 386 `service` in the name).
 387 For a normal service, just add the following line:
 388
 389 .. code-block:: yaml
 390
 391   {{ include "common.service" . }}
 392
 393 For a headless service, add the following line:
 394
 395 .. code-block:: yaml
 396
 397   {{ include "common.headlessService" . }}
 398
 399 The configuration of the service is done in the component `values.yaml`:
 400
 401 .. code-block:: yaml
 402
 403   service:
 404    name: NAME-OF-THE-SERVICE
 405    postfix: MY-POSTFIX
 406    type: NodePort
 407    annotations:
 408      someAnnotationsKey: value
 409    ports:
 410    - name: tcp-MyPort
 411      port: 5432
 412      nodePort: 88
 413    - name: http-api
 414      port: 8080
 415      nodePort: 89
 416    - name: https-api
 417      port: 9443
 418      nodePort: 90
 419
 420 The `annotations` and `postfix` keys are optional.
 421 If `service.type` is `NodePort`, you have to provide a `nodePort` value for each
 422 of your service ports (this is the suffix of the computed nodePort, see example).
 423
 424 It would render the following Service Resource (for a component named
 425 `name-of-my-component`, with version `x.y.z`, helm deployment name
 426 `my-deployment` and `global.nodePortPrefix` `302`):
 427
 428 .. code-block:: yaml
 429
 430   apiVersion: v1
 431   kind: Service
 432   metadata:
 433     annotations:
 434       someAnnotationsKey: value
 435     name: NAME-OF-THE-SERVICE-MY-POSTFIX
 436     labels:
 437       app.kubernetes.io/name: name-of-my-component
 438       helm.sh/chart: name-of-my-component-x.y.z
 439       app.kubernetes.io/instance: my-deployment-name-of-my-component
 440       app.kubernetes.io/managed-by: Tiller
 441   spec:
 442     ports:
 443       - port: 5432
 444         targetPort: tcp-MyPort
 445         nodePort: 30288
 446       - port: 8080
 447         targetPort: http-api
 448         nodePort: 30289
 449       - port: 9443
 450         targetPort: https-api
 451         nodePort: 30290
 452     selector:
 453       app.kubernetes.io/name: name-of-my-component
 454       app.kubernetes.io/instance:  my-deployment-name-of-my-component
 455     type: NodePort
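
The `nodePort` values in the rendered Service above follow from appending the
per-port suffix to `global.nodePortPrefix`. The fragment below only annotates
that relationship for one port; it is a sketch of the inputs, not the template
logic itself.

.. code-block:: yaml

  global:
    nodePortPrefix: 302
  service:
    type: NodePort
    ports:
      - name: http-api
        port: 8080
        # suffix 89 appended to prefix 302 gives the rendered nodePort 30289
        nodePort: 89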
 456
 457 In the deployment or statefulSet file, you need to set the correct labels in
 458 order for the service to match the pods.
 459
 460 Here is an example to make sure it matches (for a statefulSet):
 461
 462 .. code-block:: yaml
 463
 464   apiVersion: apps/v1
 465   kind: StatefulSet
 466   metadata:
 467     name: {{ include "common.fullname" . }}
 468     namespace: {{ include "common.namespace" . }}
 469     labels: {{- include "common.labels" . | nindent 4 }}
 470   spec:
 471     selector:
 472       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 473     # serviceName is only needed for StatefulSet
 474     # add the postfix part only if you have set a postfix on the service name
 475     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 476     <...>
 477     template:
 478       metadata:
 479         labels: {{- include "common.labels" . | nindent 8 }}
 480         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 481         name: {{ include "common.name" . }}
 482       spec:
 483        <...>
 484        containers:
 485          - name: {{ include "common.name" . }}
 486            ports:
 487            {{- range $index, $port := .Values.service.ports }}
 488            - containerPort: {{ $port.port }}
 489              name: {{ $port.name }}
 490            {{- end }}
 491            {{- range $index, $port := .Values.service.headlessPorts }}
 492            - containerPort: {{ $port.port }}
 493              name: {{ $port.name }}
 494            {{- end }}
 495            <...>
 496
 497 The configuration of the headless service is done in the component `values.yaml`:
 498
 499 .. code-block:: yaml
 500
 501   service:
 502    name: NAME-OF-THE-SERVICE
 503    headless:
 504      postfix: NONE
 505      annotations:
 506        anotherAnnotationsKey: value
 507      publishNotReadyAddresses: true
 508    headlessPorts:
 509    - name: tcp-MyPort
 510      port: 5432
 511    - name: http-api
 512      port: 8080
 513    - name: https-api
 514      port: 9443
 515
 516 The `headless.annotations`, `headless.postfix` and
 517 `headless.publishNotReadyAddresses` keys are optional.
 518
 519 If `headless.postfix` is not set, then `-headless` is added at the end of the
 520 service name.
 521
 522 If it is set to `NONE`, there will be no postfix.
 523
 524 If it is set to any other value, for example `something`, `-something` is added
 525 at the end of the service name.
 526
 527 It would render the following Service Resource (for a component named
 528 `name-of-my-component`, with version `x.y.z`, helm deployment name
 529 `my-deployment` and `global.nodePortPrefix` `302`):
 530
 531 .. code-block:: yaml
 532
 533   apiVersion: v1
 534   kind: Service
 535   metadata:
 536     annotations:
 537       anotherAnnotationsKey: value
 538     name: NAME-OF-THE-SERVICE
 539     labels:
 540       app.kubernetes.io/name: name-of-my-component
 541       helm.sh/chart: name-of-my-component-x.y.z
 542       app.kubernetes.io/instance: my-deployment-name-of-my-component
 543       app.kubernetes.io/managed-by: Tiller
 544   spec:
 545     clusterIP: None
 546     ports:
 547       - port: 5432
 548         targetPort: tcp-MyPort
 550       - port: 8080
 551         targetPort: http-api
 553       - port: 9443
 554         targetPort: https-api
 556     publishNotReadyAddresses: true
 557     selector:
 558       app.kubernetes.io/name: name-of-my-component
 559       app.kubernetes.io/instance:  my-deployment-name-of-my-component
 560     type: ClusterIP
 561
 562 The previous StatefulSet example would also match (except for the `postfix`
 563 part).
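
The three `headless.postfix` cases described above can be summarized in a
single annotated fragment. This is a sketch based only on the rules stated in
this section; `something` is an arbitrary illustrative value.

.. code-block:: yaml

  service:
    name: NAME-OF-THE-SERVICE
    headless:
      # postfix omitted    -> NAME-OF-THE-SERVICE-headless
      # postfix: NONE      -> NAME-OF-THE-SERVICE
      # postfix: something -> NAME-OF-THE-SERVICE-something
      postfix: something
    headlessPorts:
      - name: tcp-MyPort
        port: 5432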
 564
 565 Creating Deployment or StatefulSet
 566 ----------------------------------
 567
 568 Deployment and StatefulSet should use the `apps/v1` API version (available
 569 since Kubernetes v1.9).
 570 As seen in the Service template section, the following parts are mandatory:
 571
 572 .. code-block:: yaml
 573
 574   apiVersion: apps/v1
 575   kind: StatefulSet
 576   metadata:
 577     name: {{ include "common.fullname" . }}
 578     namespace: {{ include "common.namespace" . }}
 579     labels: {{- include "common.labels" . | nindent 4 }}
 580   spec:
 581     selector:
 582       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 583     # serviceName is only needed for StatefulSet
 584     # add the postfix part only if you have set a postfix on the service name
 585     serviceName: {{ include "common.servicename" . }}-{{ .Values.service.postfix }}
 586     <...>
 587     template:
 588       metadata:
 589         labels: {{- include "common.labels" . | nindent 8 }}
 590         annotations: {{- include "common.tplValue" (dict "value" .Values.podAnnotations "context" $) | nindent 8 }}
 591         name: {{ include "common.name" . }}
 592       spec:
 593         <...>
 594         containers:
 595           - name: {{ include "common.name" . }}
 596
 597 ONAP Application Configuration
 598 ------------------------------
 599
 600 Dependency Management
 601 ---------------------
 602 These Helm charts describe the desired state
 603 of an ONAP deployment and instruct the Kubernetes container manager as to how
 604 to maintain the deployment in this state.  The dependencies declared between
 605 components dictate the order in which the containers are started for the first
 606 time, so that these dependencies are always met without arbitrary sleep times
 607 between container startups.  For example, the SDC back-end container requires
 608 the Elastic-Search, Cassandra and Kibana containers within SDC to be ready and
 609 also depends on DMaaP (or the message-router) being ready - where ready implies
 610 that the built-in "readiness" probes succeeded - before becoming fully
 611 operational.  When an initial deployment of ONAP is requested the current state
 612 of the system is NULL, so ONAP is deployed by the Kubernetes manager as a set of
 613 Docker containers on one or more predetermined hosts.  The hosts could be
 614 physical machines or virtual machines.  When deploying on virtual machines the
 615 resulting system will be very similar to "Heat" based deployments, i.e. Docker
 616 containers running within a set of VMs, the primary difference being that the
 617 allocation of containers to VMs is done dynamically with OOM and statically with
 618 "Heat".  The following SO deployment descriptor excerpt shows SO's dependency
 619 on its mariadb database component:
 620
 621 SO deployment specification excerpt:
 622
 623 .. code-block:: yaml
 624
 625   apiVersion: apps/v1
 626   kind: Deployment
 627   metadata:
 628     name: {{ include "common.fullname" . }}
 629     namespace: {{ include "common.namespace" . }}
 630     labels: {{- include "common.labels" . | nindent 4 }}
 631   spec:
 632     replicas: {{ .Values.replicaCount }}
 633     selector:
 634       matchLabels: {{- include "common.matchLabels" . | nindent 6 }}
 635     template:
 636       metadata:
 637         labels:
 638           app: {{ include "common.name" . }}
 639           release: {{ .Release.Name }}
 640       spec:
 641         initContainers:
 642         - command:
 643           - /app/ready.py
 644           args:
 645           - --container-name
 646           - so-mariadb
 647           env:
 648   ...
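
The "readiness" probes mentioned above are declared on the container that
other components wait for. The fragment below is a generic Kubernetes sketch,
not taken from an ONAP chart; the port and timing values are hypothetical.

.. code-block:: yaml

  containers:
    - name: {{ include "common.name" . }}
      # the pod is reported as ready only once this probe succeeds
      readinessProbe:
        tcpSocket:
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10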
 649
 650 Kubernetes Container Orchestration
 651 ==================================
 652 The ONAP components are managed by the Kubernetes_ container management system
 653 which maintains the desired state of the container system as described by one
 654 or more deployment descriptors - similar in concept to OpenStack HEAT
 655 Orchestration Templates. The following sections describe the fundamental
 656 objects managed by Kubernetes, the network these components use to communicate
 657 with each other and other entities outside of ONAP and the templates that
 658 describe the configuration and desired state of the ONAP components.
 659
 660 Name Spaces
 661 -----------
 662 Within the namespaces are Kubernetes services that provide external
 663 connectivity to pods that host Docker containers.
 664
 665 ONAP Components to Kubernetes Object Relationships
 666 --------------------------------------------------
 667 Kubernetes deployments consist of multiple objects:
 668
 669 - **nodes** - a worker machine - either physical or virtual - that hosts
 670   multiple containers managed by Kubernetes.
 671 - **services** - an abstraction of a logical set of pods that provide a
 672   micro-service.
 673 - **pods** - one or more (but typically one) container(s) that provide specific
 674   application functionality.
 675 - **persistent volumes** - One or more permanent volumes need to be established
 676   to hold non-ephemeral configuration and state data.
 677
 678 The relationship between these objects is shown in the following figure:
 679
 680 .. .. uml::
 681 ..
 682 ..   @startuml
 683 ..   node PH {
 684 ..      component Service {
 685 ..         component Pod0
 686 ..         component Pod1
 687 ..      }
 688 ..   }
 689 ..
 690 ..   database PV
 691 ..   @enduml
 692
 693 .. figure:: kubernetes_objects.png
 694
 695 OOM uses these Kubernetes objects as described in the following sections.
 696
 697 Nodes
 698 ~~~~~
 699 OOM works with both physical and virtual worker machines.
 700
 701 * Virtual Machine Deployments - If ONAP is to be deployed onto a set of virtual
 702   machines, the creation of the VMs is outside of the scope of OOM and could be
 703   done in many ways, such as
 704
 705   * manually, for example by a user using the OpenStack Horizon dashboard or
 706     AWS EC2, or
 707   * automatically, for example with the use of an OpenStack Heat Orchestration
 708     Template which builds an ONAP stack, Azure ARM template, AWS CloudFormation
 709     Template, or
 710   * orchestrated, for example with Cloudify creating the VMs from a TOSCA
 711     template and controlling their life cycle for the life of the ONAP
 712     deployment.
 713
 714 * Physical Machine Deployments - If ONAP is to be deployed onto physical
 715   machines there are several options but the recommendation is to use Rancher
 716   along with Helm to associate hosts with a Kubernetes cluster.
 717
 718 Pods
 719 ~~~~
 720 A group of containers with shared storage and networking can be grouped
 721 together into a Kubernetes pod.  All of the containers within a pod are
 722 co-located and co-scheduled so they operate as a single unit.  Within the ONAP
 723 Amsterdam release, pods are mapped one-to-one to docker containers although
 724 this may change in the future.  As explained in the Services section below the
 725 use of Pods within each ONAP component is abstracted from other ONAP
 726 components.
 727
 728 Services
 729 ~~~~~~~~
 730 OOM uses the Kubernetes service abstraction to provide a consistent access
 731 point for each of the ONAP components independent of the pod or container
 732 architecture of that component.  For example, the SDNC component may introduce
 733 OpenDaylight clustering at some point and change the number of pods in this
 734 component to three or more but this change will be isolated from the other ONAP
 735 components by the service abstraction.  A service can include a load balancer
 736 on its ingress to distribute traffic between the pods and even react to dynamic
 737 changes in the number of pods if they are part of a replica set.
 738
 739 Persistent Volumes
 740 ~~~~~~~~~~~~~~~~~~
 741 To enable ONAP to be deployed into a wide variety of cloud infrastructures a
 742 flexible persistent storage architecture, built on Kubernetes persistent
 743 volumes, provides the ability to define the physical storage in a central
 744 location and have all ONAP components securely store their data.
 745
 746 When deploying ONAP into a public cloud, available storage services such as
 747 `AWS Elastic Block Store`_, `Azure File`_, or `GCE Persistent Disk`_ are
 748 options.  Alternatively, when deploying into a private cloud the storage
 749 architecture might consist of Fibre Channel, `Gluster FS`_, or iSCSI. Many
 750 other storage options exist; refer to the `Kubernetes Storage Class`_
 751 documentation for a full list of the options. The storage architecture may vary
 752 from deployment to deployment but in all cases a reliable, redundant storage
 753 system must be provided to ONAP in which the state information of all ONAP
 754 components will be securely stored. The Storage Class for a given deployment is
 755 a single parameter listed in the ONAP values.yaml file and therefore is easily
 756 customized. Operation of this storage system is outside the scope of OOM.
 757
 758 .. code-block:: yaml
 759
 760   Insert values.yaml code block with storage block here
 761
 762 Once the storage class is selected and the physical storage is provided, the
 763 ONAP deployment step creates a pool of persistent volumes within the given
 764 physical storage that is used by all of the ONAP components. ONAP components
 765 simply make a claim on these persistent volumes (PV), with a persistent volume
 766 claim (PVC), to gain access to their storage.
 767
 768 The following figure illustrates the relationships between the persistent
 769 volume claims, the persistent volumes, the storage class, and the physical
 770 storage.
 771
 772 .. graphviz::
 773
 774    digraph PV {
 775       label = "Persistent Volume Claim to Physical Storage Mapping"
 776       {
 777          node [shape=cylinder]
 778          D0 [label="Drive0"]
 779          D1 [label="Drive1"]
 780          Dx [label="Drivex"]
 781       }
 782       {
 783          node [shape=Mrecord label="StorageClass:ceph"]
 784          sc
 785       }
 786       {
 787          node [shape=point]
 788          p0 p1 p2
 789          p3 p4 p5
 790       }
 791       subgraph clusterSDC {
 792          label="SDC"
 793          PVC0
 794          PVC1
 795       }
 796       subgraph clusterSDNC {
 797          label="SDNC"
 798          PVC2
 799       }
 800       subgraph clusterSO {
 801          label="SO"
 802          PVCn
 803       }
 804       PV0 -> sc
 805       PV1 -> sc
 806       PV2 -> sc
 807       PVn -> sc
 808
 809       sc -> {D0 D1 Dx}
 810       PVC0 -> PV0
 811       PVC1 -> PV1
 812       PVC2 -> PV2
 813       PVCn -> PVn
 814
 815       # force all of these nodes to the same line in the given order
 816       subgraph {
 817          rank = same; PV0;PV1;PV2;PVn;p0;p1;p2
 818          PV0->PV1->PV2->p0->p1->p2->PVn [style=invis]
 819       }
 820
 821       subgraph {
 822          rank = same; D0;D1;Dx;p3;p4;p5
 823          D0->D1->p3->p4->p5->Dx [style=invis]
 824       }
 825
 826    }
 827
In order for an ONAP component to use a persistent volume, it must make a claim
against a specific persistent volume defined in the ONAP common charts.  Note
that there is a one-to-one relationship between a PVC and a PV.  The following
is an excerpt from a component chart that defines a PVC:
 832
 833 .. code-block:: yaml
 834
 835   Insert PVC example here
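
As a minimal sketch of what such a claim can look like, the following uses
only standard Kubernetes fields.  The claim name, label, size, and the ``ceph``
storage class are hypothetical placeholders, not values taken from an ONAP
component chart:

.. code-block:: yaml

  # Illustrative PersistentVolumeClaim; all names and sizes are assumptions.
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-component-data
    labels:
      app: example-component
  spec:
    # Must name a StorageClass available in the target cluster
    storageClassName: ceph
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi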
 836
 837 OOM Networking with Kubernetes
 838 ------------------------------
 839
 840 - DNS
- Ports - Flattening the containers also exposes port conflicts between the
  containers, which need to be resolved.
 843
 844 Node Ports
 845 ~~~~~~~~~~
 846
 847 Pod Placement Rules
 848 -------------------
 849 OOM will use the rich set of Kubernetes node and pod affinity /
 850 anti-affinity rules to minimize the chance of a single failure resulting in a
 851 loss of ONAP service. Node affinity / anti-affinity is used to guide the
 852 Kubernetes orchestrator in the placement of pods on nodes (physical or virtual
 853 machines).  For example:
 854
- if a container uses Intel DPDK technology, the pod may state that it has
  affinity to an Intel processor based node, or
- geography-based node labels (such as the Kubernetes standard zone or
  region labels) may be used to ensure placement of a DCAE complex close to the
  VNFs generating high volumes of traffic, thus minimizing networking cost.
  Specifically, if nodes were pre-assigned labels East and West, the pod
  deployment spec to distribute pods to these nodes would be:
 862
 863 .. code-block:: yaml
 864
 865   nodeSelector:
    failure-domain.beta.kubernetes.io/region: {{ .Values.location }}
 867
Here "location: West" is specified in the `values.yaml` file used to deploy
one DCAE cluster and "location: East" is specified in a second `values.yaml`
file (see OOM Configuration Management for more information about
configuration files like the `values.yaml` file).
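
A minimal sketch of the two environment files described above.  The file names
are hypothetical; only the ``location`` key and its values come from the text:

.. code-block:: yaml

  # values-west.yaml (hypothetical file name): deploys the first DCAE cluster
  location: West

.. code-block:: yaml

  # values-east.yaml (hypothetical file name): deploys the second DCAE cluster
  location: East

For the selector to match, each node must already carry the region label with
the corresponding value.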
 872
 873 Node affinity can also be used to achieve geographic redundancy if pods are
 874 assigned to multiple failure domains. For more information refer to `Assigning
 875 Pods to Nodes`_.
 876
 877 .. note::
 878    One could use Pod to Node assignment to totally constrain Kubernetes when
 879    doing initial container assignment to replicate the Amsterdam release
 880    OpenStack Heat based deployment. Should one wish to do this, each VM would
   need a unique node name which would be used to specify a node constraint
   for every component.  These assignments could be specified in an environment
 883    specific values.yaml file. Constraining Kubernetes in this way is not
 884    recommended.
 885
Kubernetes has a comprehensive system called Taints and Tolerations that can be
used to force the container orchestrator to repel pods from nodes based on
static events (an administrator assigning a taint to a node) or dynamic events
(such as a node becoming unreachable or running out of disk space). There are
no plans to use taints or tolerations in the ONAP Beijing release.

Pod affinity / anti-affinity is the concept of creating a spatial relationship
between pods when the Kubernetes orchestrator assigns them to nodes (both
initially and in operation), as explained in Inter-pod affinity and
anti-affinity. For example, one might choose to co-locate all of the ONAP SDC
containers on a single node as they are not critical runtime components and
co-location minimizes overhead. On the other hand, one might choose to ensure
that all of the containers in an ODL cluster (SDNC and APPC) are placed on
separate nodes such that a node failure has minimal impact on the operation of
the cluster. An example of pod affinity / anti-affinity is shown below:
 900
 901 Pod Affinity / Anti-Affinity
 902
 903 .. code-block:: yaml
 904
  apiVersion: v1
  kind: Pod
  metadata:
    name: with-pod-affinity
  spec:
    affinity:
      podAffinity:
        # Hard rule: the pod is only scheduled where this term is satisfied
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S1
          topologyKey: failure-domain.beta.kubernetes.io/zone
      podAntiAffinity:
        # Soft rule: the scheduler tries to honour it but may not
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: security
                operator: In
                values:
                - S2
            topologyKey: kubernetes.io/hostname
    containers:
    - name: with-pod-affinity
      image: gcr.io/google_containers/pause:2.0
 934
This example contains both podAffinity and podAntiAffinity rules. The first
rule is a must (requiredDuringSchedulingIgnoredDuringExecution) while the
second will be met pending other considerations
(preferredDuringSchedulingIgnoredDuringExecution).

Preemption
~~~~~~~~~~

Another feature that may assist in achieving a repeatable deployment in the
presence of faults that may have reduced the capacity of the cloud is assigning
priority to the containers such that mission critical components have the
ability to evict less critical components.  Kubernetes provides this capability
with Pod Priority and Preemption.  Prior to having more advanced production
grade features available, the ability to at least re-deploy ONAP (or a subset
of it) reliably provides a level of confidence that, should an outage occur,
the system can be brought back on-line predictably.
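
A minimal sketch of how such priorities could be expressed with Kubernetes Pod
Priority and Preemption.  The class names, priority values, and the pod shown
are illustrative assumptions and are not part of the OOM charts:

.. code-block:: yaml

  # A higher value means a higher priority; pods of this class may preempt
  # pods of lower priority when the cluster is short of capacity.
  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: onap-critical          # hypothetical name
  value: 1000000
  globalDefault: false
  description: "Mission critical ONAP components"
  ---
  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: onap-best-effort       # hypothetical name
  value: 1000
  globalDefault: false
  description: "Components that may be evicted under resource pressure"
  ---
  # A pod opts in by referencing a priority class by name.
  apiVersion: v1
  kind: Pod
  metadata:
    name: example-critical-pod   # hypothetical name
  spec:
    priorityClassName: onap-critical
    containers:
    - name: example
      image: gcr.io/google_containers/pause:2.0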
 947
 948 Health Checks
 949 -------------
 950
Monitoring of ONAP components is configured in the agents within JSON files
stored in gerrit under consul-agent-config. Here is an example from the AAI
model loader (aai-model-loader-health.json):
 954
 955 .. code-block:: json
 956
 957   {
 958     "service": {
 959       "name": "A&AI Model Loader",
 960       "checks": [
 961         {
 962           "id": "model-loader-process",
 963           "name": "Model Loader Presence",
 964           "script": "/consul/config/scripts/model-loader-script.sh",
 965           "interval": "15s",
 966           "timeout": "1s"
 967         }
 968       ]
 969     }
 970   }
 971
 972 Liveness Probes
 973 ---------------
 974
Liveness probes can simply check that a port is available, that a
built-in health check is reporting good health, or that the Consul health check
is positive.  For example, the following liveness probe, which monitors the
SDNC database, can be found in the SDNC DB deployment specification:
 979
 980 .. code-block:: yaml
 981
  # SDNC DB liveness probe
  livenessProbe:
    exec:
      command: ["mysqladmin", "ping"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
 989
'initialDelaySeconds' controls how long Kubernetes waits after the container
starts before running the first liveness probe. 'periodSeconds' and
'timeoutSeconds' control the actual operation of the probe.  Note that
containers are inherently ephemeral, so the healing action destroys failed
containers and any state information within them.  To avoid a loss of state, a
persistent volume should be used to store all data that needs to be persisted
over the re-creation of a container.  Persistent volumes have been created for
the database components of each of the projects and the same technique can be
used for all persistent state information.
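
The other probe styles mentioned above, a port check and a built-in health
check, can be sketched as follows.  The port numbers and the ``/healthz`` path
are hypothetical and do not come from an ONAP chart:

.. code-block:: yaml

  # Variant 1: the container is considered alive while the port accepts
  # TCP connections.
  livenessProbe:
    tcpSocket:
      port: 8080                 # hypothetical port
    initialDelaySeconds: 30
    periodSeconds: 10

.. code-block:: yaml

  # Variant 2: the container is considered alive while the endpoint
  # returns a success status code (200-399).
  livenessProbe:
    httpGet:
      path: /healthz             # hypothetical path
      port: 8080                 # hypothetical port
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5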
 999
1000
1001
1002 Environment Files
1003 ~~~~~~~~~~~~~~~~~
1004
1005 MSB Integration
1006 ===============
1007
1008 The \ `Microservices Bus
1009 Project <https://wiki.onap.org/pages/viewpage.action?pageId=3246982>`__ provides
1010 facilities to integrate micro-services into ONAP and therefore needs to
1011 integrate into OOM - primarily through Consul which is the backend of
1012 MSB service discovery. The following is a brief description of how this
1013 integration will be done:
1014
A registrator pushes the service endpoint info to MSB service discovery:

-  The needed service endpoint info is put into the Kubernetes YAML file
   as an annotation, including service name, protocol, version, visual
   range, LB method, IP, port, etc.

-  OOM deploys/starts/restarts/scales in/scales out/upgrades ONAP components.

-  Registrator watches the Kubernetes events.

-  When an ONAP component instance has been started/destroyed by OOM,
   Registrator gets the notification from Kubernetes.

-  Registrator parses the service endpoint info from the annotation and
   registers/updates/unregisters it with MSB service discovery.

-  MSB API Gateway uses the service endpoint info for service routing
   and load balancing.
1034
1035 Details of the registration service API can be found at \ `Microservice
1036 Bus API
1037 Documentation <https://wiki.onap.org/display/DW/Microservice+Bus+API+Documentation>`__.
1038
1039 ONAP Component Registration to MSB
1040 ----------------------------------
The charts of all ONAP components intending to register against MSB must have
an annotation in their service(s) template.  An `sdc` example follows:
1043
1044 .. code-block:: yaml
1045
1046   apiVersion: v1
1047   kind: Service
1048   metadata:
1049     labels:
1050       app: sdc-be
1051     name: sdc-be
1052     namespace: "{{ .Values.nsPrefix }}"
1053     annotations:
1054       msb.onap.org/service-info: '[
1055         {
1056             "serviceName": "sdc",
1057             "version": "v1",
1058             "url": "/sdc/v1",
1059             "protocol": "REST",
1060             "port": "8080",
1061             "visualRange":"1"
1062         },
1063         {
1064             "serviceName": "sdc-deprecated",
1065             "version": "v1",
1066             "url": "/sdc/v1",
1067             "protocol": "REST",
1068             "port": "8080",
1069             "visualRange":"1",
1070             "path":"/sdc/v1"
1071         }
1072         ]'
1073   ...
1074
1075
1076 MSB Integration with OOM
1077 ------------------------
1078 A preliminary view of the OOM-MSB integration is as follows:
1079
1080 .. figure:: MSB-OOM-Diagram.png
1081
1082 A message sequence chart of the registration process:
1083
1084 .. uml::
1085
1086   participant "OOM" as oom
1087   participant "ONAP Component" as onap
1088   participant "Service Discovery" as sd
1089   participant "External API Gateway" as eagw
1090   participant "Router (Internal API Gateway)" as iagw
1091
1092   box "MSB" #LightBlue
1093     participant sd
1094     participant eagw
1095     participant iagw
1096   end box
1097
  == Deploy Service ==
1099
1100   oom -> onap: Deploy
1101   oom -> sd:   Register service endpoints
1102   sd -> eagw:  Services exposed to external system
1103   sd -> iagw:  Services for internal use
1104
1105   == Component Life-cycle Management ==
1106
1107   oom -> onap: Start/Stop/Scale/Migrate/Upgrade
1108   oom -> sd:   Update service info
1109   sd -> eagw:  Update service info
1110   sd -> iagw:  Update service info
1111
1112   == Service Health Check ==
1113
1114   sd -> onap: Check the health of service
1115   sd -> eagw: Update service status
1116   sd -> iagw: Update service status
1117
1118
1119 MSB Deployment Instructions
1120 ---------------------------
MSB is a Helm-installable ONAP component which is often automatically deployed.
To install it individually enter::
1123
1124   > helm install <repo-name>/msb
1125
1126 .. note::
  TBD: Validate if the following procedure is still required.
1128
Please note that a Kubernetes authentication token must be set in
*kubernetes/kube2msb/values.yaml* so that the kube2msb registrator can
watch Kubernetes events and read service annotations through the
Kubernetes APIs. The token can be found in the kubectl configuration file
*~/.kube/config*.

More details can be found in the `MSB installation <https://docs.onap.org/projects/onap-msb-apigateway/en/latest/platform/installation.html>`_ guide.
1136
1137 .. MISC
1138 .. ====
1139 .. Note that although OOM uses Kubernetes facilities to minimize the effort
1140 .. required of the ONAP component owners to implement a successful rolling
1141 .. upgrade strategy there are other considerations that must be taken into
1142 .. consideration.
1143 .. For example, external APIs - both internal and external to ONAP - should be
1144 .. designed to gracefully accept transactions from a peer at a different
1145 .. software version to avoid deadlock situations. Embedded version codes in
1146 .. messages may facilitate such capabilities.
1147 ..
1148 .. Within each of the projects a new configuration repository contains all of
1149 .. the project specific configuration artifacts.  As changes are made within
1150 .. the project, it's the responsibility of the project team to make appropriate
1151 .. changes to the configuration data.