.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. Copyright 2019 Samsung Electronics Co., Ltd.

OOM ONAP Offline Installer - Installation Guide
===============================================

This document describes the offline installation procedure for `OOM ONAP`_, which is performed by the Ansible-based `offline-installer <https://gerrit.onap.org/r/#/admin/projects/oom/offline-installer>`_.

Before you dive into the installation, you should prepare the offline installer itself. The installer consists of at least two packages/resources; the `Build Guide`_ provides the instructions for creating them.

This version of the *Installation Guide* supports the `El Alto release`_.

.. _oooi_installguide_preparations:

OOM ONAP deployment has certain hardware resource requirements; see `El Alto requirements`_.

The community-recommended footprint from the `El Alto requirements`_ page is 16 VMs, ``224 GB RAM`` and ``112 vCPUs``. Because this setup is so resource-demanding, we will not follow it strictly and will deploy our installation across four nodes (VMs) instead of sixteen. Our simplified setup is definitely not supported or recommended. You are free to diverge: you can follow the official guidelines or choose a completely different layout, but the node count should not drop below three. Otherwise you may have to do some tweaking to make it work, which is not covered here (there is a pod count limit for a single kubernetes node; you can read more about it in this `discussion <https://lists.onap.org/g/onap-discuss/topic/oom_110_kubernetes_pod/25213556>`_).

.. _oooi_installguide_preparations_k8s_cluster:

The four nodes/VMs will be running these services:

- kubernetes-control-plane

**NOTE:** The kubernetes-* roles can be collocated directly with the kubernetes nodes; they do not have to run on the infra node.

- **kubernetes node 1-3**:

You don't need to care about these services now; they are the responsibility of the installer (described below). Just start four VMs as shown in this table (or according to your needs, as hinted above):

.. _Overview table of the kubernetes cluster:

Kubernetes cluster overview
^^^^^^^^^^^^^^^^^^^^^^^^^^^

In El Alto we use RKE as the k8s orchestration method; however, everyone is free to diverge from this example and set the cluster up in their own way, omitting our rke playbook execution.

=================== ================== ==================== ============== ============ ===============
KUBERNETES NODE     OS                 NETWORK              CPU            RAM          STORAGE
=================== ================== ==================== ============== ============ ===============
**infra-node**      RHEL/CentOS 7.6    ``10.8.8.100/24``    ``8 vCPUs``    ``8 GB``     ``100 GB``
**kube-node1**      RHEL/CentOS 7.6    ``10.8.8.101/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
**kube-node2**      RHEL/CentOS 7.6    ``10.8.8.102/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
**kube-node3**      RHEL/CentOS 7.6    ``10.8.8.103/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
SUM                                                         ``56 vCPUs``   ``176+ GB``  ``400 GB``
=========================================================== ============== ============ ===============

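The SUM row can be cross-checked with a little shell arithmetic. This is only a sketch of the bookkeeping for the four-node layout in the table; the variable names are ours, and the RAM figure is a lower bound because the kube-nodes are listed as ``56+ GB``.

```shell
#!/bin/sh
# Per-node figures copied from the table above (variable names are illustrative).
kube_nodes=3
infra_vcpus=8;  kube_vcpus=16
infra_ram_gb=8; kube_ram_gb=56     # "56+ GB" in the table, so this is a minimum
node_storage_gb=100

# One infra-node plus three kube-nodes.
total_vcpus=$((infra_vcpus + kube_nodes * kube_vcpus))
total_ram_gb=$((infra_ram_gb + kube_nodes * kube_ram_gb))
total_storage_gb=$(((kube_nodes + 1) * node_storage_gb))

echo "vCPUs: ${total_vcpus}, RAM: ${total_ram_gb}+ GB, storage: ${total_storage_gb} GB"
```

If you choose a different node count or node size, adjusting the first few variables gives the new totals to compare against the official requirements.
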
As of now, the offline installer supports only the **RHEL 7.x** and **CentOS 7.6** distributions, with at least the *@core* and *@base* package groups installed, including the *Mandatory* and *Default* package sets. Your VMs should therefore be preinstalled with one of these operating systems; the hypervisor and platform can be of your choosing.

From now on we will assume that you have installed four VMs and that they are connected to a shared network. All VMs must be reachable from our *install-server* (below), which can be the hypervisor, the *infra-node* or a completely different machine. In any of these cases, the *install-server* must be able to connect over ssh to all of these nodes.

.. _oooi_installguide_preparations_installserver:

We will use a distinct *install-server* and keep it separate from the four-node cluster. If you wish, you can use the *infra-node* for this purpose (provided you use the default ``'chroot'`` option of the installer), but in that case double its storage requirement!

Prerequisites for the *install-server*:

- packages described in the `Build Guide`_
- an extra ``100 GB`` of storage (space in which to store these packages)
- the ``'chroot'`` and/or ``'docker'`` system commands installed
- a network connection to the nodes, in particular a functioning ssh client

Our *install-server* will have the ip: ``10.8.8.4``.

**NOTE:** All subsequent commands are executed from within this *install-server*.

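The prerequisites above can be checked quickly before going further. The following is a minimal sketch, not part of the offline-installer: the helper names ``missing_commands`` and ``free_gb`` are hypothetical, and it only assumes a POSIX shell with ``df`` and ``awk``.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Print the free space, in whole GB, on the filesystem holding the given directory.
# "df -P" keeps each filesystem on one line; column 4 is the available space in KB.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}
```

For example, ``missing_commands ssh chroot`` prints nothing when both commands are available, and ``free_gb /data`` shows whether the directory used for the packages has the extra ``100 GB``. Run the check as the user who will run the installer, since ``chroot`` is often only on the ``PATH`` of ``root``.
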
.. _oooi_installguide_config:

Part 2. Preparation and configuration
-------------------------------------

All of the following instructions *MUST* be carried out from the *install-server*. We will run them as the ``root`` user, although that is not necessary: you can use a regular user without any problem. The ssh/ansible connection to the nodes will also assume that we connect as ``root``, because you need elevated privileges to be able to install on them. Although this can be achieved by other means (sudo), we decided to keep the instructions simple.

.. _oooi_installguide_config_packages:

Installer packages
~~~~~~~~~~~~~~~~~~

As stated above, you must have prepared the installer packages (the names will differ; check the `Build Guide`_):

- sw_package.tar
- resources_package.tar
- aux_package.tar

**NOTE:** ``'aux_package.tar'`` is optional; if you have no use for it, you can ignore it.

We will store them in the ``/data`` directory on the *install-server* and then unpack the ``'sw'`` package, for example into your home directory::

    $ mkdir ~/onap-offline-installer
    $ tar -C ~/onap-offline-installer -xf /data/sw_package.tar

.. _oooi_installguide_config_app:

Application directory
~~~~~~~~~~~~~~~~~~~~~

Change the current directory to ``'ansible'``::

    $ cd ~/onap-offline-installer/ansible

You can see multiple files and directories inside. This is the *offline-installer*, which is implemented as a set of ansible playbooks.

If you created the ``'sw'`` package according to the *Build Guide*, then the *offline-installer* should be populated with at least the following files:

- ``application/application_configuration.yml``
- ``inventory/hosts.yml``

The following paragraphs describe fine-tuning of ``'hosts.yml'`` and ``'application_configuration.yml'`` to reflect your VM setup.

.. _oooi_installguide_config_hosts:

We need to set up ``'hosts.yml'`` first. The template looks like this::

    # This group contains hosts with all resources (binaries, packages, etc.)

    # this key is supposed to be generated during setup.yml playbook execution
    # change it just when you have better one working for all nodes
    ansible_ssh_private_key_file: /root/.ssh/offline_ssh_key
    ansible_ssh_common_args: '-o StrictHostKeyChecking=no'

    ansible_host: 10.8.8.5

    # This is group of hosts where nexus, nginx, dns and all other required
    # services are running.

    infrastructure-server:
      ansible_host: 10.8.8.13
      #IP used for communication between infra and kubernetes nodes, must be specified.
      cluster_ip: 10.8.8.13

    # This is group of hosts which are/will be part of Kubernetes cluster.

    # This is a group of hosts containing kubernetes worker nodes.

    ansible_host: 10.8.8.19
    #ip of the node that it uses for communication with k8s cluster.
    cluster_ip: 10.8.8.19

    # Group of hosts containing etcd cluster nodes.

    infrastructure-server

    # This is a group of hosts that are to be used as kubernetes control plane nodes.
    # This means they host kubernetes api server, controller manager and scheduler.
    # This example uses infra for this purpose, however note that any
    # other host could be used including kubernetes nodes.
    # cluster_ip needs to be set for hosts used as control planes.
    kubernetes-control-plane:

    infrastructure-server

There is some ssh configuration under the ``'vars'`` section; we will deal with the ssh setup a little later in `SSH authentication`_.

We first need to correct the ip addresses and add a couple of kubernetes nodes to match our four-node cluster:

- Under ``'resource-host'``, set the ``'ansible_host'`` address to the ip of the server where the packages are stored. It must be reachable by ssh from the *install-server* (for ansible to run playbooks on it) **AND** from the *infra-node* (to extract resource data from the *resource-host* to the *infra-node* over ssh). In our scenario the *resource-host* is the same as the *install-server*: ``'10.8.8.4'``
- Similarly, set ``'ansible_host'`` under ``'infrastructure-server'`` to the address of the *infra-node*.
- Copy the whole ``'kubernetes-node-1'`` subsection and paste it twice directly after. Change the numbers to ``'kubernetes-node-2'`` and ``'kubernetes-node-3'`` respectively, and fix the addresses in the ``'ansible_host'`` variables to match *kube-node1*, *kube-node2* and *kube-node3*.

As you can see, there is another variable, ``'cluster_ip'``, for each node. It serves as the designated node address in the kubernetes cluster. Make it the same as the respective ``'ansible_host'``.

**NOTE:** In our simple setup we have only one interface per node, but that need not be the case for other deployments, especially once we start dealing with production usage. Basically, ``'ansible_host'`` is the entry point for the *install-server's* ansible (*offline-installer*), but the kubernetes cluster can communicate on a separate network to which the *install-server* has no access. That is why we have this distinct variable: it tells the installer that there is a different network on which we want to run the kubernetes traffic, and what address each node has on that network.

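Instead of copying the ``'kubernetes-node-1'`` subsection by hand, the repeated entries can be generated. The sketch below is only an illustration under our example addressing (node N has the address ``10.8.8.(100 + N)`` and ``'cluster_ip'`` equals ``'ansible_host'``); the function name ``print_node_entries`` is hypothetical, and the output still has to be indented to the nesting level used in your ``'hosts.yml'``.

```shell
#!/bin/sh
# Print inventory entries for kubernetes-node-1..N.
# Assumption: node N has the address <prefix>.(100 + N), as in our example
# network, and cluster_ip is the same as ansible_host.
print_node_entries() {
    prefix=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        ip="${prefix}.$((100 + n))"
        printf 'kubernetes-node-%s:\n' "$n"
        printf '  ansible_host: %s\n' "$ip"
        printf '  cluster_ip: %s\n' "$ip"
        n=$((n + 1))
    done
}

print_node_entries 10.8.8 3
```

If the kubernetes traffic runs on a separate network, the ``cluster_ip`` line would need a second prefix; that variation is left out here to keep the sketch short.
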
After all the changes, the ``'hosts.yml'`` should look similar to this::

    # This group contains hosts with all resources (binaries, packages, etc.)

    # this key is supposed to be generated during setup.yml playbook execution
    # change it just when you have better one working for all nodes
    ansible_ssh_private_key_file: /root/.ssh/offline_ssh_key
    ansible_ssh_common_args: '-o StrictHostKeyChecking=no'

    ansible_host: 10.8.8.4

    # This is group of hosts where nexus, nginx, dns and all other required
    # services are running.

    infrastructure-server:
      ansible_host: 10.8.8.100
      #IP used for communication between infra and kubernetes nodes, must be specified.
      cluster_ip: 10.8.8.100

    # This is group of hosts which are/will be part of Kubernetes cluster.

    # This is a group of hosts containing kubernetes worker nodes.

    ansible_host: 10.8.8.101
    #ip of the node that it uses for communication with k8s cluster.
    cluster_ip: 10.8.8.101

    ansible_host: 10.8.8.102
    #ip of the node that it uses for communication with k8s cluster.
    cluster_ip: 10.8.8.102

    ansible_host: 10.8.8.103
    #ip of the node that it uses for communication with k8s cluster.
    cluster_ip: 10.8.8.103

    # Group of hosts containing etcd cluster nodes.

    infrastructure-server

    # This is a group of hosts that are to be used as kubernetes control plane nodes.
    # This means they host kubernetes api server, controller manager and scheduler.
    # This example uses infra for this purpose, however note that any
    # other host could be used including kubernetes nodes.
    # cluster_ip needs to be set for hosts used as control planes.
    kubernetes-control-plane:

    infrastructure-server

.. _oooi_installguide_config_appconfig:

application_configuration.yml
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here, we will be interested in the following variables:

- ``resource_dir``
- ``resources_filename``
- ``aux_resources_filename``
- ``app_data_path``
- ``aux_data_path``
- ``app_name``
- ``timesync``

``'resource_dir'``, ``'resources_filename'`` and ``'aux_resources_filename'`` must correspond to the file paths on the *resource-host* (variable ``'resource_host'``), which in our case is the *install-server*.

The ``'resource_dir'`` should be set to ``'/data'``, ``'resources_filename'`` to ``'resources_package.tar'`` and ``'aux_resources_filename'`` to ``'aux_package.tar'``. The values should match those in the `Installer packages`_ section.

``'app_data_path'`` is the absolute path on the *infra-node* to which the package ``'resources_package.tar'`` will be extracted; similarly, ``'aux_data_path'`` is another absolute path for ``'aux_package.tar'``. Both paths are fully arbitrary, but they should point to a filesystem with enough space (see the storage requirement in the `Overview table of the kubernetes cluster`_).

**NOTE:** As mentioned in `Installer packages`_, the auxiliary package is not mandatory and we will not use it here either.

The ``'app_name'`` variable should be short and descriptive. We will set it simply to: ``onap``.

The ``'timesync'`` variable is optional and controls synchronisation of the system clock on the hosts. It should be configured only if a custom NTP server is available and needed. Such a time authority should be on a host reachable from all installation nodes. If this setting is not provided, the default behavior is to set up an NTP daemon on the infra-node and sync the time of all kube-nodes with it.

If you wish to provide your own NTP servers, configure their IPs as follows::

    timesync:
      servers:
        - <ip address of NTP_1>

        - <ip address of NTP_N>

Other time-adjustment variables are ``'timesync.slewclock'`` and ``'timesync.timezone'``.
The first one can have the value ``'true'`` or ``'false'`` (default). It controls whether, in case of a large time difference compared to the server, the time should be adjusted gradually by slowing down or speeding up the clock as required (``'true'``) or in one step (``'false'``).

The second one controls the time zone setting on the host. Its value should be a time zone name from the tz database, with ``'Universal'`` being the default.

The ``'timesync.servers'``, ``'timesync.slewclock'`` and ``'timesync.timezone'`` settings can be used independently.

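To show how the three settings fit together, the sketch below prints one combined ``timesync`` section. It is an illustration only: the nesting is inferred from the dotted names ``timesync.servers``, ``timesync.slewclock`` and ``timesync.timezone``, the helper name ``timesync_block`` is hypothetical, and the ``192.0.2.x`` addresses are documentation placeholders rather than real NTP servers.

```shell
#!/bin/sh
# Print a timesync section for application_configuration.yml.
# Usage: timesync_block <slewclock> <timezone> [<ntp-server-ip>...]
# The servers list is omitted when no NTP server is given, since the
# three settings can be used independently.
timesync_block() {
    slewclock=$1
    timezone=$2
    shift 2
    echo "timesync:"
    if [ "$#" -gt 0 ]; then
        echo "  servers:"
        for server in "$@"; do
            echo "    - ${server}"
        done
    fi
    echo "  slewclock: ${slewclock}"
    echo "  timezone: ${timezone}"
}

# Placeholder addresses; replace them with your own NTP servers.
timesync_block false Universal 192.0.2.10 192.0.2.11
```

The printed block can be reviewed and then pasted into ``application_configuration.yml``; compare it with the template shipped in your ``'sw'`` package before relying on it.
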
Final configuration can resemble the following::

    resources_filename: resources_package.tar
    app_data_path: /opt/onap

.. _oooi_installguide_config_appconfig_overrides:

Helm chart value overrides
^^^^^^^^^^^^^^^^^^^^^^^^^^

In El Alto the OOM charts come with all ONAP components disabled; this setting is also prepackaged within our sw_package.tar. Fortunately, there are several supported ways to override it. Overrides are also necessary for setting up VIM-specific entries and, more generally, for configuring anything with non-default values.

The first option is to use the ``overrides`` key in ``application_configuration.yml``.
These settings will override the helm values originally stored in the ``values.yaml`` files in the helm chart directories.

For example, the following lines could be appended to ``application_configuration.yml`` to set up managed openstack credentials for onap's so component::

    openStackUserName: "os_user"
    openStackRegion: "region_name"
    openStackKeyStoneUrl: "keystone_url"
    openStackEncryptedPasswordHere: "encrypted_password"

In addition, or as an alternative, one can configure the ``helm_override_files`` key, a new feature implemented in Change-Id: I8b8ded38b39aa9a75e55fc63fa0e11b986556cb8.

.. _oooi_installguide_config_ssh:

SSH authentication
~~~~~~~~~~~~~~~~~~

We are almost finished with the configuration and close to starting the installation, but first we need to set up password-less login from the *install-server* to the nodes.

You can use the ansible playbook ``'setup.yml'`` like this::

    $ ./run_playbook.sh -i inventory/hosts.yml setup.yml -u root --ask-pass

You will be asked for the password for each node, and the playbook will generate an unprotected ssh key-pair ``'~/.ssh/offline_ssh_key'``, which will be distributed to the nodes.

Another option is to generate an ssh key-pair manually. We strongly advise you to protect it with a passphrase, but for simplicity we will showcase generating a private key without any such protection::

    $ ssh-keygen -N "" -f ~/.ssh/identity

The next step is to distribute the public key to these nodes; from that point on no password is needed::

    $ for ip in 100 101 102 103 ; do ssh-copy-id -i ~/.ssh/identity.pub root@10.8.8.${ip} ; done

This command behaves almost identically to the ``'setup.yml'`` playbook.

If you generated the ssh key manually, you can now run the ``'setup.yml'`` playbook like this and achieve the same result as in the first execution::

    $ ./run_playbook.sh -i inventory/hosts.yml setup.yml

This time it should not ask you for any password. Of course, this is redundant, because you have now distributed two ssh keys for no good reason.

We can finally edit and finish the configuration of the ``'hosts.yml'``:

- if you used the ``'setup.yml'`` playbook, you can leave this line as it is::

    ansible_ssh_private_key_file: /root/.ssh/offline_ssh_key

- if you created an ssh key manually, change it like this::

    ansible_ssh_private_key_file: /root/.ssh/identity

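Before starting the installation it can be worth confirming that key-based login really works on every node. The following is a minimal sketch, not part of the offline-installer; the function name ``unreachable_hosts`` is hypothetical. It relies on the standard OpenSSH option ``BatchMode=yes``, which makes ``ssh`` fail instead of prompting for a password, and it disables strict host key checking in the same way as ``ansible_ssh_common_args`` in the inventory.

```shell
#!/bin/sh
# Print the hosts that do NOT accept key-based root login with the given key.
# Usage: unreachable_hosts <private-key-file> <host>...
unreachable_hosts() {
    key=$1
    shift
    for host in "$@"; do
        # "true" is run remotely; only the exit status of ssh matters here.
        ssh -i "$key" -o BatchMode=yes -o ConnectTimeout=5 \
            -o StrictHostKeyChecking=no "root@${host}" true \
            </dev/null >/dev/null 2>&1 || echo "$host"
    done
}
```

With our example addresses, ``unreachable_hosts ~/.ssh/offline_ssh_key 10.8.8.100 10.8.8.101 10.8.8.102 10.8.8.103`` prints nothing when all four nodes accept the key, and otherwise lists the nodes that still need attention.
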
.. _oooi_installguide_install:

The configuration should now be complete and we are ready to start the installation. The installation is done via ansible playbooks, which are run either inside a **chroot** environment (default) or from a **docker** container. If for some reason you want to run the playbooks from docker instead of chroot, you cannot use the *infra-node* or any *kube-node* as the *install-server*; otherwise the installation may fail because the docker service is restarted.

If you built your ``'sw'`` package correctly, the file ``'ansible_chroot.tgz'`` should be inside the ``'docker'`` directory. If not, you must create it. To learn how to do that, and to get more information about the scripts dealing with docker and chroot, go to `Appendix 1. Ansible execution/bootstrap`_.

We will use the default chroot option, so we don't need any docker service to be running.

The installation itself is now very straightforward::

    $ ./run_playbook.sh -i inventory/hosts.yml -e @application/application_configuration.yml site.yml

This will take a while, so be patient.

The ``'site.yml'`` playbook runs the following playbooks in order:

- ``upload_resources.yml``
- ``infrastructure.yml``
- ``application.yml``

.. _oooi_installguide_postinstall:

Part 4. Post-installation and troubleshooting
---------------------------------------------

After all of the playbooks have run successfully, it will still take a long time until all pods are up and running. You can monitor your newly created kubernetes cluster, for example, like this::

    $ ssh -i ~/.ssh/offline_ssh_key root@10.8.8.100 # tailor this command to connect to your infra-node
    $ watch -d -n 5 'kubectl get pods --all-namespaces'

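With many pods, the full listing is hard to scan. The sketch below filters it down to the pods that still need attention. It is only an illustration: the function name ``pods_not_ready`` is hypothetical, and it assumes the default column order of ``kubectl get pods --all-namespaces`` (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/bin/sh
# Read "kubectl get pods --all-namespaces --no-headers" output on stdin and
# print "namespace/name status ready" for every pod that is not finished
# (Completed) and is either not Running or Running with unready containers.
pods_not_ready() {
    awk '
        { split($3, r, "/") }               # READY column, e.g. "0/1"
        $4 == "Completed" { next }          # finished job pods are fine
        $4 != "Running" || r[1] != r[2] { print $1 "/" $2, $4, $3 }
    '
}
```

On the infra-node it would be used as ``kubectl get pods --all-namespaces --no-headers | pods_not_ready``; an empty result suggests that every pod is either running with all containers ready or has completed.
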
Alternatively, you can monitor progress with the ``helm_deployment_status.py`` script located in the offline-installer directory. Transfer it to the infra-node and run::

    $ python helm_deployment_status.py -n <namespace_name> # namespace defaults to onap

To automatically verify functionality with healthchecks after the deployment becomes ready or after the timeout period expires, append the ``-hp`` switch followed by the full path to the healthcheck script, and optionally the ``--health-mode`` switch with a mode supported by that script (``health`` by default; ``--help`` displays the available modes)::

    $ python helm_deployment_status.py -hp <app_data_path>/<app_name>/helm_charts/robot/ete-k8s.sh --health-mode <healthcheck mode>

It is strongly recommended to tailor ``helm_deployment_status.py`` to your needs, since the default values might not be what you expect. The defaults can be displayed with the ``--help`` switch.

The final result of the installation varies with the number of k8s nodes used and the distribution of pods. In some dev environments we quite frequently hit problems where not all pods were deployed properly. In a successful deployment all jobs should be in a successful state.
This can be verified using::

    $ kubectl get jobs -n <namespace>

If a job is stuck in a failed end-state such as ``'BackoffLimitExceeded'``, manual intervention is required to heal it and to allow dependent jobs to pass. More details about a particular job's state can be obtained using::

    $ kubectl describe job -n <namespace> <job_name>

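To find the jobs worth describing, the job listing can be filtered for entries that have not completed. This is a minimal sketch with a hypothetical function name, ``incomplete_jobs``; it assumes the column order NAME, COMPLETIONS, DURATION, AGE, so check the header printed by your ``kubectl`` version first, because older clients use different columns.

```shell
#!/bin/sh
# Read "kubectl get jobs -n <namespace> --no-headers" output on stdin and
# print the names of jobs whose COMPLETIONS column ("done/wanted") shows
# fewer completions than required.
incomplete_jobs() {
    awk '{ split($2, c, "/"); if (c[1] + 0 < c[2] + 0) print $1 }'
}
```

For example, ``kubectl get jobs -n onap --no-headers | incomplete_jobs`` lists the candidates. Note that a job that is still running also shows up here, so the list is a starting point for ``kubectl describe job`` rather than a list of failures.
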
If manual intervention is required, one can remove the failing job and retry the helm install command directly. This will not launch a full deployment; rather, it checks the current state of the system and rebuilds the parts that are not up and running. The exact commands are as follows::

    $ kubectl delete job -n <namespace> <job_name>
    $ helm deploy <env_name> <helm_chart_name> --namespace <namespace_name>

E.g. ``helm deploy dev local/onap --namespace onap``

Once all pods are properly deployed and in the running state, one can verify functionality, e.g. by running the onap healthchecks::

    $ cd <app_data_path>/<app_name>/helm_charts/robot
    $ ./ete-k8s.sh onap health

To make working in the terminal easier, the ``screen`` and ``jq`` packages were added. They can be installed from the resources directory.

Screen is a terminal multiplexer. With screen it is possible to have several terminal instances active. Screen also keeps SSH sessions active even when the terminal is closed.

Jq can be used for editing JSON data such as the output of kubectl. For example, jq was used to troubleshoot `SDNC-739 (UEB - Listener in Crashloopback) <https://jira.onap.org/browse/SDNC-739/>`_::

    $ kubectl -n onap get job onap-sdc-sdc-be-config-backend -o json | jq "del(.spec.selector)" | jq "del(.spec.template.metadata.labels)" | kubectl -n onap replace --force -f -

.. _oooi_installguide_appendix1:

Appendix 1. Ansible execution/bootstrap
---------------------------------------

There are two ways to easily run the installer's ansible playbooks:

- If you already have docker, or can install it, you can build the provided ``'Dockerfile'`` for ansible and run the playbooks in a docker container.
- Another way to deploy ansible is via a chroot environment, which is bundled within this directory.

(Re)build docker image and/or chroot archive
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Inside the ``'docker'`` directory are the ``'Dockerfile'`` and the ``'build_ansible_image.sh'`` script. You can run the ``'build_ansible_image.sh'`` script on a machine with internet connectivity; it will download all the packages required for building the ansible docker image and for exporting it into a flat chroot environment.

The built image is exported into the ``'ansible_chroot.tgz'`` archive in the same (``'docker'``) directory.

This script has two optional arguments:

**Note:** if the optional arguments are not used, the docker image name defaults to ``'ansible'``.

Launching ansible playbook using chroot environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the default and preferred way of running ansible playbooks in an offline environment, as it does not depend on docker being installed on the system. The chroot environment is already provided by the included archive ``'ansible_chroot.tgz'``.

It should be available in the ``'docker'`` directory as the end result of the packaging script or after a manual run of the ``'build_ansible_image.sh'`` script referenced above.

All playbooks can be executed via the ``'./run_playbook.sh'`` wrapper script.

To get more information about how the ``'./run_playbook.sh'`` wrapper script should be used, run:

The main purpose of this wrapper script is to provide the ansible framework to a machine where it was bootstrapped, without the need to install additional packages. The user can run this to display the ``'ansible-playbook'`` command help::

    $ ./run_playbook.sh --help

* There are two scripts which work in tandem for creating and running a chroot
* The first one can convert a docker image into a chroot directory
* The second script automates chrooting (the steps necessary for the chroot to work, and cleanup)
* Both of them have help - just run::

    $ ./create_docker_chroot.sh help
    $ ./run_chroot.sh help

    $ docker/create_docker_chroot.sh convert some_docker_image ./new_name_for_chroot
    $ cat ./new_name_for_chroot/README.md
    $ docker/run_chroot.sh execute ./new_name_for_chroot cat /etc/os-release 2>/dev/null

Launching ansible playbook using docker container (ALTERNATIVE APPROACH)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This option is here just to keep support for the older method, which relies on a running docker service. For the offline deployment, use the chroot option as indicated above.

You will no longer need the ``'ansible_chroot.tgz'`` archive, but the new requirement is a prebuilt docker image of ansible (based on the provided ``'Dockerfile'``). It should be available in your local docker repository (otherwise the default name ``'ansible'`` may fetch an unwanted image from the default registry!).

To trigger this functionality and run ``'ansible-playbook'`` inside a docker container instead of the chroot environment, you must first set the ``ANSIBLE_DOCKER_IMAGE`` variable. The value must be the name of the built ansible docker image.

Usage is basically the same as with the default chroot way; the only difference is the existence of the environment variable::

    $ ANSIBLE_DOCKER_IMAGE=ansible ./run_playbook.sh --help

.. _Build Guide: ./BuildGuide.rst
.. _El Alto requirements: https://onap.readthedocs.io/en/elalto/guides/onap-developer/settingup/index.html#installing-onap
.. _El Alto release: https://docs.onap.org/en/elalto/release/
.. _OOM ONAP: https://wiki.onap.org/display/DW/ONAP+Operations+Manager+Project