Merge "[GENERAL] Add Andreas Geissler as committer."

diff --git a/docs/InstallGuide.rst b/docs/InstallGuide.rst

index e097792..9edcbf6 100644
--- a/docs/InstallGuide.rst
+++ b/docs/InstallGuide.rst
@@ -48,14 +48,14 @@ Kubernetes cluster overview
  =================== ================== ==================== ============== ============ ===============
  KUBERNETES NODE     OS                 NETWORK              CPU            RAM          STORAGE
  =================== ================== ==================== ============== ============ ===============
-**infra-node**      RHEL/CentOS 7.6    ``10.8.8.100/24``    ``8 vCPUs``    ``8 GB``     ``100 GB``
-**kube-node1**      RHEL/CentOS 7.6    ``10.8.8.101/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
-**kube-node2**      RHEL/CentOS 7.6    ``10.8.8.102/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
-**kube-node3**      RHEL/CentOS 7.6    ``10.8.8.103/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
+**infra-node**      RHEL/CentOS 7.9    ``10.8.8.100/24``    ``8 vCPUs``    ``8 GB``     ``100 GB``
+**kube-node1**      RHEL/CentOS 7.9    ``10.8.8.101/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
+**kube-node2**      RHEL/CentOS 7.9    ``10.8.8.102/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
+**kube-node3**      RHEL/CentOS 7.9    ``10.8.8.103/24``    ``16 vCPUs``   ``56+ GB``   ``100 GB``
  SUM                                                         ``56 vCPUs``   ``176+ GB``  ``400 GB``
  =========================================================== ============== ============ ===============
  
-As of now, the offline installer supports only **RHEL 7.x** and **CentOS 7.6** distributions, with at least *@core* and *@base* package groups installed including *Mandatory* and *Default* package sets. So, your VMs should be preinstalled with this operating system - the hypervisor and platform can be of your choosing.
+As of now, the offline installer supports only the **RHEL 7.x** and **CentOS 7.9** distributions, with at least the *@core* and *@base* package groups installed, including their *Mandatory* and *Default* package sets. Your VMs should therefore be preinstalled with one of these operating systems; the hypervisor and platform can be of your choosing.
  
  From now on we assume that you have installed four VMs and that they are connected to a shared network. All VMs must be reachable from the *install-server* (below), which can be the hypervisor, the *infra-node* or a completely different host. In any of these cases the *install-server* must be able to connect over ssh to all of these nodes.
  
@@ -390,21 +390,24 @@ Part 3. Installation
  
  We should have the configuration complete and be ready to start the installation. The installation is done via ansible playbooks, which are run either inside a **chroot** environment (default) or from the **docker** container. If for some reason you want to run playbooks from the docker instead of chroot then you cannot use *infra-node* or any other *kube-node* as the *install-server* - otherwise you risk that installation will fail due to restarting of the docker service.
  
-If you built your ``'sw'`` package well then there should be the file ``'ansible_chroot.tgz'`` inside the ``'docker'`` directory. If not then you must create it - to learn how to do that and to get more info about the scripts dealing with docker and chroot, go to `Appendix 1. Ansible execution/bootstrap`_
+``'sw_package.tar'`` should contain the ``'ansible_chroot.tgz'`` file inside the ``'docker'`` directory. For detailed instructions on how to create it manually, and for more information about the scripts dealing with docker and chroot, go to `Appendix 1. Ansible execution/bootstrap`_.
  
  We will use the default chroot option so we don't need any docker service to be running.
  
-Installation is actually very straightforward now::
+Start the installation by running the following command::
  
      $ ./run_playbook.sh -i inventory/hosts.yml -e @application/application_configuration.yml site.yml
  
-This will take a while so be patient.
+This will take a while so be patient. The whole provisioning process is idempotent so you may safely re-run it if required.
  
-``'site.yml'`` playbook actually runs in the order the following playbooks:
+The ``'site.yml'`` playbook runs the following playbooks in the given order:
  
-- ``upload_resources.yml``
+- ``resources.yml``
  - ``infrastructure.yml``
  - ``rke.yml``
+- ``nfs.yml``
+- ``kube_prometheus.yml``
+- ``cert_manager.yml``
  - ``application.yml``
  
  ----
@@ -412,7 +415,7 @@ This will take a while so be patient.
  Part 4. Post-installation and troubleshooting
  ---------------------------------------------
  
-After all of the playbooks are run successfully, it will still take a lot of time until all pods are up and running. You can monitor your newly created kubernetes cluster for example like this::
+After all of the playbooks have run successfully, the ONAP kubernetes application will still be deploying, and it might take some time until all pods are up and running. You can monitor your newly created kubernetes cluster with these commands::
  
      $ ssh -i ~/.ssh/offline_ssh_key root@10.8.8.100 # tailor this command to connect to your infra-node
      $ watch -d -n 5 'kubectl get pods --all-namespaces'
@@ -427,8 +430,9 @@ To automatically verify functionality with healthchecks after deployment becomes
  
  It is strongly recommended to tailor ``helm_deployment_status.py`` to your needs since default values might not be what you'd expect. The defaults can be displayed with ``--help`` switch.
  
-Final result of installation varies based on number of k8s nodes used and distribution of pods. In some dev envs we quite frequently hit problems with not all pods properly deployed. In successful deployments all jobs should be in successful state.
-This can be verified using ::
+The final result of the installation varies with the number of k8s nodes used and the distribution of pods. In a successful deployment all jobs should be in a successful state. This can be verified with:
+
+::
  
      $ kubectl get jobs -n <namespace>
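On a namespace with many jobs it can help to filter the table down to the jobs that have not finished. The following is a minimal sketch, not part of the offline installer: the function name ``incomplete_jobs`` is hypothetical, and it assumes the default ``kubectl get jobs`` column layout (NAME, COMPLETIONS, DURATION, AGE).

```shell
# Hypothetical helper: list jobs whose COMPLETIONS column (e.g. "0/1") is not full.
# Assumes the default `kubectl get jobs` columns: NAME COMPLETIONS DURATION AGE.
incomplete_jobs() {
    awk 'NR > 1 {
        split($2, c, "/")          # c[1] = succeeded pods, c[2] = desired completions
        if (c[1] != c[2]) print $1
    }'
}

# Usage (namespace placeholder as in the command above):
#   kubectl get jobs -n <namespace> | incomplete_jobs
```

An empty result means every listed job has reached its desired number of completions.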
  
@@ -448,9 +452,9 @@ Once all pods are properly deployed and in running state, one can verify functio
      $ cd <app_data_path>/<app_name>/helm_charts/robot
      $ ./ete-k8s.sh onap health
  
-For better work with terminal screen and jq packages were added . It can be installed from resources directory.
+You can install the ``screen`` and ``jq`` packages to aid troubleshooting. Both can be installed from the resources directory.
  
-Screen is a terminal multiplexer. With screen it is possible to have more terminal instances active. Screen as well keeps active SSH connections even terminal is closed.
+Screen is a terminal multiplexer: it allows running multiple virtual terminal sessions and keeps SSH connections active even when the terminal is closed.
  
  Jq can be used for filtering and editing JSON data, such as the output of kubectl. For example, jq was used to troubleshoot `SDNC-739 (UEB - Listener in Crashloopback) <https://jira.onap.org/browse/SDNC-739/>`_ ::
  
@@ -469,9 +473,9 @@ There are two ways how to easily run the installer's ansible playbooks:
  (Re)build docker image and/or chroot archive
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
-Inside the ``'docker'`` directory is the ``'Dockerfile'`` and ``'build_ansible_image.sh'`` script. You can run ``'build_ansible_image.sh'`` script on some machine with the internet connectivity and it will download all required packages needed for building the ansible docker image and for exporting it into a flat chroot environment.
+Inside the ``'ansible/docker'`` directory you'll find the ``'Dockerfile'`` and the ``'build_ansible_image.sh'`` script. You can run the ``'build_ansible_image.sh'`` script on a machine with internet connectivity and it will download all packages required for building the ansible docker image and for exporting it into a flat chroot environment.
  
-Built image is exported into ``'ansible_chroot.tgz'`` archive in the same (``'docker'``) directory.
+The built image is exported into the ``'ansible_chroot.tgz'`` archive in the same (``'ansible/docker'``) directory.
  
  This script has two optional arguments:
  
@@ -485,7 +489,7 @@ Launching ansible playbook using chroot environment
  
  This is the default and preferred way of running ansible playbooks in an offline environment as there is no dependency on docker to be installed on the system. Chroot environment is already provided by included archive ``'ansible_chroot.tgz'``.
  
-It should be available in the ``'docker'`` directory as the end-result of the packaging script or after manual run of the ``'build_ansible_image.sh'`` script referenced above.
+It should be available in the ``'ansible/docker'`` directory as the end result of the packaging script or after a manual run of the ``'build_ansible_image.sh'`` script referenced above.
  
  All playbooks can be executed via ``'./run_playbook.sh'`` wrapper script.
  
@@ -505,16 +509,16 @@ Developers notes
  * Second script will automate chrooting (necessary steps for chroot to work and cleanup)
  * Both of them have help - just run::
  
-    $ cd docker
+    $ cd ansible/docker
      $ ./create_docker_chroot.sh help
      $ ./run_chroot.sh help
  
  Example usage::
  
      $ sudo su
-    $ docker/create_docker_chroot.sh convert some_docker_image ./new_name_for_chroot
+    $ ansible/docker/create_docker_chroot.sh convert some_docker_image ./new_name_for_chroot
      $ cat ./new_name_for_chroot/README.md
-    $ docker/run_chroot.sh execute ./new_name_for_chroot cat /etc/os-release 2>/dev/null
+    $ ansible/docker/run_chroot.sh execute ./new_name_for_chroot cat /etc/os-release 2>/dev/null
  
  Launching ansible playbook using docker container (ALTERNATIVE APPROACH)
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -531,9 +535,156 @@ Usage is basically the same as with the default chroot way - the only difference
  
  -----
  
+Appendix 2. Running Kubernetes Dashboard
+----------------------------------------
+
+Kubernetes Dashboard is a web-based, general purpose user interface for managing a k8s cluster.
+
+Some of its capabilities are:
+
+* workloads/services management (troubleshooting, scaling, editing, restarting pods)
+* deploying new workloads/applications to the cluster
+* managing the cluster itself
+
+Dashboard also provides information on the state of the cluster resources and on any errors that may have occurred.
+
+Kubernetes Dashboard is itself a kubernetes application. For user convenience, the Offline platform comes with it pre-installed:
+
+::
+
+    $ kubectl -n kubernetes-dashboard get deployment
+    NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
+    dashboard-metrics-scraper   1/1     1            1           76m
+    kubernetes-dashboard        1/1     1            1           76m
+
+Accessing the dashboard
+~~~~~~~~~~~~~~~~~~~~~~~
+
+There are multiple ways to access the application's web UI. Here we'll use local port forwarding from a machine where you have access to a browser, since the dashboard in the Offline platform is exposed via a node port by default.
+
+First get the node port number that the dashboard service is exposed on:
+
+::
+
+    $ kubectl -n kubernetes-dashboard get svc kubernetes-dashboard -o custom-columns=PORTS:.spec.ports[].nodePort
+    PORTS
+    30825
+
+Now establish an ssh session to the infra node from the machine from which you'll be accessing the dashboard:
+
+::
+
+    $ ssh -L 8080:127.0.0.1:30825 root@<infra host ip>
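The node port differs between installations, so the two steps above can be combined. The sketch below is illustrative only: the function name ``dashboard_tunnel_cmd`` is hypothetical, and it merely prints the ssh command for a given local port, node port and infra host address so that it can be inspected before running.

```shell
# Hypothetical helper: print the ssh command that forwards a local port
# to the dashboard node port on the infra node.
# Arguments: <local port> <node port> <infra host ip>
dashboard_tunnel_cmd() {
    printf 'ssh -L %s:127.0.0.1:%s root@%s\n' "$1" "$2" "$3"
}

# Usage: read the node port on the infra node, then build the command
# on the machine with the browser:
#   kubectl -n kubernetes-dashboard get svc kubernetes-dashboard \
#       -o jsonpath='{.spec.ports[0].nodePort}'
#   dashboard_tunnel_cmd 8080 30825 <infra host ip>
```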
+
+Point your browser at https://localhost:8080/ and you should see the login page:
+
+.. image:: images/kubernetes-dashboard-signin.png
+   :alt: Kubernetes Dashboard signin
+
+Here, we'll use the Bearer Token to log in. The Offline platform comes with a dashboard admin user already created, so we just need to extract its token. On the infra node issue the following command:
+
+::
+
+    $ kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"
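If the nested one-liner is hard to follow, the same lookup can be split into steps and the decoding done with ``base64`` instead of the go-template function. This is a sketch under assumptions: the helper name ``decode_token`` is hypothetical, and ``base64 -d`` is the GNU coreutils form of the decode option.

```shell
# Hypothetical helper: decode a base64-encoded token read from stdin.
# Assumes GNU coreutils `base64`, where -d means decode.
decode_token() {
    base64 -d
}

# Usage on the infra node (same secret lookup as above, split into two steps):
#   secret=$(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath='{.secrets[0].name}')
#   kubectl -n kubernetes-dashboard get secret "$secret" -o jsonpath='{.data.token}' | decode_token
```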
+
+It will return the token string on stdout. Copy and paste it into the sign-in form, selecting the "Token" option first. Upon successful login you'll be presented with the cluster resources from the ``default`` namespace. In the drop-down box at the top, select the namespace into which you installed the ONAP application (the namespace name equals the value of the ``app_name`` variable from the offline-installer setup) and you should see the cluster resources for ONAP:
+
+.. image:: images/kubernetes-dashboard-main.png
+   :alt: Kubernetes Dashboard main page
+
+For additional information concerning the Kubernetes Dashboard please refer to the `official documentation <https://github.com/kubernetes/dashboard/tree/master/docs>`_.
+
+-----
+
+Appendix 3. Running kube-prometheus stack
+-----------------------------------------
+
+`Kube-prometheus stack`_ is a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the `Prometheus Operator`_.
+
+The Stack is not deployed by default in Offline ONAP Platform, but all artifacts which it requires are downloaded by relevant scripts in the package build phase (see `Build Guide`_).
+
+Setup (optional)
+~~~~~~~~~~~~~~~~
+
+Kube-prometheus stack itself is a Kubernetes native application provisioned using Helm Charts. As such it can be configured using Helm values. Offline Installer provides a handy way for passing those values to the helm installation process.
+
+Any values for the Stack should be defined as subkeys of the **kube_prometheus_helm_values** variable in **application_configuration.yml**. For instance, to override the default Grafana password, insert the structure below into application_configuration.yml::
+
+    kube_prometheus_helm_values:
+      grafana:
+        adminPassword: <password>
+
+Another example, setting a custom storage size for the Prometheus TSDB::
+
+
+    kube_prometheus_helm_values:
+      prometheus:
+        prometheusSpec:
+          storageSpec:
+            volumeClaimTemplate:
+              spec:
+                resources:
+                  requests:
+                    storage: 6Gi
+
+A comprehensive list of Helm values for the Stack can be found on the `Kube-prometheus stack`_ project site, in the `values.yaml`_ file. Additional values for Grafana can be found on the `Grafana`_ project site in the *charts/grafana/values.yaml* file.
+
+Installation
+~~~~~~~~~~~~
+
+To actually install the Stack, set the following variable in application_configuration.yml::
+
+    kube_prometheus_stack_enabled: true
+
+After the Offline Platform installation process is complete, the Stack will be deployed into its own kubernetes and helm namespace **kube-prometheus**.
+
+ONAP Services Monitoring
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some ONAP services export application metrics which can be scraped by Prometheus by leveraging the ServiceMonitor objects. Offline Platform provides a curated set of Grafana panels for monitoring ONAP's mariadb-galera chart. To enable mariadb-galera monitoring provide the following helm values in ``application_configuration.yml``::
+
+    overrides:
+      mariadb-galera:
+        metrics:
+          serviceMonitor:
+            enabled: true
+            basicAuth:
+              enabled: false
+
+To access the Galera/MariaDB dashboard, navigate to *Dashboards -> Manage -> ONAP -> Galera/MariaDB* in the Grafana UI.
+
+Accessing Grafana dashboard
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The most straightforward way to access the Grafana UI is to use the *port-forward* k8s facility. Issue the following command on the Infra host::
+
+    kubectl -n kube-prometheus port-forward --address 0.0.0.0 svc/kube-prometheus-stack-grafana 8081:80
+
+Then navigate to http://<infra IP>:8081 to access the UI:
+
+.. image:: images/grafana-signin.png
+   :alt: Grafana Login page
+
+Default username is *admin* and the default password is *grafana*.
+
+In the left pane navigate to *Dashboards -> Manage* to see the various pre-defined dashboards that come bundled with kube-prometheus stack. There is also the *Custom* folder, which holds a few additional dashboards defined by the Offline Installer authors:
+
+.. image:: images/grafana-dashboards.png
+   :alt: Grafana dashboards
+
+An alternative way of accessing the UI is to use the NodePort type service, which exposes the Grafana UI directly on a public port of the Infra host. To do so, get the port number first::
+
+    kubectl -n kube-prometheus get service/kube-prometheus-stack-grafana -o custom-columns=PORTS:.spec.ports[].nodePort
+
+Then navigate to http://<infra IP>:<nodePort> to access the UI.
+
  .. _Build Guide: ./BuildGuide.rst
  .. _Software requirements: https://docs.onap.org/projects/onap-oom/en/latest/oom_cloud_setup_guide.html#software-requirements
  .. _Hardware requirements: https://docs.onap.org/projects/onap-oom/en/latest/oom_cloud_setup_guide.html#minimum-hardware-configuration
  .. _OOM ONAP: https://docs.onap.org/projects/onap-oom/en/latest/index.html
  .. _Offline installer: https://gerrit.onap.org/r/q/oom/offline-installer
  .. _RKE: https://rancher.com/products/rke/
+.. _Kube-prometheus stack: https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
+.. _Prometheus Operator: https://github.com/prometheus-operator/prometheus-operator
+.. _values.yaml: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml
+.. _Grafana: https://github.com/grafana/helm-charts