Updated Kohn release notes for 5G SON use case

diff --git a/docs/integration-s3p.rst b/docs/integration-s3p.rst

index 2c7e134..3675c56 100644
--- a/docs/integration-s3p.rst
+++ b/docs/integration-s3p.rst
@@ -1,81 +1,209 @@
+.. This work is licensed under a
+   Creative Commons Attribution 4.0 International License.
  .. _integration-s3p:
  
-ONAP Maturity Testing Notes
----------------------------
+:orphan:
  
-For the Casablanca release, ONAP continues to improve in multiple
-areas of Scalability, Security, Stability and Performance (S3P)
-metrics.
+Stability
+=========
  
+.. important::
+    The Release stability has been evaluated by:
  
+    - The daily Jakarta CI/CD chain
+    - Stability tests
  
-Stability
-=========
-Integration Stability Testing verifies that the ONAP platform remains fully functional after running for an extended amounts of time.  This is done by repeated running tests against an ONAP instance for a period of 72 hours.
+.. note::
+    The scope of these tests remains limited and does not provide a full set of
+    KPIs to determine the limits and the dimensioning of the ONAP solution.
  
-Methodology
-~~~~~~~~~~~
+CI results
+----------
  
-The Stability Test has two main components:
+As usual, a daily CI chain dedicated to the release is created after RC0.
  
-- Running "ete stability72hr" Robot suite periodically.  This test suite verifies that ONAP can instantiate vDNS, vFWCL, and VVG.
-- Set up vFW Closed Loop to remain running, then check periodically that the closed loop functionality is still working.
+The daily results can be found in `LF Orange lab daily results web site
+<https://logs.onap.org/onap-integration/daily/onap_daily_pod4_master/>`_ and
+`LF DT lab daily results web site <https://logs.onap.org/onap-integration/daily/onap-daily-dt-oom-master/>`_.
  
-Detailed instructions on how these tests are run can be found at https://wiki.onap.org/display/DW/Casablanca+Stability+Testing+Instructions .
+.. image:: files/s3p/jakarta-dashboard.png
+   :align: center
  
-Results: 100% PASS
-~~~~~~~~~~~~~~~~~~
-=================== ======== ========= =========
-Test Case           Attempts Successes Pass Rate
-=================== ======== ========= =========
-Stability 72 hours   65       65        100%
-vFW Closed Loop      71       71        100%
-**Total**            136      136       **100%**
-=================== ======== ========= =========
  
-Detailed results can be found at https://wiki.onap.org/display/DW/Casablanca+Release+Stability+Testing+Status .
+Infrastructure Healthcheck Tests
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These tests cover the Kubernetes and Helm checks on the ONAP cluster.
+
+The global expected criterion is **100%**.
+
+The onap-k8s and onap-k8s-teardown tests, which provide a snapshot of the onap
+namespace in Kubernetes, as well as the onap-helm tests are expected to PASS.
+
+.. image:: files/s3p/istanbul_daily_infrastructure_healthcheck.png
+   :align: center
  
-.. note::
- - The Wind River lab OpenStack instance sporadically returns authentication failures or dropped network connections under load.  The 
-   Stability 72 hours test runs that failed due to these known infrastructure issues were discarded.
- - The Packet Generator VNF used in the vFW Closed Loop test becomes unstable after long run-times.  The vFWCL test runs that failed 
-   due to Packet Generator failures (which are not ONAP platform failures) were discarded.
+Healthcheck Tests
+~~~~~~~~~~~~~~~~~
  
+These tests are the traditional robot healthcheck tests and additional tests
+dealing with a single component.
  
-Resilience
-==========
+The expectation is **100% OK**.
  
-Integration Resilience Testing verifies that ONAP can automatically recover from failures of any of its components.  This is done by deleting the ONAP pods that are involved in each particular Use Case flow and then checking that the Use Case flow can again be executed successfully after ONAP recovers.
+.. image:: files/s3p/istanbul_daily_healthcheck.png
+  :align: center
  
-Methodology
+Smoke Tests
  ~~~~~~~~~~~
-For each Use Case, a list of the ONAP components involved is identified.  The pods of each of those components are systematically deleted one-by-one; after each pod deletion, we wait for the pods to recover, then execute the Use Case again to verify successful ONAP platform recovery.
  
+These tests are end-to-end, automated use case tests.
+See the :ref:`Integration Test page <integration-tests>` for details.
  
-Results: 96.9% PASS
-~~~~~~~~~~~~~~~~~~~
-=============================== ======== ========= =========
-Use Case                        Attempts Successes Pass Rate
-=============================== ======== ========= =========
-VNF Onboarding and Distribution 45       44        97.8%
-VNF Instantiation               54       52        96.3%
-vFW Closed Loop                 61       59        96.7%
-**Total**                       160      155       **96.9%**
-=============================== ======== ========= =========
+The expectation is **100% OK**.
  
-Detailed results can be found at https://wiki.onap.org/display/DW/Casablanca+Release+Stability+Testing+Status .
+.. figure:: files/s3p/istanbul_daily_smoke.png
+  :align: center
  
+Security Tests
+~~~~~~~~~~~~~~
  
-Deployability
-=============
+These tests deal with security.
+See the :ref:`Integration Test page <integration-tests>` for details.
  
-Smaller ONAP container images footprint reduces resource consumption,
-time to deploy, time to heal, as well as scale out resources.
+Waivers have been granted on different projects for the different tests.
+The list of waivers can be found in
+https://git.onap.org/integration/seccom/tree/waivers?h=jakarta.
+
+The nodeport_check_certs test is expected to fail. Even though tremendous
+progress has been made in this area, some certificates (unmaintained, upstream
+or integration robot pods) are still incorrect because of bad certificate
+issuers (Root CA certificate not valid) or an extra long validity period. Most
+of the certificates have been installed using cert-manager and will be easily
+renewable.
+
+The expectation is **80% OK**. The criterion is met.
+
+.. figure:: files/s3p/istanbul_daily_security.png
+  :align: center
+
+Stability tests
+---------------
+
+Stability tests have been performed on the Istanbul release:
+
+- SDC stability test
+- Parallel instantiation test
+
+The results can be found in the weekly backend logs
+https://logs.onap.org/onap-integration/weekly/onap_weekly_pod4_istanbul.
+
+SDC stability test
+~~~~~~~~~~~~~~~~~~
  
-Minimizing the footprint of ONAP container images reduces resource
-consumption, time to deploy, time and time to heal. It also reduces
-the resources needed to scale out and time to scale in. For those
-reasons footprint minimization postively impacts the scalability of
-the ONAP platform.  Smaller ONAP container images footprint reduces
-resource consumption, time to deploy, time to heal, as well as scale
-out resources.
+In this test, we consider the basic_onboard automated test and run 5
+onboarding procedures in parallel for 24h.
+
+The basic_onboard test consists of the following steps:
+
+- [SDC] VendorOnboardStep: Onboard vendor in SDC.
+- [SDC] YamlTemplateVspOnboardStep: Onboard vsp described in YAML file in SDC.
+- [SDC] YamlTemplateVfOnboardStep: Onboard vf described in YAML file in SDC.
+- [SDC] YamlTemplateServiceOnboardStep: Onboard service described in YAML file
+  in SDC.
+
+The test was started on the Istanbul weekly lab on the 14th of November.
+
+As already observed in the daily, weekly and gating chains, we hit race
+conditions on some tests (https://jira.onap.org/browse/INT-1918).
+
+The success rate is expected to be above 95% for the first 100 model uploads
+and above 80% until more than 500 models have been onboarded.
+
+We may also notice that the test duration increases continuously over time.
+At the beginning the test takes about 200s; 24h later the same test takes
+around 1000s.
+Finally, after 36h, SDC systematically answers with an HTTP 500 status code,
+which explains the linear decrease of the success rate.
+
+The following graphs provide a good view of the SDC stability test.
+
+.. image:: files/s3p/istanbul_sdc_stability.png
+  :align: center
+
+.. csv-table:: S3P Onboarding stability results
+    :file: ./files/csv/s3p-sdc.csv
+    :widths: 60,20,20,20
+    :delim: ;
+    :header-rows: 1
+
+.. important::
+   The onboarding duration increases linearly with the number of on-boarded
+   models. This has already been reported and may be due to the fact that
+   models cannot be deleted: the test client has to retrieve the list of
+   models, which grows continuously. No limit tests have been performed.
+   However, 1085 on-boarded models is already a very high figure given the
+   possible ONAP usage.
+   Moreover, the mean duration is much lower in Istanbul, which explains why
+   it was possible to run 35% more tests within the same time frame.
+
+Parallel instantiations stability test
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The test is based on a single test (basic_vm), which consists of the following steps:
+
+- [SDC] VendorOnboardStep: Onboard vendor in SDC.
+- [SDC] YamlTemplateVspOnboardStep: Onboard vsp described in YAML file in SDC.
+- [SDC] YamlTemplateVfOnboardStep: Onboard vf described in YAML file in SDC.
+- [SDC] YamlTemplateServiceOnboardStep: Onboard service described in YAML file
+  in SDC.
+- [AAI] RegisterCloudRegionStep: Register cloud region.
+- [AAI] ComplexCreateStep: Create complex.
+- [AAI] LinkCloudRegionToComplexStep: Connect cloud region with complex.
+- [AAI] CustomerCreateStep: Create customer.
+- [AAI] CustomerServiceSubscriptionCreateStep: Create customer's service
+  subscription.
+- [AAI] ConnectServiceSubToCloudRegionStep: Connect service subscription with
+  cloud region.
+- [SO] YamlTemplateServiceAlaCarteInstantiateStep: Instantiate service described
+  in YAML using SO a'la carte method.
+- [SO] YamlTemplateVnfAlaCarteInstantiateStep: Instantiate vnf described in YAML
+  using SO a'la carte method.
+- [SO] YamlTemplateVfModuleAlaCarteInstantiateStep: Instantiate VF module
+  described in YAML using SO a'la carte method.
+
+10 instantiation attempts are run simultaneously on the ONAP solution for 24h.
+
+The results can be described as follows:
+
+.. image:: files/s3p/istanbul_instantiation_stability_10.png
+  :align: center
+
+.. csv-table:: S3P Instantiation stability results
+    :file: ./files/csv/s3p-instantiation.csv
+    :widths: 60,20,20,20
+    :delim: ;
+    :header-rows: 1
+
+The results are good, with a success rate above 95%. After 24h, more than 1300
+VNFs have been created and deleted.
+
+As for SDC, we can observe a linear increase of the test duration. This issue
+has been reported since Guilin. For SDC, as it is not possible to delete the
+models, it is plausible that the duration increases because the database of
+models grows continuously, so the client has to retrieve an ever bigger list
+of models.
+For the instantiations, this is not the case: the references
+(module, VNF, service) are cleaned at the end of each test and all the tests
+use the same model. The duration of an instantiation test should therefore be
+almost constant, which is not the case. Further investigations are needed.
+
+.. important::
+  The test has been executed with the mariadb-galera replicaset set to 1
+  (3 by default). With this configuration the results during 24h are very
+  good. When set to 3, the error rate is higher and after some hours
+  most of the instantiations fail.
+  However, even with a replicaset set to 1, a test on the Master weekly chain
+  showed that the system hits another limit after about 35h
+  (https://jira.onap.org/browse/SO-3791).
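
The stability tests described in the diff above run a fixed number of attempts simultaneously (5 onboardings, 10 instantiations) and report a success rate. The following is only a minimal sketch of that pattern, not the actual ONAP test code: ``run_batch``, ``run_stability`` and ``success_rate`` are hypothetical names, the ``attempt`` callable stands in for a real test such as basic_onboard or basic_vm, and the sketch loops over a fixed number of batches rather than a 24h time window.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Result = Tuple[bool, float]  # (success, duration in seconds)


def run_batch(attempt: Callable[[], bool], parallelism: int) -> List[Result]:
    """Run `parallelism` attempts at the same time and time each of them."""

    def timed() -> Result:
        start = time.monotonic()
        try:
            ok = bool(attempt())
        except Exception:
            # Any exception (for example an HTTP 500 answer) counts as a failure.
            ok = False
        return ok, time.monotonic() - start

    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(timed) for _ in range(parallelism)]
        return [future.result() for future in futures]


def run_stability(attempt: Callable[[], bool], parallelism: int,
                  batches: int) -> List[Result]:
    """Repeat batches of simultaneous attempts (hypothetical simplification:
    a batch count replaces the 24h duration used in the real campaigns)."""
    results: List[Result] = []
    for _ in range(batches):
        results.extend(run_batch(attempt, parallelism))
    return results


def success_rate(results: List[Result]) -> float:
    """Return the percentage of successful attempts (0.0 if there are none)."""
    if not results:
        return 0.0
    return 100.0 * sum(1 for ok, _ in results if ok) / len(results)
```

Recording the duration next to each outcome makes it possible to plot the test duration over time, which is how a drift such as the one reported above would become visible.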