.. This work is licensed under a
   Creative Commons Attribution 4.0 International License.

The release stability has been evaluated by:

- The daily Istanbul CI/CD chain

The scope of these tests remains limited and does not provide a full set of
KPIs to determine the limits and the dimensioning of the ONAP solution.

As usual, a daily CI chain dedicated to the release is created after RC0.
An Istanbul chain was created on the 5th of November 2021.

The daily results can be found on the `LF daily results web site
<https://logs.onap.org/onap-integration/daily/onap_daily_pod4_istanbul/>`_.

.. image:: files/s3p/istanbul-dashboard.png

Infrastructure Healthcheck Tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These tests deal with the Kubernetes/Helm tests on the ONAP cluster.

The global expected success rate is **75%**.

The onap-k8s and onap-k8s-teardown tests, providing a snapshot of the onap
namespace in Kubernetes, as well as the onap-helm tests, are expected to PASS.

The nodeport_check_certs test is expected to fail. Even though tremendous
progress has been made in this area, some certificates (unmaintained, upstream
or integration robot pods) are still not correct due to bad certificate
issuers (invalid Root CA certificate) or an overly long validity period. Most
of the certificates have been installed using cert-manager and will be easily
renewable.

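As an illustration of what such a check looks at, the issuer and validity
window of a certificate can be inspected with openssl. The sketch below
generates a throwaway self-signed certificate to stand in for a NodePort
certificate; all names and paths are illustrative, not the actual test
implementation.

```shell
# Illustrative only: a throwaway self-signed certificate stands in for a
# NodePort certificate; a real check would fetch the certificate from the
# cluster instead.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo-key.pem -out /tmp/demo-cert.pem \
  -days 365 -subj "/CN=demo.onap.org" 2>/dev/null

# Print the issuer and the validity window: the two properties that make
# nodeport_check_certs fail (bad issuer, overly long validity).
openssl x509 -noout -issuer -startdate -enddate -in /tmp/demo-cert.pem
```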
.. image:: files/s3p/istanbul_daily_infrastructure_healthcheck.png

These tests are the traditional Robot healthcheck tests and additional tests
dealing with a single component.

The expectation is **100% OK**.

.. image:: files/s3p/istanbul_daily_healthcheck.png

These tests are automated end-to-end use case tests.
See :ref:`the Integration Test page <integration-tests>` for details.

The expectation is **100% OK**.

.. figure:: files/s3p/istanbul_daily_smoke.png

An error has been reported since Guilin (https://jira.onap.org/browse/SDC-3508):
a possible race condition in SDC prevents the completion of the certification
and leads to onboarding errors. This error may occur in case of parallel
processing.

These tests deal with security.
See :ref:`the Integration Test page <integration-tests>` for details.

Waivers have been granted to different projects for the different tests.
The list of waivers can be found in
https://git.onap.org/integration/seccom/tree/waivers?h=istanbul.

The expectation is **100% OK**. The criterion is met.

.. figure:: files/s3p/istanbul_daily_security.png

The goal of the resiliency testing was to evaluate the capability of the
Istanbul solution to survive a stop or restart of a Kubernetes worker node.

This test has been automated thanks to the Litmus chaos framework
(https://litmuschaos.io/) and integrated in the CI on the

2 additional tests based on Litmus chaos scenarios have been added but will
be tuned

- node cpu hog (temporary increase of CPU on 1 Kubernetes node)
- node memory hog (temporary increase of memory on 1 Kubernetes node)

The main test for Istanbul is node drain, corresponding to the resiliency
scenario previously managed manually.

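As a hedged sketch, a Litmus node drain scenario is declared through a
ChaosEngine manifest similar to the one below. The name, namespace, service
account and target node are illustrative; the exact fields depend on the
Litmus version deployed in the CI.

```shell
# Write an illustrative ChaosEngine manifest for the node drain experiment.
# All names here are placeholders, not the actual CI configuration.
cat > node-drain-engine.yaml <<'EOF'
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: onap-node-drain
  namespace: litmus
spec:
  engineState: active
  chaosServiceAccount: node-drain-sa
  experiments:
    - name: node-drain
      spec:
        components:
          env:
            - name: TARGET_NODE
              value: compute01-onap-istanbul
EOF

# The scenario would then be launched with:
#   kubectl apply -f node-drain-engine.yaml
```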
The system under test is defined in OOM.
The resources are described in the table below:

.. code-block:: shell

   +-------------------------+-------+--------+--------+
   | Name                    | vCPUs | Memory | Disk   |
   +-------------------------+-------+--------+--------+
   | compute12-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute11-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute10-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute09-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute08-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute07-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute06-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute05-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute04-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute03-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute02-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | compute01-onap-istanbul | 16    | 24 GB  | 10 GB  |
   | etcd03-onap-istanbul    | 4     | 6 GB   | 10 GB  |
   | etcd02-onap-istanbul    | 4     | 6 GB   | 10 GB  |
   | etcd01-onap-istanbul    | 4     | 6 GB   | 10 GB  |
   | control03-onap-istanbul | 4     | 6 GB   | 10 GB  |
   | control02-onap-istanbul | 4     | 6 GB   | 10 GB  |
   | control01-onap-istanbul | 4     | 6 GB   | 10 GB  |
   +-------------------------+-------+--------+--------+

The test sequence can be defined as follows:

- Cordon a compute node (prevent any new scheduling)
- Launch the node drain chaos scenario; all the pods on the given compute node

Once all the pods have been evicted:

- Uncordon the compute node
- Replay a basic_vm test

This test has been successfully executed.

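The manual equivalent of this sequence can be sketched with kubectl. The
DRY_RUN guard below only prints the commands, so the sketch can be reviewed
without a live cluster; the node name and the final test command are
illustrative, not the exact CI invocation.

```shell
#!/bin/sh
# Sketch of the node drain resiliency sequence. With DRY_RUN=1 (the default
# here) the commands are printed instead of executed, so no cluster is needed.
NODE="${NODE:-compute01-onap-istanbul}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Prevent any new scheduling on the node
run kubectl cordon "$NODE"
# 2. Evict all the pods running on the node
run kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
# 3. Put the node back into the scheduling pool
run kubectl uncordon "$NODE"
# 4. Replay a basic_vm test to check that the solution survived
run run_tests -t basic_vm   # xtesting entry point; illustrative
```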
.. image:: files/s3p/istanbul_resiliency.png

Please note that the chaos framework selects one compute node (the first one by

The distribution of the pods is random; on our target architecture, about 15
pods are scheduled on each node. The chaos therefore affects only a limited

For the Istanbul tests, the evicted pods (compute01) were:

.. code-block:: shell

   NAME                                            READY   STATUS    RESTARTS   AGE
   onap-aaf-service-dbd8fc76b-vnmqv                1/1     Running   0          2d19h
   onap-aai-graphadmin-5799bfc5bb-psfvs            2/2     Running   0          2d19h
   onap-cassandra-1                                1/1     Running   0          2d19h
   onap-dcae-ves-collector-856fcb67bd-lb8sz        2/2     Running   0          2d19h
   onap-dcaemod-distributor-api-85df84df49-zj9zn   1/1     Running   0          2d19h
   onap-msb-consul-86975585d9-8nfs2                1/1     Running   0          2d19h
   onap-multicloud-pike-88bb965f4-v2qc8            2/2     Running   0          2d19h
   onap-netbox-nginx-5b9b57d885-hjv84              1/1     Running   0          2d19h
   onap-portal-app-66d9f54446-sjhld                2/2     Running   0          2d19h
   onap-sdnc-ueb-listener-5b6bb95c68-d24xr         1/1     Running   0          2d19h
   onap-sdnc-web-8f5c9fbcc-2l8sp                   1/1     Running   0          2d19h
   onap-so-779655cb6b-9tzq4                        2/2     Running   1          2d19h
   onap-so-oof-adapter-54b5b99788-x7rlk            2/2     Running   0          2d19h

In the future, it would be interesting to elaborate a resiliency testing
strategy in order to check the eviction of all the critical components.

Stability tests have been performed on the Istanbul release:

- Parallel instantiation test

The results can be found in the weekly backend logs
https://logs.onap.org/onap-integration/weekly/onap_weekly_pod4_istanbul.

In this test, we consider the basic_onboard automated test and we run 5
simultaneous onboarding procedures in parallel for 24 hours.

The basic_onboard test consists of the following steps:

- [SDC] VendorOnboardStep: Onboard vendor in SDC.
- [SDC] YamlTemplateVspOnboardStep: Onboard vsp described in YAML file in SDC.
- [SDC] YamlTemplateVfOnboardStep: Onboard vf described in YAML file in SDC.
- [SDC] YamlTemplateServiceOnboardStep: Onboard service described in YAML file

The test was initiated on the Istanbul weekly lab on the 14th of November.

As already observed in the daily|weekly|gating chains, we got race conditions
on some tests (https://jira.onap.org/browse/INT-1918).

The success rate is expected to be above 95% on the first 100 model uploads
and above 80% until more than 500 models have been onboarded.

We may also notice that the test duration as a function of time increases
continuously. At the beginning a test takes about 200s; 24 hours later the
same test takes around 1000s.
Finally, after 36 hours, SDC systematically answers with an HTTP 500 response
code, which explains the linear decrease of the success rate.

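A success rate and a mean duration such as those discussed above can be
derived from raw results with a short awk sketch. The CSV file and its
columns (test id, status, duration in seconds) are hypothetical, not the
actual CI output format.

```shell
# Hypothetical results file: test_id,status,duration_s
cat > onboard-results.csv <<'EOF'
1,PASS,210
2,PASS,260
3,FAIL,900
4,PASS,320
EOF

# Success rate and mean duration over all runs
# → success rate: 75%, mean duration: 422s
awk -F, '
  { total++; sum += $3 }
  $2 == "PASS" { pass++ }
  END { printf "success rate: %d%%, mean duration: %ds\n",
        100 * pass / total, sum / total }
' onboard-results.csv
```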
The following graphs provide a good view of the SDC stability test.

.. image:: files/s3p/istanbul_sdc_stability.png

.. csv-table:: S3P Onboarding stability results
   :file: ./files/csv/s3p-sdc.csv

The onboarding duration increases linearly with the number of onboarded
models, which has already been reported and may be due to the fact that
models cannot be deleted. In fact the test client has to retrieve the list of
models, which is continuously increasing. No limit tests have been

However 1085 onboarded models is already a very high figure regarding the

Moreover the mean duration time is much lower in Istanbul.
This explains why it was possible to run 35% more tests within the same

Parallel instantiations stability test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The test is based on a single test (basic_vm) that can be described as follows:

- [SDC] VendorOnboardStep: Onboard vendor in SDC.
- [SDC] YamlTemplateVspOnboardStep: Onboard vsp described in YAML file in SDC.
- [SDC] YamlTemplateVfOnboardStep: Onboard vf described in YAML file in SDC.
- [SDC] YamlTemplateServiceOnboardStep: Onboard service described in YAML file
- [AAI] RegisterCloudRegionStep: Register cloud region.
- [AAI] ComplexCreateStep: Create complex.
- [AAI] LinkCloudRegionToComplexStep: Connect cloud region with complex.
- [AAI] CustomerCreateStep: Create customer.
- [AAI] CustomerServiceSubscriptionCreateStep: Create customer's service
- [AAI] ConnectServiceSubToCloudRegionStep: Connect service subscription with
- [SO] YamlTemplateServiceAlaCarteInstantiateStep: Instantiate service described
  in YAML using the SO a la carte method.
- [SO] YamlTemplateVnfAlaCarteInstantiateStep: Instantiate vnf described in YAML
  using the SO a la carte method.
- [SO] YamlTemplateVfModuleAlaCarteInstantiateStep: Instantiate VF module
  described in YAML using the SO a la carte method.

10 instantiation attempts are done simultaneously on the ONAP solution for
24 hours.

The results can be described as follows:

.. image:: files/s3p/istanbul_instantiation_stability_10.png

.. csv-table:: S3P Instantiation stability results
   :file: ./files/csv/s3p-instantiation.csv

The results are good, with a success rate above 95%. After 24 hours more than
1300 VNFs have been created and deleted.

As for SDC, we can observe a linear increase of the test duration. This issue
has been reported since Guilin. For SDC, as it is not possible to delete the
models, one may assume that the duration increases because the database of
models keeps growing, so the client has to retrieve an ever longer list of
models.
But for the instantiations this is not the case, as the references
(module, VNF, service) are cleaned at the end of each test and all the tests
use the same model. The duration of an instantiation test should therefore be
almost constant, which is not the case. Further investigations are needed.

The test has been executed with the mariadb-galera replicaset set to 1
(3 by default). With this configuration the results over 24 hours are very
good. When it is set to 3, the error rate is higher and after some hours
most of the instantiations fail.
However, even with the replicaset set to 1, a test on the Master weekly chain
showed that the system hits another limit after about 35 hours
(https://jira.onap.org/browse/SO-3791).