[RND-533] Technical design for get all performance benchmarking (#251)
* [RND-533] Initial Instructions

* Updating ticket link

* [RND-533] Adding instructions to run the mongo profiler

* Adding missing column in results example

* Adding profiler section collapsed

* Change ticket name

* [RND-533] Adding script to run performance multiple tests, updating instructions

* Add spacing

* Changing page title to handle read and write separately
andonyns authored Jun 8, 2023
1 parent a1495f8 commit 87a95de
Showing 3 changed files with 167 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/design/README.md
@@ -3,3 +3,4 @@
## Contents

* [Support for Offline Cascading Updates](offline-cascading-updates/)
* [Read Performance Benchmarking](read-performance-benchmarking/)
124 changes: 124 additions & 0 deletions docs/design/read-performance-benchmarking/README.md
@@ -0,0 +1,124 @@
# Read Performance Benchmarking

## Running Read Performance Tests for Meadowlark

To run the read performance tests against the GET endpoints, use the
[Suite-3-Performance-Testing](https://github.com/Ed-Fi-Exchange-OSS/Suite-3-Performance-Testing) tooling.

### Setup

- Configure Meadowlark and verify that it's running.
- Load sample data with the Invoke-LoadGrandBend or Invoke-LoadPartialGrandBend scripts.
- Verify that the data has been loaded correctly through the API or from the database.
- Clone the [Suite-3-Performance-Testing](https://github.com/Ed-Fi-Exchange-OSS/Suite-3-Performance-Testing) repository.
- Install [Python](https://www.python.org/downloads/) and [Poetry](https://python-poetry.org/docs/#installation).
- Go to the /src/edfi_paging_test folder and run `poetry install`.
- Create a .env file based on /src/edfi_paging_test/edfi_paging_test/.env.example with your endpoint, key, and secret (a command sketch follows this list).
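
A minimal PowerShell sketch of these steps, assuming the repository is cloned under C:\Repos as in the example further below; the exact folder names (hyphenated vs. underscored) should be confirmed against the cloned repository:

```pwsh
# Clone the performance testing repository (target path is an assumption).
git clone https://github.com/Ed-Fi-Exchange-OSS/Suite-3-Performance-Testing.git C:\Repos\Suite-3-Performance-Testing

# Install the Python dependencies for the paging tests.
Set-Location C:\Repos\Suite-3-Performance-Testing\src\edfi-paging-test
poetry install

# Start a .env file from the provided example, then fill in the Meadowlark URL, key, and secret.
Copy-Item .\edfi_paging_test\.env.example .\edfi_paging_test\.env
```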

### Executing a single run of performance tests

- Add the desired endpoints to the .env file, or comment out that line to query all endpoints (the get-all functionality is currently blocked by PERF-294).
- Run `poetry run python edfi_paging_test` (see the sketch after this list).
- Verify the CSV results, located in the specified output path or in the /out folder.
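
A sketch of a single run, assuming it is executed from the paging tests folder set up above:

```pwsh
# Run the paging tests once; endpoints and credentials come from the .env file.
poetry run python edfi_paging_test

# List the generated CSV reports (the default output folder is /out unless a
# different path was configured in the .env file).
Get-ChildItem .\out -Filter *.csv
```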

### Comparing performance of multiple runs

To compare the mean time and standard deviation across runs, run the
[GetAll-Performance.ps1](../../../eng/performance/GetAll-Performance.ps1) script. It prints the details of each execution
and also generates a CSV report per run that can be analyzed afterwards.

The script receives two parameters:

- PagingTestsPath: Path to the edfi-paging-test folder inside the Suite-3-Performance-Testing repository.
- NumTrials: Number of times to run the tests. Defaults to *5*.

Example:

```pwsh
.\GetAll-Performance.ps1 -PagingTestsPath c:\Repos\Suite-3-Performance-Testing\src\edfi-paging-test -NumTrials 10
```

Running this script will print the mean and standard deviation of the executions.

`Mean: 12847.73292`

`Standard Deviation: 2217.78953625164`

To check the CSV results, browse to the performance folders where the results will be saved by execution time.
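
For example, the result folders and one of the statistics files can be inspected directly from PowerShell; the folder and file names below are placeholders, so adjust them to what the tests actually produce:

```pwsh
# List the result folders, newest run first.
Get-ChildItem .\performance -Directory | Sort-Object LastWriteTime -Descending

# Load a statistics CSV from one run and show the slowest resources first.
Import-Csv .\performance\<run-folder>\statistics.csv |
    Sort-Object { [double]$_.MeanTime } -Descending |
    Format-Table Resource, NumberOfRecords, MeanTime, StDeviation
```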

### Analyzing CSV Performance Results

Each execution generates two CSV files: one with the statistics and another with the details.

#### Details

Results example:

| Resource | URL | PageNumber | PageSize | NumberOfRecords | ElapsedTime | StatusCode |
|----------|-----------------------------------------------------|------------|----------|-----------------|-------------|------------|
| accounts | http://{meadowlark_url}/accounts?offset=0&limit=100 | 1 | 100 | 100 | 0.020013055 | 200 |
| accounts | http://{meadowlark_url}/accounts?offset=0&limit=100 | 2 | 100 | 100 | 0.040413055 | 200 |
| accounts | http://{meadowlark_url}/accounts?offset=0&limit=100 | 1 | 100 | 100 | 0.089013055 | 200 |

#### Statistics

Results example:

| Resource | PageSize | NumberOfPages | NumberOfRecords | TotalTime | MeanTime | StDeviation | NumberOfErrors |
|----------|----------|---------------|-----------------|-----------|-------------|-------------|----------------|
| accounts | 100 | 2 | 5 | 0.1031 | 0.020013055 | 0.0215 | 0 |
| accounts | 100 | 2 | 5 | 0.1031 | 0.020013055 | 0.0215 | 0 |
| accounts | 100 | 2 | 5 | 0.1031 | 0.020013055 | 0.0215 | 0 |
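
When several runs have been executed (for example through GetAll-Performance.ps1), the statistics files can be combined to compare the mean time per resource across runs. The folder layout and file name pattern below are assumptions:

```pwsh
# Combine the statistics CSVs from several runs and compare the mean time per resource.
Get-ChildItem .\performance -Recurse -Filter *statistics*.csv |
    ForEach-Object { Import-Csv $_.FullName } |
    Group-Object Resource |
    ForEach-Object {
        $avg = ($_.Group | Measure-Object -Property MeanTime -Average).Average
        [pscustomobject]@{
            Resource    = $_.Name
            Runs        = $_.Count
            AvgMeanTime = $avg
        }
    } |
    Format-Table
```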

### Comparing with ODS/API

Run the same tests against an ODS/API instance with the same data set, filtering out the tpdm, sample, and homograph
resources since those are not handled by Meadowlark. To do so, change the URL and variables in the .env file inside edfi_paging_test.

### Profiler

<details>
<summary>Running MongoDB profiler</summary>

MongoDB comes with a built-in profiler, which is disabled by default.

To enable it, connect to the Docker container with `mongosh` and execute `db.setProfilingLevel(2)` to track all traffic.
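
If MongoDB runs in Docker, the same command can be issued from the host in one step; the container name below is a placeholder and should match your Meadowlark docker-compose setup:

```pwsh
# Enable full profiling (level 2) on the meadowlark database inside the container.
docker exec <mongo-container> mongosh --quiet --eval "db.getSiblingDB('meadowlark').setProfilingLevel(2)"
```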

This must be done before running the paging tests so that the subsequent operations are tracked. To see the latest
tracked data, run `show profile`.

This will display something similar to:

```json
query meadowlark.documents 1ms Wed Jun 07 2023 15:20:33
command:{
find: 'documents',
filter: {
aliasIds: {
'$in': [
'KcsqHWHlSrAHP0LyDuChFK-C3NuO_tH5NF2YRA',
'auET2M3A7eg92ChrMaFL6vkmjHtx83fCs3kt_w',
'h0E08by8zxQHVXAblfHfXX4gU4l2-0AKcLWbGA'
]
}
},
projection: { _id: 1 },
txnNumber: Long("754"),
autocommit: false,
'$clusterTime': {
clusterTime: Timestamp({ t: 1686172829, i: 1 }),
signature: {
hash: "",
keyId: Long("7241292544405929986")
}
},
'$db': 'meadowlark'
} keysExamined:5 docsExamined:2 cursorExhausted numYield:0 nreturned:2 locks:{} storage:{} responseLength:346 protocol:op_msg
```

From the results, you can analyze the timestamp and the number of documents and keys examined to produce the response. [Read
more](https://www.mongodb.com/docs/manual/reference/database-profiler/).
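
The tracked operations are stored in the `system.profile` collection, so they can also be queried directly, for example to pull the most recent entries with their examined key and document counts (container name again a placeholder):

```pwsh
# Show the five most recent profiled operations with timing and examined keys/docs.
docker exec <mongo-container> mongosh meadowlark --quiet --eval "db.system.profile.find({}, { op: 1, ns: 1, millis: 1, keysExamined: 1, docsExamined: 1, ts: 1 }).sort({ ts: -1 }).limit(5)"
```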

</details>
42 changes: 42 additions & 0 deletions eng/performance/GetAll-Performance.ps1
@@ -0,0 +1,42 @@
# SPDX-License-Identifier: Apache-2.0
# Licensed to the Ed-Fi Alliance under one or more agreements.
# The Ed-Fi Alliance licenses this file to you under the Apache License, Version 2.0.
# See the LICENSE and NOTICES files in the project root for more information.
<#
.DESCRIPTION
Run performance tests multiple times
#>
param(
    [Parameter(Mandatory=$true)]
    [string]
    $PagingTestsPath,

    [int]
    $NumTrials = 5
)

# Elapsed time of each trial, in milliseconds
$times = @()

$originalLocation = Get-Location

Set-Location -Path $PagingTestsPath
for ($i = 0; $i -lt $NumTrials; $i++) {
    # Each trial runs the full paging test suite and records its wall-clock time
    $timing = Measure-Command { poetry run python edfi_paging_test }
    $times += $timing.TotalMilliseconds
}

# Mean of the trial times
$sum = 0.0
$times | ForEach-Object { $sum += $_ }
$mean = $sum / $NumTrials

# Population standard deviation of the trial times
$sumSquareError = 0.0
$times | ForEach-Object { $sumSquareError += [Math]::Pow($_ - $mean, 2) }
$standardDeviation = [Math]::Sqrt($sumSquareError / $NumTrials)

Write-Output @"
Mean: $mean
Standard Deviation: $standardDeviation
"@

Set-Location -Path $originalLocation
