This repository was archived by the owner on Mar 11, 2025. It is now read-only.

Commit 51c6673

[RND-380] MongoDB Connection Pooling Experiments (#282)
* Support for bulk loading into ODS/API 5.3 for test comparisons
* Performance test results
* clarify the bulk load settings
* Updated main readme
1 parent 0d4f7d6 commit 51c6673

12 files changed (+516, -7 lines changed)

README.md

Lines changed: 5 additions & 7 deletions
@@ -25,18 +25,16 @@ information on the background and design decisions for this project.
 * [Configuration](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/CONFIGURATION.md)
 * [Developer getting started notes](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/README.md)
 * [Additional technical details](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/TECHNICAL.md)
+* [Docker for Local Meadowlark Development](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/DOCKER-LOCAL-DEV.md)
 * [How to Submit an Issue](https://techdocs.ed-fi.org/x/Y8uIBg) (Tech Docs)
 * [How to Submit a Feature Request](https://techdocs.ed-fi.org/x/0YADAQ) (Tech
   Docs)

-### 😕 Cloud Deployment
+## Deployment and Operations

-You may be asking yourself, "where are the instructions for cloud deployment?"
-We're working on it. Milestone 0.1.0 had serverless deploy to AWS built-in, but
-there were several aspects that no longer fit well with our refined strategy in
-milestone 0.2.0, so the current release does not have any intrinsic cloud
-deployment support. The upcoming 0.3.0 release will have basic deployment
-capabilities on Azure, the preferred platform for our first pilot project.
+* [Using Docker with Meadowlark](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/DOCKER.md)
+* [Azure Deployment](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/eng/deploy/azure/)
+* [Performance Testing](https://github.com/Ed-Fi-Exchange-OSS/Meadowlark/blob/main/docs/performance-testing/)

 ## Contributing

docs/performance-testing/README.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# Ed-Fi Meadowlark Performance Testing Results

* [RND-380: MongoDB Connection Pooling](mongo-connection-pooling.md). Summary:
  when using the clustered mode, there is no discernible benefit to tuning
  MongoDB connection pooling while loading the "partial Grand Bend" dataset.
  Also includes a comparison with ODS/API v5.3-patch4.

docs/performance-testing/RND-380.csv

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
Scenario, Timing
8 threads pool size 1, 156.9
8 threads pool size 1, 196.12
8 threads pool size 1, 185.14
8 threads pool size 1, 198.87
8 threads pool size 1, 173.17
8 threads pool size 5, 177.76
8 threads pool size 5, 180.32
8 threads pool size 5, 159.98
8 threads pool size 5, 163.16
8 threads pool size 5, 166.52
8 threads pool size 100, 171.39
8 threads pool size 100, 166.76
8 threads pool size 100, 188.29
8 threads pool size 100, 165.87
8 threads pool size 100, 174.04
8 threads pool size 150, 156.92
8 threads pool size 150, 151.31
8 threads pool size 150, 180.37
8 threads pool size 150, 180.28
8 threads pool size 150, 162.29
1 threads pool size 1, 535.6
1 threads pool size 1, 526.85
1 threads pool size 1, 534.49
1 threads pool size 1, 534.49
1 threads pool size 1, 552.34
1 threads pool size 150, 379.5
1 threads pool size 150, 355.11
1 threads pool size 150, 365.01
1 threads pool size 150, 370.14
1 threads pool size 150, 365.91
4 threads pool size 150, 166.55
4 threads pool size 150, 160.14
4 threads pool size 150, 157.98
4 threads pool size 150, 183.37
4 threads pool size 150, 162.24
ODS/API, 92.23
ODS/API, 94.53
ODS/API, 89.43
ODS/API, 85.09
ODS/API, 84.73
Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
# RND-380: MongoDB Connection Pooling

## Goal

Experiment with [MongoDB connection
pooling](https://www.mongodb.com/docs/drivers/node/v4.15/fundamentals/connection/connection-options/
) to evaluate impact on application performance.

## Methodology

1. Start Meadowlark fully in Docker, using MongoDB as the backend and OpenSearch
   as the search provider. (See below for environment settings.)

   ```pwsh
   cd Meadowlark-js
   ./reset-docker-compose.ps1
   ```

2. Bulk upload the "partial Grand Bend" data set, capturing the time taken.

   ```pwsh
   cd ../eng/bulkLoad
   Measure-Command { .\Invoke-LoadPartialGrandBend.ps1 }
   ```

3. Repeat for a total of 5 measurements with the same settings.
4. Tune the connection pooling via the `MONGO_URI` setting in the `.env` file.
5. Repeat the measurement process.

An Ed-Fi ODS/API v5.3-patch4 environment was configured on the same VM in order
to make a comparison between the two platforms. In this repository's
`eng/ods-api` directory, the reader will find a PowerShell script `reset.ps1`
that builds a fresh Docker container environment running the two Ed-Fi database
images and the API image. Since this is for raw testing and head-to-head
comparison, this solution does not use NGINX or PgBouncer. To run against the
ODS/API, alter step 2 above to use `Invoke-LoadPartialGrandbend-ODSAPI.ps1`.

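Step 4 of the methodology changes only the query string of `MONGO_URI`. As a minimal sketch in Python, the URI for each pool size under test could be derived from one baseline value. The helper name `with_max_pool_size` is hypothetical and not part of Meadowlark; `maxPoolSize` is the connection string option that appears in the baseline `.env` file. The sketch assumes the URI already contains the `/?` separator before its options.

```python
from urllib.parse import parse_qsl, urlencode


def with_max_pool_size(mongo_uri: str, pool_size: int) -> str:
    """Return mongo_uri with its maxPoolSize option set to pool_size.

    Hypothetical helper for illustration; all other options keep their order.
    """
    base, _, query = mongo_uri.partition("?")
    # Drop any existing maxPoolSize option, matching the name case-insensitively.
    options = [(k, v) for k, v in parse_qsl(query) if k.lower() != "maxpoolsize"]
    options.append(("maxPoolSize", str(pool_size)))
    return f"{base}?{urlencode(options)}"


# Credentials omitted; hosts and replica set name follow the baseline .env file.
BASELINE = (
    "mongodb://mongo1:27017,mongo2:27018,mongo3:27019/"
    "?replicaSet=rs0&maxPoolSize=100"
)

# Pool sizes used in the experiments.
for size in (1, 5, 100, 150):
    print(with_max_pool_size(BASELINE, size))
```

Each printed value would replace `MONGO_URI` in the `.env` file before the Docker environment is reset for the next round of measurements.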
## Environment

All tests ran on a Windows Server 2019 virtual machine acting as the Docker
host, running the latest version of Docker Desktop, using WSL2. The VM has 12
cores assigned to it using Intel Xeon Gold 6150 @ 2.70 GHz with 24.0 GB of
memory and plenty of disk space. Docker is configured to use up to 8 CPUs, 12 GB
of memory, 2 GB of swap space, and a limit of 64 GB on the virtual disk.

Baseline `.env` configuration file:

```none
OAUTH_SIGNING_KEY=<omitted>
OWN_OAUTH_CLIENT_ID_FOR_CLIENT_AUTH=meadowlark_verify-only_key_1
OWN_OAUTH_CLIENT_SECRET_FOR_CLIENT_AUTH=meadowlark_verify-only_secret_1
OAUTH_SERVER_ENDPOINT_FOR_OWN_TOKEN_REQUEST=http://localhost:3000/local/oauth/token
OAUTH_SERVER_ENDPOINT_FOR_TOKEN_VERIFICATION=http://localhost:3000/local/oauth/verify
OAUTH_HARD_CODED_CREDENTIALS_ENABLED=true

OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=admin
OPENSEARCH_ENDPOINT=http://localhost:9200
OPENSEARCH_REQUEST_TIMEOUT=10000

AUTHORIZATION_STORE_PLUGIN=@edfi/meadowlark-mongodb-backend
DOCUMENT_STORE_PLUGIN=@edfi/meadowlark-mongodb-backend
QUERY_HANDLER_PLUGIN=@edfi/meadowlark-opensearch-backend
LISTENER1_PLUGIN=@edfi/meadowlark-opensearch-backend

MONGODB_USER=mongo
MONGODB_PASS=<omitted>
MONGO_URI=mongodb://${MONGODB_USER}:${MONGODB_PASS}@mongo1:27017,mongo2:27018,mongo3:27019/?replicaSet=rs0&maxPoolSize=100

FASTIFY_RATE_LIMIT=false
FASTIFY_PORT=3000
# Next line commented out, therefore it will auto-cluster to match number of
# available CPUs.
# FASTIFY_NUM_THREADS=4

MEADOWLARK_STAGE=local
LOG_LEVEL=debug
IS_LOCAL=true

BEGIN_ALLOWED_SCHOOL_YEAR=2022
END_ALLOWED_SCHOOL_YEAR=2034
ALLOW_TYPE_COERCION=true
ALLOW__EXT_PROPERTY=true

SAVE_LOG_TO_FILE=true
LOG_FILE_LOCATION=c:/temp/
```

The API bulk client loader runs on the VM host, connecting to the Docker
network. It is configured to use a maximum of 100 connections, 50 tasks
buffered, and 500 max simultaneous requests. Retries are disabled. All of the
XML files load without error at this time.

## Results

Times below are given in seconds. In the default settings, there was one extreme
outlier that significantly impacted the average time, as seen by the high
standard deviation.

| Scenario                 | Avg    | St Dev |
| ------------------------ | ------ | ------ |
| 8 threads, pool size 1   | 182.04 | 17.33  |
| 8 threads, pool size 5   | 169.55 | 9.01   |
| 8 threads, pool size 100 | 173.27 | 9.04   |
| 8 threads, pool size 150 | 166.23 | 13.44  |
| 1 thread, pool size 1    | 536.75 | 9.39   |
| 1 thread, pool size 150  | 367.13 | 8.84   |
| 4 threads, pool size 150 | 166.06 | 10.18  |
| ODS/API                  | 89.20  | 4.32   |

See [RND-380.csv](RND-380.csv) for raw data.

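The summary figures can be recomputed from the raw timings. The following Python sketch is an illustration rather than the method used to produce the table: it groups rows by scenario and reports the average and the sample standard deviation. The function name `summarize` is hypothetical, and only two scenarios from `RND-380.csv` are embedded to keep the example short.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, stdev

# Two scenarios copied from RND-380.csv; the full file has the same layout.
RAW = """Scenario, Timing
8 threads pool size 100, 171.39
8 threads pool size 100, 166.76
8 threads pool size 100, 188.29
8 threads pool size 100, 165.87
8 threads pool size 100, 174.04
ODS/API, 92.23
ODS/API, 94.53
ODS/API, 89.43
ODS/API, 85.09
ODS/API, 84.73
"""


def summarize(csv_text: str) -> dict[str, tuple[float, float]]:
    """Map each scenario to (average, sample standard deviation) in seconds."""
    timings = defaultdict(list)
    # skipinitialspace drops the blank that follows each comma in the file.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        timings[row["Scenario"]].append(float(row["Timing"]))
    return {name: (mean(values), stdev(values)) for name, values in timings.items()}


for scenario, (avg, sd) in summarize(RAW).items():
    print(f"{scenario}: avg={avg:.2f} st dev={sd:.2f}")
```

For these two scenarios the sketch yields 173.27 and 9.04, and 89.20 and 4.32, which match the table. This suggests the "St Dev" column is the sample (n - 1) standard deviation.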
## Analysis

In the default configuration, the Meadowlark API startup process forks itself as
many times as there are CPUs available. Thus, in default settings, there are
eight API processes running in parallel. Although these were initiated by the
same Node.js process, each process is isolated with respect to memory. Thus,
each of the eight processes has a separate pool of connections. Within each
forked process there is still potential for connection pool re-use, thanks to
the use of asynchronous processing. However, it is clear that the connection
pool settings have little impact compared to the threading. Even a pool size of
five proved adequate when running with eight CPUs. Interestingly, the pool size
of 150 with only four CPUs also yields consistent results compared to the tests
with eight CPUs.

The only time we see a discernible difference in results is when we reduce the
number of threads used by the API (`FASTIFY_NUM_THREADS`) to one. For this data
set, the performance is discernibly worse with only one thread, whether using
one or 150 connections in the pool. However, the connection pooling in such a
low CPU scenario does clearly yield an improved experience, reducing the average
time to complete the test by roughly 32% (from 536.75 to 367.13 seconds).

> **Note** Five executions of each test appear to be useful, but where timings
> are very close to one another, the number of data points is insufficient for
> establishing statistical significance.

The difference between Meadowlark and the ODS/API is obviously significant: the
ODS/API average of 89.20 seconds is roughly 46% lower than Meadowlark's best
average of 166.06 seconds.

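The relative differences between scenarios can be checked directly against the averages in the results table. The Python sketch below does that arithmetic; the helper name `percent_reduction` is hypothetical.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Averages in seconds, taken from the results table.
one_thread_pool_1 = 536.75
one_thread_pool_150 = 367.13
best_meadowlark = 166.06  # 4 threads, pool size 150
ods_api = 89.20

pool_gain = percent_reduction(one_thread_pool_1, one_thread_pool_150)
ods_gain = percent_reduction(best_meadowlark, ods_api)

print(f"1 thread, pool size 1 -> 150: {pool_gain:.1f}% less time")
print(f"Best Meadowlark run -> ODS/API: {ods_gain:.1f}% less time")
```

With the table's averages this gives about 31.6% for the single-thread pooling comparison and about 46.3% for the ODS/API comparison.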
## Conclusions

Under the environment conditions described above, this research spike does not
find significant benefit to tuning the size of the MongoDB connection pool,
given there are at least four process threads running.

If a Meadowlark API container has only one or two virtual CPUs available, then
tuning the connection pooling could theoretically be beneficial. However, out of
the box, the MongoDB client has a default value of 100 connections available,
which may be appropriate for many situations.

Those with expertise in MongoDB might find that there are other connection pool
settings, such as timeouts, that could be relevant for a given situation.
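
Because each forked API process keeps its own pool, the configured pool size multiplies across processes. The Python sketch below estimates an upper bound on the connections one deployment could open. The function name `max_open_connections` is hypothetical. The sketch assumes the driver keeps one pool per replica set member, and it ignores the driver's own monitoring connections, so treat the result as a rough ceiling and not a measured value.

```python
def max_open_connections(api_processes: int, max_pool_size: int, servers: int = 1) -> int:
    """Rough ceiling on pooled connections across all forked API processes.

    Assumption: each process holds one pool per server, each capped at
    max_pool_size. Monitoring connections are not counted.
    """
    if min(api_processes, max_pool_size, servers) < 1:
        raise ValueError("all arguments must be at least 1")
    return api_processes * max_pool_size * servers


# Default test setup: 8 forked processes, maxPoolSize=100, 3 replica set members.
print(max_open_connections(api_processes=8, max_pool_size=100, servers=3))

# Single-process scenario with the smallest pool tested.
print(max_open_connections(api_processes=1, max_pool_size=1, servers=3))
```

Under these assumptions, the eight-process default leaves far more pooled connections available than a single process would have, which is consistent with pool size making little difference once several processes are running.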
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# SPDX-License-Identifier: Apache-2.0
# Licensed to the Ed-Fi Alliance under one or more agreements.
# The Ed-Fi Alliance licenses this file to you under the Apache License, Version 2.0.
# See the LICENSE and NOTICES files in the project root for more information.

# Runs part of the bulk upload of the Grand Bend dataset, aka "populated
# template" - restricted to the data needed to run the performance testing kit.
# This enables a faster setup, at the expense of having less data in the system.

# Tuned for use with the ODS/API in shared instance mode. Before running this
# script, make sure that the Docker containers are running and that the
# bootstrap key/secret have been set up. Both of these steps are handled by
# running `./reset.ps1`.

#Requires -Version 7

param(
    [string]
    $Key = "sampleKey",

    [string]
    $Secret = "sampleSecret",

    [string]
    $BaseUrl = "http://localhost"
)

$ErrorActionPreference = "Stop"

Import-Module ./modules/Package-Management.psm1 -Force
Import-Module ./modules/Get-XSD.psm1 -Force
Import-Module ./modules/BulkLoad.psm1 -Force
$sampleDataVersion = "3.3.1-b"

$paths = Initialize-ToolsAndDirectories
$paths.SampleDataDirectory = Import-SampleData -Template "GrandBend" -Version $sampleDataVersion

$parameters = @{
    BaseUrl = $BaseUrl
    Key = $Key
    Secret = $Secret
    Paths = $paths
}

Write-Descriptors @parameters
Write-PartialGrandBend @parameters

eng/ods-api/.env.example

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
POSTGRES_USER=postgres
POSTGRES_PASSWORD=fghjkyuiok3

eng/ods-api/Dockerfile

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
# SPDX-License-Identifier: Apache-2.0
# Licensed to the Ed-Fi Alliance under one or more agreements.
# The Ed-Fi Alliance licenses this file to you under the Apache License, Version 2.0.
# See the LICENSE and NOTICES files in the project root for more information.

FROM edfialliance/ods-api-web-api:v2.1.5@sha256:2e6c04b1821f3584a58a993d65b62105b62a0323a4c99acbf1ee70f88f433c10
COPY appsettings.template.json /app/appsettings.template.json

ENTRYPOINT ["/app/run.sh"]

eng/ods-api/README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# ods-api Directory

This directory supports starting the ODS/API v5.3-patch4 in sandbox mode, with
change queries, profiles, and composites disabled. It is useful for head-to-head
test comparisons with Meadowlark.

> **Warning** Do not publish to Docker Hub.

eng/ods-api/appsettings.template.json

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
{
  "ApplicationInsights": {
    "InstrumentationKey": "",
    "LogLevel": {
      "Default": "Warning"
    }
  },
  "ConnectionStrings": {
    "EdFi_Ods": "host=${ODS_POSTGRES_HOST};port=${POSTGRES_PORT};username=${POSTGRES_USER};password=${POSTGRES_PASSWORD};database=EdFi_{0};pooling=false;application name=EdFi.Ods.WebApi",
    "EdFi_Security": "host=${ADMIN_POSTGRES_HOST};port=${POSTGRES_PORT};username=${POSTGRES_USER};password=${POSTGRES_PASSWORD};database=EdFi_Security;pooling=false;application name=EdFi.Ods.WebApi",
    "EdFi_Admin": "host=${ADMIN_POSTGRES_HOST};port=${POSTGRES_PORT};username=${POSTGRES_USER};password=${POSTGRES_PASSWORD};database=EdFi_Admin;pooling=false;application name=EdFi.Ods.WebApi",
    "EdFi_Master": "host=${ADMIN_POSTGRES_HOST};port=${POSTGRES_PORT};username=${POSTGRES_USER};password=${POSTGRES_PASSWORD};database=postgres;pooling=false;application name=EdFi.Ods.WebApi"
  },
  "BearerTokenTimeoutMinutes": "30",
  "DefaultPageSizeLimit": 500,
  "ApiSettings": {
    "Mode": "$API_MODE",
    "MinimalTemplateSuffix": "Ods_Minimal_Template",
    "UsePlugins": false,
    "PopulatedTemplateSuffix": "Ods_Populated_Template",
    "PlainTextSecrets": true,
    "MinimalTemplateScript": "PostgreSQLMinimalTemplate",
    "Engine": "PostgreSQL",
    "OdsTokens": [],
    "PopulatedTemplateScript": "PostgreSQLPopulatedTemplate",
    "UseReverseProxyHeaders": true,
    "Features": [
      {
        "Name": "OpenApiMetadata",
        "IsEnabled": true
      },
      {
        "Name": "AggregateDependencies",
        "IsEnabled": true
      },
      {
        "Name": "TokenInfo",
        "IsEnabled": true
      },
      {
        "Name": "Extensions",
        "IsEnabled": true
      },
      {
        "Name": "Composites",
        "IsEnabled": false
      },
      {
        "Name": "Profiles",
        "IsEnabled": false
      },
      {
        "Name": "ChangeQueries",
        "IsEnabled": false
      },
      {
        "Name": "IdentityManagement",
        "IsEnabled": false
      },
      {
        "Name": "OwnershipBasedAuthorization",
        "IsEnabled": false
      },
      {
        "Name": "UniqueIdValidation",
        "IsEnabled": false
      },
      {
        "Name": "XsdMetadata",
        "IsEnabled": true
      }
    ],
    "ExcludedExtensions": []
  },
  "Plugin": {
    "Folder": "./Plugin",
    "Scripts": [
      "tpdm"
    ]
  },
  "Caching": {
    "Descriptors": {
      "AbsoluteExpirationSeconds": 1800
    },
    "PersonUniqueIdToUsi": {
      "AbsoluteExpirationSeconds": 0,
      "SlidingExpirationSeconds": 14400
    }
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning"
    }
  }
}
