Easy Amazon Kinesis load testing with Locust - a modern load testing framework.
- Overview
- Test locally
- Cloud setup
- Large scale load testing
- Project structure
- Remarks
- Notice
- Security
- License
For high-traffic Kinesis-based applications, it is often a challenge to simulate the necessary load in a load test. Locust is a powerful Python-based framework for executing load tests. This project leverages a Boto3 wrapper for Locust.
This project emits temperature sensor readings via Locust to Kinesis. The EC2 Locust instance is set up via CDK to load test Kinesis-based applications. Users can connect via Systems Manager Session Manager (SSM) for configuration changes. For your convenience, it also deploys a Kinesis Data Stream called DemoStream to get you started. If you want to adapt or disable it, you can find the resource here.
In our testing with the largest recommended instance - c7g.16xlarge
- the setup was capable of emitting over 1 million events per second to Kinesis, with a batch size of 500. See also here for more information on how to achieve peak performance.
Alternatively, you can use the Amazon Kinesis Data Generator, which provides a hosted UI to set up Kinesis load tests. As this approach is browser based, you are limited by the bandwidth and round-trip latency of your current connection, and you have to keep the browser tab open to continue sending events.
To test Locust locally first, before deploying it to the cloud, you have to install the necessary Python dependencies. You can find more information in the pip Getting Started Guide.
Navigate to the load-test
directory and run:
pip install -r requirements.txt
In order to send events to Kinesis from your local machine, you need AWS credentials configured; see also the documentation on Configuration and credential file settings.
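As a quick, optional sanity check before starting the test, you can verify that Boto3 resolves your credentials; this sketch is not part of the project:

```python
# Optional: verify that boto3 picks up your AWS credentials.
import boto3

sts = boto3.client("sts")
identity = sts.get_caller_identity()
print(f"Authenticated as {identity['Arn']} in account {identity['Account']}")
```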
To now run the test locally, stay in the load-test
directory and run:
locust -f locust-load-test.py
You can now access the Locust dashboard via http://0.0.0.0:8089/. Enter the number of Locust users, the spawn rate (users added per second) and the target Amazon Kinesis data stream name as host.
To log the generated events, run the following command; it filters for locust and root logs only (e.g. no botocore logs):
locust -f locust-load-test.py --loglevel DEBUG 2>&1 | grep -E "(locust|root)"
This project relies on AWS CDK and TypeScript; for installation instructions, look here. For further information, you can also check out this Workshop and this Getting Started guide.
cdk bootstrap
For more details, see AWS Cloud Development Kit - Bootstrapping.
npm install
Run the following to deploy the stacks of the CDK template:
cdk deploy
To get started you first have to do the setup, as described here.
You can access the dashboard by using the CDK output KinesisLoadTestingWithLocustStack.locustdashboardurl to open the dashboard, e.g. http://1.2.3.4:8089.
The Locust dashboard is password protected. By default, the username is set to locust-user and the password to locust-dashboard-pwd; you can change them here.
With the default configuration, you can achieve up to 15,000 emitted events per second (the number of Locust users times the batch size times the per-user request rate; see the back-of-the-envelope sketch after the list below). Enter the number of Locust users, the spawn rate (users added per second) and the target Kinesis stream name as host.
The default settings:
- Instance: c7g.xlarge
- Number of secondaries: 4
- Batch size: 10
- Host (Kinesis stream name): kinesis-load-test-input-stream (you can change the default host here)
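As a rough illustration of how these settings combine, consider the following sketch; the per-user request rate is an assumption that depends on instance size and network latency, not a project constant:

```python
# Back-of-the-envelope throughput estimate for the default settings.
users = 150            # Locust users entered in the dashboard (illustrative)
batch_size = 10        # default batch size: records per request
requests_per_sec = 10  # per-user request rate (assumption, varies with latency)

events_per_sec = users * batch_size * requests_per_sec
print(events_per_sec)  # 15000, matching the quoted default-configuration peak
```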
If you want to achieve more load, check out the Large scale load testing documentation.
By default, this project generates random temperature sensor readings for every sensor with this format:
{
  "sensorId": "bfbae19c-2f0f-41c2-952b-5d5bc6e001f1_1",
  "temperature": 147.24,
  "status": "OK",
  "timestamp": 1675686126310
}
To adapt it to your needs, the project comes packaged with Faker, which you can use to modify the payload. If you want to use a different payload or payload format, please change it here.
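For example, a generator along these lines could produce the reading shown above. This is a minimal sketch using Faker; the function name, value ranges and status values are illustrative assumptions, not taken from the project:

```python
# Illustrative payload generator in the style of the default sensor reading.
import json
import random
import time

from faker import Faker

fake = Faker()

def generate_sensor_reading(sensor_index: int) -> dict:
    # sensorId combines a random UUID with a sensor index, as in the sample event.
    return {
        "sensorId": f"{fake.uuid4()}_{sensor_index}",
        "temperature": round(random.uniform(-50.0, 150.0), 2),  # range is an assumption
        "status": random.choice(["OK", "WARNING", "ERROR"]),    # statuses are assumptions
        "timestamp": int(time.time() * 1000),                   # epoch milliseconds
    }

print(json.dumps(generate_sensor_reading(1)))
```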
Locust is created on the EC2 instance as a systemd service and can therefore be controlled with systemctl. If you want to change the configuration of Locust on the fly without redeploying the stack, you can connect to the instance via Systems Manager Session Manager (SSM), navigate to the project directory at /usr/local/load-test, change the locust.env file and restart the service by running sudo systemctl restart locust.
In order to not incur any unnecessary costs, you can simply delete the stack.
cdk destroy
In order to achieve peak performance with Locust and Kinesis, there are a couple of things to keep in mind.
Your performance is bound by the underlying EC2 instance, so check the recommended instances for more information about scaling. To set the correct instance size, you can configure the instance size here.
Locust benefits from a distributed setup. The setup therefore spins up multiple secondaries, which do the actual work, and a primary, which coordinates them. To fully leverage the cores, you should specify one secondary per core; you can configure the number here. The launch is sketched below.
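Conceptually, the distributed startup looks roughly like this. This is an illustrative sketch, not the project's start-locust.sh; the real script runs via shell and may pass additional flags:

```python
# Illustrative sketch: start one Locust primary plus one worker per core.
import os
import subprocess

workers = os.cpu_count() or 1

# The primary serves the dashboard and coordinates the workers.
processes = [subprocess.Popen(["locust", "-f", "locust-load-test.py", "--master"])]

# Workers connect to the primary on localhost by default and generate the load.
for _ in range(workers):
    processes.append(subprocess.Popen(["locust", "-f", "locust-load-test.py", "--worker"]))

for p in processes:
    p.wait()
```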
The number of Kinesis events you can send per Locust user is limited by the resource overhead of switching between Locust users and, with them, threads. To overcome this, you can configure a batch size that defines how many users are simulated per Locust user; their records are sent as a single Kinesis put_records call. You can configure the number here.
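A minimal Locust user implementing this batching idea might look as follows. This is a sketch under assumptions: the class and variable names are hypothetical, and the project's actual locust-load-test.py likely also reports request statistics back to Locust:

```python
# Sketch: each Locust user sends BATCH_SIZE records per put_records call.
import json
import os
import time
import uuid

import boto3
from locust import User, constant, task

BATCH_SIZE = int(os.environ.get("BATCH_SIZE", "10"))  # assumed to mirror locust.env

class KinesisBatchUser(User):
    wait_time = constant(0)  # send as fast as the instance allows

    def on_start(self):
        self.kinesis = boto3.client("kinesis")

    @task
    def send_batch(self):
        records = [
            {
                "Data": json.dumps({
                    "sensorId": f"{uuid.uuid4()}_{i}",
                    "temperature": 42.0,
                    "status": "OK",
                    "timestamp": int(time.time() * 1000),
                }).encode("utf-8"),
                "PartitionKey": str(uuid.uuid4()),
            }
            for i in range(BATCH_SIZE)
        ]
        # self.host carries the stream name entered in the Locust dashboard.
        self.kinesis.put_records(StreamName=self.host, Records=records)
```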
The load test was achieved with these settings:
- Region: eu-central-1 (Frankfurt)
- Instance: c7g.16xlarge
- Number of secondaries: 64
- Batch size: 500
Kinesis load testing with Locust is a compute- and network-intensive workload that can also run on ARM for better cost-efficiency. C7g is the latest compute-optimized instance family, running on ARM-based Graviton processors.
Below you can find an overview of possible EC2 instances that you could select based on the peak load requirements of your test. If you want, you can change the instance type in CDK.
| API Name | vCPUs | Memory | Network Performance | On-Demand cost | Spot minimum cost |
|---|---|---|---|---|---|
| c7g.large | 2 | 4.0 GiB | Up to 12.5 Gigabit | $0.0725 hourly | $0.0466 hourly |
| c7g.xlarge | 4 | 8.0 GiB | Up to 12.5 Gigabit | $0.1450 hourly | $0.0946 hourly |
| c7g.2xlarge | 8 | 16.0 GiB | Up to 15 Gigabit | $0.2900 hourly | $0.1841 hourly |
| c7g.4xlarge | 16 | 32.0 GiB | Up to 15 Gigabit | $0.5800 hourly | $0.4188 hourly |
| c7g.8xlarge | 32 | 64.0 GiB | 15 Gigabit | $1.1600 hourly | $0.7469 hourly |
| c7g.12xlarge | 48 | 96.0 GiB | 22.5 Gigabit | $1.7400 hourly | $1.0743 hourly |
| c7g.16xlarge | 64 | 128.0 GiB | 30 Gigabit | $2.3200 hourly | $1.4842 hourly |
The pricing information is for us-east-1 (N. Virginia). To find the most accurate pricing information, see here.
Please keep in mind that not only the EC2 instance costs, but especially the data transfer costs, contribute to your bill.
docs/ -- Contains project documentation
infrastructure/ -- Contains the CDK infrastructure definition
load-test/ -- Contains all the load testing scripts
├── locust-load-test.py -- Locust load testing definition
├── locust.env -- Locust environment variables
├── requirements.txt -- Defines all needed Python dependencies
├── start-locust.sh -- Script to start Locust in distributed mode
└── stop-locust.sh -- Script to stop all Locust workers and the master
For simplicity and to reduce waiting times during the initial setup, the default VPC is used. If you want to create your own custom VPC or reuse an already defined VPC, you can change it here.
The setup also requires that the EC2 instance is accessible on port 8089 inside this VPC, so make sure an Internet Gateway is set up and nothing restricts access to that port (e.g. NACLs). An optional reachability check is sketched below.
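A quick, optional way to confirm the dashboard port is reachable from your machine; the IP address is a placeholder for your instance's public address:

```python
# Optional: check that the Locust dashboard port is reachable.
import socket

# Replace 1.2.3.4 with the address from the locustdashboardurl CDK output.
sock = socket.create_connection(("1.2.3.4", 8089), timeout=5)
print("Port 8089 is reachable")
sock.close()
```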
This is a sample solution intended as a starting point and should not be used in a productive setting without thorough analysis and considerations on the user's side.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.