Capacity planning is missing from documentation #103

Open
mkgvt opened this issue Nov 16, 2021 · 3 comments

@mkgvt

mkgvt commented Nov 16, 2021

The topic of capacity planning is missing from the documentation, or at least I cannot find it. Right after "getting started", the next thing someone needs to do is decide what resources to purchase in order to handle a given rate of network traffic: "how many workers (cores) and how much RAM are needed to process 10 Gbps without dropped packets?" Also, what are the gotchas and best practices that should be followed (with some discussion of why, so that intuition can be built up quickly)? I expect the information is out in the community, but it isn't readily accessible. The documentation needs to be improved by those who have experience so that those of us without it can avoid costly mistakes and deploy successfully.

@0xxon
Member

0xxon commented Nov 17, 2021

I certainly agree that this is a missing topic in the documentation. However, there is a reason why this topic is missing from the manual: it is very hard (or even impossible) to make generalized recommendations. There are a lot of moving parts that can heavily influence the hardware requirements.

For one, this depends on your network traffic and the protocol mix that you see on it. Some network traffic is easy to analyze; other traffic requires complicated parsing for each packet (e.g. DNS). So if you have, for example, a DNS-heavy network, you might suddenly need significantly more hardware. If you have something like HTTP, the average request size makes a difference. And, obviously, any additional scripts that you add have a performance impact. The amount of RAM that you require similarly has a couple of dependencies.

Then, different hardware has different performance characteristics, so you can't even just go by, say, CPU clock speed: older CPUs with the same clock speed can be significantly slower, enabled or disabled attack mitigations can affect this, etc. Similarly, the way in which you acquire your packets has some impact on speed, and not all ways of acquiring packets are suitable for all environments. The OS you deploy on can also affect the speed.
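
Given all of these dependencies, any estimate has to start from a measurement on your own traffic and your own hardware. The following is only a rough sketch of the arithmetic, not an official sizing method: the function name, the `measured_mbps_per_worker` input, and the `headroom` fraction are all hypothetical, and the per-worker rate is something you would have to measure yourself for your protocol mix, scripts, and packet acquisition method.

```python
import math


def estimate_workers(link_gbps, measured_mbps_per_worker, headroom=0.3):
    """Estimate a worker count from a locally measured per-worker rate.

    link_gbps: peak traffic rate to be analyzed, in Gbps.
    measured_mbps_per_worker: sustained rate one worker handled without
        drops in YOUR environment (assumed input, not a published figure).
    headroom: fraction of each worker's capacity kept in reserve for
        bursts and shifts in the protocol mix (assumed safety margin).
    """
    if link_gbps <= 0 or measured_mbps_per_worker <= 0:
        raise ValueError("rates must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")

    # Capacity of a single worker after reserving the headroom.
    usable_mbps = measured_mbps_per_worker * (1 - headroom)

    # Round up: a fractional worker still needs a whole core.
    return math.ceil(link_gbps * 1000 / usable_mbps)
```

For example, `estimate_workers(10, 500, headroom=0.5)` treats each worker as good for 250 Mbps and returns 40. The 500 Mbps figure is purely a placeholder; a DNS-heavy network or extra scripts could lower the measured rate considerably, which is why the input has to come from your own testing.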

On top of everything, with new hardware coming out, drivers being updated, etc., all of this changes all the time, so any recommendation would have to be updated regularly. Since we (as the developers of Zeek) mostly don't run huge deployments of Zeek ourselves, we are also often unaware of the hardware requirements on modern hardware with "average" traffic, so it's hard for us to do that.

In the past, some people have posted performance numbers on our mailing list. However, I am not aware of that happening recently.

Actually, @rsmmr, since we now have the testing subgroup: do you think it would be possible for them to anonymously post their hardware setups somewhere, to give people an indication of what is used in the real world? We could then point to that from the documentation.

@mkgvt
Author

mkgvt commented Nov 17, 2021 via email

@timwoj timwoj transferred this issue from zeek/zeek Feb 22, 2022
@timwoj
Member

timwoj commented Feb 22, 2022

I transferred this issue over to the zeek-docs repo, since it makes more sense over here.
