Capacity planning is missing from documentation #103
Comments
I certainly agree that this topic is missing from the documentation. However, there is a reason why it is missing from the manual: it is very hard (or even impossible) to make generalized recommendations. There are a lot of moving parts that can heavily influence the hardware requirements.
For one, the requirements depend on your network traffic and the protocol mix that you see on it. Some network traffic is easy to analyze; other traffic requires complicated parsing for each packet (e.g. DNS). So if you have, for example, a DNS-heavy network, you might suddenly need significantly more hardware. If you have something like HTTP, the average request size makes a difference. And, obviously, any additional scripts that you add have a performance impact. The amount of RAM that you require similarly has a couple of dependencies.
Then, different hardware has different performance characteristics, so you can't even just go by, e.g., CPU clock speed: older CPUs with the same clock speed can be significantly slower, enabled or disabled attack mitigations can affect this, etc. Similarly, the way in which you acquire your packets has some impact on speed, and not all ways of acquiring packets are suitable for all environments. The OS you deploy on can also affect the speed.
On top of everything, with new hardware coming out, drivers being updated, etc., all of this changes all the time, so any recommendation would have to be updated regularly. Since we (as the developers of Zeek) mostly don't run huge deployments of Zeek ourselves, we are also often unaware of the hardware requirements on modern hardware with "average" traffic, so it's hard for us to do that. In the past, some people have posted performance numbers on our mailing list; however, I am not aware of that happening recently.
Actually @rsmmr, since we now have the testing subgroup, do you think it would be possible for them to anonymously post their hardware setups somewhere, to give people an indication of what is used in the real world? We could then point to that from the documentation.
On Wed, Nov 17, 2021 at 5:34 AM Johanna ***@***.***> wrote:
> Actually @rsmmr <https://github.com/rsmmr> - since we now have the
> testing subgroup - do you think it would be possible for them to
> anonymously post their hardware setups somewhere, to give people an
> indication of what is used in the real world? That we could point to from
> the documentation.
Thanks for the thoughtful reply, Johanna. A collection of rules of thumb and
example cases would be really nice. So would deployment white papers describing
various installations, the decision process used, and the outcome. A body
of practical, rubber-meets-the-road use cases like that would help
illuminate the decision process.
I struggled to specify hardware four years ago when we set up equipment at
my institution, and I am struggling again now that I've been tasked with
specifying equipment for deployment at sister institutions. Gross
over-building is one obvious solution, but as always the budget is not
infinite. Under-building is also bad. Seeing what worked (or not) for other
deployments would be very useful. I'd be glad to document what we did and
why if there were a place to put it. Hopefully others will do the same so
that, over time, we can accrue the information that is currently lacking.
Mark
I transferred this issue over to the zeek-docs repo, since it makes more sense over here.
The topic of capacity planning is missing from the documentation, or at least I cannot find it. Right after "getting started", the next thing someone needs to do is decide what resources to purchase in order to process a given rate of network traffic: "how many workers (cores) and how much RAM are needed to process 10 Gbps without dropped packets?". Also, what are the gotchas and best practices that should be followed (with some discussion of why, so that intuition can be built up quickly)? I expect the information is out in the community, but it isn't readily accessible. The documentation needs to be improved by those who have experience so that those of us without experience can avoid costly mistakes and deploy successfully.
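As a rough illustration of the arithmetic behind that question (not a recommendation): assuming you have already measured how much traffic a single worker sustains on your own hardware with your own traffic mix, the worker count follows by division. The function name `estimate_workers`, its parameters, and the headroom factor are hypothetical and chosen only for this sketch. Nothing here comes from Zeek itself, and the numbers in the usage line are placeholders, not measurements.

```python
import math


def estimate_workers(link_gbps, per_worker_mbps, headroom=0.0):
    """Back-of-envelope worker count for a given traffic rate.

    link_gbps       -- peak traffic rate to analyze, in Gbps
    per_worker_mbps -- throughput one worker sustains without drops,
                       in Mbps; ASSUMPTION: you measured this yourself
                       on your hardware and protocol mix
    headroom        -- fraction of each worker's capacity to keep in
                       reserve for bursts (0.0 = none); hypothetical knob
    """
    if link_gbps < 0:
        raise ValueError("link_gbps must be non-negative")
    if per_worker_mbps <= 0:
        raise ValueError("per_worker_mbps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")

    required_mbps = link_gbps * 1000
    usable_mbps = per_worker_mbps * (1.0 - headroom)
    # Round up: a fractional worker still needs a whole core.
    return math.ceil(required_mbps / usable_mbps)


# Placeholder inputs only; substitute your own measurements.
print(estimate_workers(10, per_worker_mbps=500, headroom=0.5))
```

The estimate is only as good as the measured per-worker figure, which shifts with protocol mix, loaded scripts, CPU generation, packet acquisition method, and OS. It does not cover RAM at all.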