Hardware requirements #3852
baek-sang
started this conversation in
Anything Else
Replies: 1 comment
-
The following is based on 2023 event data for https://plausible.io/plausible.io (taken from the staging mirror) There are two main tables that store event data, events_v2 (which is close to the raw events) and sessions_v2. Everything else is mostly irrelevant as it doesn't grow as events come in. select hostname from events_v2 limit 1;
-- ┌─hostname─────┐
-- │ plausible.io │
-- └──────────────┘
select min(timestamp), max(timestamp) from events_v2;
-- ┌──────min(timestamp)─┬──────max(timestamp)─┐
-- │ 2023-02-09 13:54:27 │ 2024-01-23 12:25:04 │
-- └─────────────────────┴─────────────────────┘
with e as (select toYYYYMM(timestamp) as month, count(*) as events from events_v2 group by month),
p as (select partition, bytes_on_disk from system.parts where table='events_v2' and active)
select month, events, formatReadableSize(bytes_on_disk), formatReadableSize(bytes_on_disk / events) from e
left join p on toString(e.month) = p.partition
order by month asc;
-- ┌──month─┬──events─┬─formatReadableSize(bytes_on_disk)─┬─formatReadableSize(divide(bytes_on_disk, events))─┐
-- │ 202302 │ 446740 │ 5.01 MiB │ 11.77 B │
-- │ 202303 │ 835341 │ 10.66 MiB │ 13.38 B │
-- │ 202304 │ 601907 │ 7.92 MiB │ 13.80 B │
-- │ 202305 │ 825908 │ 8.99 MiB │ 11.41 B │
-- │ 202306 │ 773004 │ 9.55 MiB │ 12.95 B │
-- │ 202307 │ 757485 │ 9.61 MiB │ 13.31 B │
-- │ 202308 │ 1009596 │ 10.64 MiB │ 11.06 B │
-- │ 202310 │ 3635510 │ 19.54 MiB │ 5.64 B │
-- │ 202311 │ 5158640 │ 27.36 MiB │ 5.56 B │
-- │ 202312 │ 4320582 │ 22.42 MiB │ 5.44 B │
-- │ 202401 │ 2591880 │ 12.61 MiB │ 5.10 B │
-- └────────┴─────────┴───────────────────────────────────┴───────────────────────────────────────────────────┘
with e as (select toYYYYMM(timestamp) as month, count(*) as sessions from sessions_v2 group by month),
p as (select partition, bytes_on_disk from system.parts where table='sessions_v2' and active)
select month, sessions, formatReadableSize(bytes_on_disk), formatReadableSize(bytes_on_disk / sessions) from e
left join p on toString(e.month) = p.partition
order by month asc;
-- ┌──month─┬─sessions─┬─formatReadableSize(bytes_on_disk)─┬─formatReadableSize(divide(bytes_on_disk, sessions))─┐
-- │ 202302 │ 96036 │ 3.73 MiB │ 40.78 B │
-- │ 202303 │ 222797 │ 8.69 MiB │ 40.90 B │
-- │ 202304 │ 170025 │ 6.59 MiB │ 40.66 B │
-- │ 202305 │ 177303 │ 6.90 MiB │ 40.79 B │
-- │ 202306 │ 196727 │ 7.66 MiB │ 40.83 B │
-- │ 202307 │ 201426 │ 7.88 MiB │ 41.03 B │
-- │ 202308 │ 198089 │ 7.81 MiB │ 41.35 B │
-- │ 202310 │ 363769 │ 13.52 MiB │ 38.96 B │
-- │ 202311 │ 518834 │ 19.12 MiB │ 38.64 B │
-- │ 202312 │ 417271 │ 15.13 MiB │ 38.03 B │
-- │ 202401 │ 221666 │ 8.26 MiB │ 39.07 B │
-- └────────┴──────────┴───────────────────────────────────┴─────────────────────────────────────────────────────┘ As for your question, I'd need to extrapolate. It seems like MAU would roughly correspond to sessions (see https://plausible.io/docs/metrics-definitions#unique-visitors for the difference) 100M sessions would be ~4GiB with avg as (
select (select sum(bytes_on_disk) from system.parts where table='sessions_v2' and active) /
(select count(*) from sessions_v2) as bytes)
select formatReadableSize(100000000 * bytes) from avg;
-- ┌─formatReadableSize(multiply(100000000, bytes))─┐
-- │ 3.69 GiB │
-- └────────────────────────────────────────────────┘ and since each session has on average (for plausible.io) 8 events select (select count(*) from events_v2) / (select count(*) from sessions_v2);
-- ┌─divide(_subquery31, _subquery32)─┐
-- │ 7.527665975919765 │
-- └──────────────────────────────────┘ it would add an additional ~5GiB with avg as (
select (select sum(bytes_on_disk) from system.parts where table='events_v2' and active) /
(select count(*) from events_v2) as bytes)
select formatReadableSize(7.5 * 100000000 * bytes) from avg;
-- ┌─formatReadableSize(multiply(multiply(7.5, 100000000), bytes))─┐
-- │ 5.04 GiB │
-- └───────────────────────────────────────────────────────────────┘ So MAU 100M = 10GB |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
this
Is there a guide for SSD (HDD) capacity?(postgresql, clickhouse)
I was wondering how much capacity I would need roughly depending on how many users I have on my site. ex) MAU 100M = 100G
Beta Was this translation helpful? Give feedback.
All reactions