The `net` log writer is great for infrastructure where logs from all systems sink into a centralized system. However, the current implementation is bitten by the following fallacies of distributed systems [1]:
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- Transport cost is zero
We've heard at least one report of pain when the network misbehaves, #4083. While timeouts and redialing resolved that earlier issue, the writer still isn't robust against the fallacies above. Log writes can still be slowed by network latency, they still depend on the network's bandwidth, and network misbehavior can still degrade Caddy's performance.
To get around the fallacies without impacting Caddy's performance, I propose we introduce a write-ahead log (WAL) to the `net` writer. The WAL will be placed in the data directory. Log writes are written synchronously to the WAL, and an asynchronous reader picks up entries from the WAL and writes them to the network. On open, the `net` writer opens the WAL and checks for unwritten entries that still need to be synchronized upstream. On close, the writer should attempt to flush/drain all remaining entries in the WAL.
With this implementation, Caddy will not suffer from external network issues affecting the log sink, e.g. a slow network, a slow log ingester, or any of the fallacies listed above.
Have I missed anything?
Footnotes
[1] Fallacies of Distributed Systems