
Question: What happens if a spegel node is down? #440

Closed
guettli opened this issue Apr 17, 2024 · 6 comments · Fixed by #443
Labels
enhancement New feature or request

Comments

@guettli
Contributor

guettli commented Apr 17, 2024

Describe the problem to be solved

If a spegel node goes down, then the images stored on it will be unreachable.

Does spegel distribute the images across several hosts, so that an outage of one node has no impact?

I read the docs, forgive me if I was blind, but I found nothing about redundant storage of images.

Proposed solution to the problem

A new section in the documentation explaining what happens when a spegel node is down would be great.

If the answer is "there is no redundancy; if one node goes down, then those images need to be fetched from upstream again", that is also fine (and should be part of the docs).

@guettli guettli added the enhancement New feature or request label Apr 17, 2024
@danielloader

From my understanding you're correct: if a node goes down (spot instance reclaimed, node goes offline, the daemonset loses a pod), then spegel will just pull via the spegel pod on the same node, and then advertise that the image is available to the other instances once you initiate a pull on another node.

@bittrance
Contributor

bittrance commented Apr 17, 2024

UPDATE: note that a better version follows below.

Indeed, spegel does not currently do any proactive replication of pulled images, see #375. And as danielloader says, the consequence of peer failure is relatively benign. Here is an attempt at a FAQ answer for this question:

In the interval between a spegel peer failing (e.g. node death) and the consensus algorithm agreeing that the peer is dead, other spegel peers may try to forward requests to the failed peer, delaying the response to the pulling client. In benign scenarios, this delay is the length of an intra-cluster round trip, likely <1ms. Of course, there are less benign scenarios (e.g. inter-node packet loss) where no replies will come back and spegel's forwarder will eventually time out before moving on to the next available instance. Spegel does not specify the various options (primarily timeout and dialOptions) to its internal containerd client and depends on defaults as set in https://github.com/containerd/containerd/blob/8317959018015f6a1756ec8cd08be1093fd630a2/client/client.go#L87. Similarly, spegel depends on libp2p's default algorithm and options for detecting dead peers. The exact length of the window between failure and consensus is too dependent on failure mode to state with confidence, but I would expect it to be 1-60s in 95% of cases.

Please note that a client is likely to request several layers in parallel, and spegel will try to spread its forwards across the peers that announce a particular layer, so the benign scenario is unlikely to impact pod startup time. Only when multiple spegel instances fail simultaneously, or when an image dominated by one large layer is affected, is pod startup time materially increased.

spegel's documentation currently does not have a detailed text description of the pull flow. @guettli, do you feel we should have this level of detail in the README, or would you have found it if we put it in the FAQ?

@bittrance
Contributor

bittrance commented Apr 17, 2024

Sorry, the above description is slightly confused. Actually, one spegel peer forwarding to another spegel peer will use a httputil.ReverseProxy (not the containerd.Client as the text above implies) which uses a http.DefaultTransport (see https://cs.opensource.google/go/go/+/master:src/net/http/transport.go;l=43) and will time out accordingly. The scenario above may of course also occur if spegel cannot talk to its local containerd domain socket, but that failure is likely to be instant, unless containerd misbehaves in some inspired way. Better version:

In the interval between a spegel peer failing (e.g. node death) and the other peers deciding that it is dead, other spegel peers may try to forward requests to the failed peer, delaying the response to the pulling client. In benign scenarios, this delay is the length of an intra-cluster round trip (the HTTP request and an ICMP unreachable response), likely <1ms. Of course, there are less benign scenarios (e.g. inter-node packet loss) where no replies will come back and spegel's forwarder will eventually time out before moving on to the next available peer. Spegel uses the standard library's httputil.ReverseProxy to forward requests, which in turn depends on DefaultTransport to decide how long to wait before giving up. Similarly, spegel depends on libp2p's default algorithm and options for detecting dead peers. The exact length of the window between failure and eviction can vary, but the max TTL for a resolved peer is currently 10 minutes, so that should be the upper bound.

Please note that a client is likely to request several layers in parallel, and spegel will try to spread its forwards across the peers that announce a particular layer, so the benign scenario is unlikely to impact pod startup time. Only when multiple spegel instances fail simultaneously, or when an image dominated by one large layer is affected, is pod startup time materially increased.

@guettli
Contributor Author

guettli commented Apr 17, 2024

@bittrance it would be great to have this in the FAQ. Thank you!

@phillebaba
Member

I will have a look at #443 tomorrow, but overall what @bittrance stated is true.

I have been looking at future solutions for doing preemptive distribution of images to make sure that replication is >1. This will most likely be a feature in Spegel in the future, but I don't know when, or how it will look. There are a lot of aspects to take into account when building these features, and I want to hit as many use cases with as small changes as possible.

@guettli
Contributor Author

guettli commented Apr 18, 2024

@phillebaba great to hear your plans. At the moment this question was mostly about missing documentation. Thank you.
