Preferred nodes #139
base: master
Conversation
If a cluster is distributed geographically, it can be preferable to query the local nodes. For read-only commands, we have the possibility to choose a subset of the cluster nodes that will be prioritized when choosing a node. A specific local slave can therefore be used.
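For illustration, a corvus-style configuration with such a preference might look roughly like the sketch below; the `preferred_nodes` option name and its syntax are placeholders for illustration, not necessarily what this PR adds.

```
# corvus.conf (sketch; the preferred_nodes option name and syntax are illustrative)
bind 12345
node 10.0.1.10:7000,10.0.1.11:7000,10.0.2.10:7000
thread 4
# hypothetical: read-only commands would be routed to these local nodes first
preferred_nodes 10.0.1.10:7000,10.0.1.11:7000
```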
That's cool. If I'm not mistaken, you built a single redis cluster across multiple data centers. Before we dive into the implementation, I'm really curious about how you can build a geographically distributed redis cluster. As far as I know, due to the unreliable network between data centers, the slaves in such a geographically distributed cluster will keep getting promoted to master when suffering high network latency. It's just not reliable to simply deploy the unmodified official redis cluster across multiple data centers. The only team I know of that uses this approach made significant changes to the original redis source code and reimplemented the failure detection and master promotion themselves. So how can you do that? Does it run reliably in your production environment?
We have as many redis clusters as we have DCs, with the masters for a given cluster in a single DC and the data replicated to slaves in all the other DCs.
Indeed, because of the possible network issues between the DCs, we had to disable slave promotion by setting cluster-node-timeout to a very high value.
That way, we are sure the masters for a cluster stay in the same DC.
Now, of course this means our clusters won’t heal in case a master goes down but this is acceptable to us.
We have applications in each DC that need to read data from the clusters and preferably from the local nodes in order to keep the latency down.
Hence the PR that lets us configure corvus in each DC with the list of preferred nodes.
I understand our use-case is a little special but I believe others could find this useful.
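In case it helps to picture the setup, here is a minimal redis.conf sketch of the knob mentioned above; the concrete value is just an example of "a very high value", not what we actually run.

```
# redis.conf on every cluster node (sketch)
cluster-enabled yes
# cluster-node-timeout is in milliseconds; an hour here effectively
# disables automatic slave promotion across DCs
cluster-node-timeout 3600000
```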
To be honest, I'm not in favor of doing this.
There are also some great tools for you: please check them out.
Here's one more: https://github.com/CodisLabs/redis-port
The problem with having a local cluster that is a copy of the master cluster is that we lose the benefit of the multiple slaves that redis cluster provides.
In our current configuration, each master has a slave in every other DC.
With the changes to corvus, we’ll be reading from the local nodes (master or slave) but if one of them goes down, corvus will automatically switch to another node (in another DC).
The system will be in degraded mode because of the extra network latency but at least it can cope with the loss of one or more nodes.
Having completely separate clusters that are replicated doesn’t provide that flexibility.
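The fallback behaviour described above can be sketched in a few lines of C; the struct and function names below are invented for illustration and are not the actual corvus implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative only: these types and names are made up, not the actual corvus code. */
struct node_addr {
    const char *host;
    int port;
    bool alive;
};

/* Pick a node to read from: prefer an alive node from the preferred (local)
 * set, otherwise fall back to any alive node, even one in another DC
 * (degraded mode with extra latency). Returns an index, or -1 if no node
 * is available at all. */
static int pick_read_node(const struct node_addr *nodes, size_t n,
                          bool (*is_preferred)(const struct node_addr *))
{
    int fallback = -1;
    for (size_t i = 0; i < n; i++) {
        if (!nodes[i].alive)
            continue;
        if (is_preferred(&nodes[i]))
            return (int)i;          /* local node available: use it */
        if (fallback < 0)
            fallback = (int)i;      /* remember a remote node just in case */
    }
    return fallback;
}
```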
Yes, it might be easier to just use a single cluster across multiple DCs for the one-to-many replication. Further, the frequently exchanged gossip packets get out of hand in large clusters with hundreds of nodes, which eats into the bandwidth between DCs. Building your own replication tool with a large queue gives better performance and availability, and as far as I know it is what most teams building multi-DC replication have chosen. Your solution is easy to build but lacks multiple backup slaves in the DCs that receive the data. I suggest trying to build your own.
Allow preferred nodes to be configured
If a cluster is distributed geographically, it can be preferable to query the local nodes. For read-only commands, preferred nodes let us choose a subset of the cluster nodes that will be prioritized when choosing a node. A specific local slave can therefore be used.