
Why did TPS drop after putting corvus in front of a redis_cluster? #150

Open
sccotJiang opened this issue Nov 6, 2019 · 15 comments

@sccotJiang

sccotJiang commented Nov 6, 2019

I created 3 master nodes and 3 slave nodes, and configured corvus as follows:
bind 12345
node 127.0.0.1:6379,127.0.0.1:6380,127.0.0.1:6381
thread 4
After starting it I ran redis-benchmark and found that TPS dropped.
Without corvus: GET: 33046.93 requests per second
With corvus: GET: 5262.88 requests per second
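For reference, a comparison like the one above would typically be run with something along these lines; the ports are assumptions based on the configuration quoted here (6379 for one redis node, 12345 for the corvus bind port), not taken from the report:

redis-benchmark -h 127.0.0.1 -p 6379 -t get -n 100000     # direct to a single redis node
redis-benchmark -h 127.0.0.1 -p 12345 -t get -n 100000    # through corvus

Note that, as discussed further down in this thread, redis-benchmark before 6.0 is not cluster-aware, so the direct-to-node numbers may partly reflect fast MOVED replies rather than real work.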

@whk-isat

whk-isat commented Dec 6, 2019

I ran into the same problem: going through the corvus proxy is much worse than accessing the cluster directly. How did you end up solving it?

@sccotJiang
Author

Not solved yet.

@jasonjoo2010
Contributor

jasonjoo2010 commented Dec 30, 2019

In a situation like this you need to investigate concretely; there are generally three main directions:

Ping (latency)

Because a middle layer has been added, you have to look at the actual path: it changes from client -> redis to client -> corvus -> redis. A simple test is to issue a fixed number of calls single-threaded (for example 10000) against each path and compare the averages (see the sketch after this list).

If the latency has increased noticeably, solve that first; note that the concurrency capacity itself does not change much in that case.

Bandwidth

Same approach as for the latency check.

Multiple instances

corvus is designed to be stateless, so you can run multiple instances behind a load balancer, for example three instances sharing the load, and then run the benchmark again in that setup.
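A minimal way to compare the two paths is redis-cli's built-in latency mode; the addresses below are assumptions taken from the original report (a redis node on 6379, corvus bound to 12345), so adjust them to your setup:

redis-cli -h 127.0.0.1 -p 6379 --latency      # client -> redis, direct
redis-cli -h 127.0.0.1 -p 12345 --latency     # client -> corvus -> redis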

@savagecm

savagecm commented Jan 13, 2020

Maybe it's because of Redis pipelining?

// wait for all cmds in cmd_queue to be done

For example, if Corvus receives 3 commands in a pipeline, it will wait for the responses to all three commands before sending anything back to the client.

We did some tests: if we do not use this command queue to wait for all the responses, performance improves a lot.

So could we change the code to remove the wait on the command queue and reply to the client immediately?
Or how can we contribute?
@jasonjoo2010

@jasonjoo2010
Contributor

> Maybe it's because of Redis pipelining?
>
> // wait for all cmds in cmd_queue to be done
>
> For example, if Corvus receives 3 commands in a pipeline, it will wait for the responses to all three commands before sending anything back to the client.
>
> We did some tests: if we do not use this command queue to wait for all the responses, performance improves a lot.
>
> So could we change the code to remove the wait on the command queue and reply to the client immediately?
> Or how can we contribute?
> @jasonjoo2010

I think things are not so simple.

I just did a local test with the 6 nodes the original author mentioned and only ONE corvus proxy (also local) with 4 threads, running ONLY the "get" and "set" tests (I will explain why later), and the results are acceptable:

4 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 4 -n 1000000 -t get,set

====== SET ======
  1000000 requests completed in 21.22 seconds
  4 parallel clients
  3 bytes payload
  keep alive: 1

47123.13 requests per second

====== GET ======
  1000000 requests completed in 17.19 seconds
  4 parallel clients
  3 bytes payload
  keep alive: 1

58180.12 requests per second

10 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 10 -n 1000000 -t get,set

====== SET ======
  1000000 requests completed in 13.09 seconds
  10 parallel clients
  3 bytes payload
  keep alive: 1

76382.52 requests per second

====== GET ======
  1000000 requests completed in 11.92 seconds
  10 parallel clients
  3 bytes payload
  keep alive: 1

83899.66 requests per second

20 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 20 -n 1000000 -t get,set

====== SET ======
  1000000 requests completed in 11.78 seconds
  20 parallel clients
  3 bytes payload
  keep alive: 1

84904.06 requests per second

====== GET ======
  1000000 requests completed in 10.20 seconds
  20 parallel clients
  3 bytes payload
  keep alive: 1

98048.83 requests per second

For comparison I also tested against the redis nodes directly and put the numbers into a table (requests per second):

GET:

clients  direct redis  via corvus
4        108448.11     58180.12
10       113019.90     83899.66
20       111969.55     98048.83

SET:

clients  direct redis  via corvus
4        102997.23     47123.13
10       108389.34     76382.52
20       112019.72     84904.06

The throughput is suspiciously constant when connecting to redis directly.
Why?
Because redis-benchmark doesn't support cluster until 6.0, and when I tried 6.0-rc1 it core dumped. I didn't dig deeper because the results through the proxy make sense to me (though they were run locally).

I think the person who submitted this issue may not really have put load on redis (most replies were MOVED errors, which are fast).

What really needs attention is the key point I mentioned in my previous post: don't overlook the physical latency added along the transfer route.
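For reference, the cluster-aware benchmark mentioned above (added in Redis 6.0) would be invoked roughly like this; this is a sketch, and the flag should be checked against your redis-benchmark version:

redis-benchmark --cluster -h 127.0.0.1 -p 6380 -c 20 -n 1000000 -t get,set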

@jasonjoo2010
Contributor

And the cluster bootstrap commands are attached for reference:

redis-server --port 6380 --cluster-enabled yes --cluster-config-file nodes-6380.conf
redis-server --port 6381 --cluster-enabled yes --cluster-config-file nodes-6381.conf
redis-server --port 6382 --cluster-enabled yes --cluster-config-file nodes-6382.conf

redis-server --port 6383 --cluster-enabled yes --cluster-config-file nodes-6383.conf
redis-server --port 6384 --cluster-enabled yes --cluster-config-file nodes-6384.conf
redis-server --port 6385 --cluster-enabled yes --cluster-config-file nodes-6385.conf

redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6381
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6382
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6383
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6384
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6385


redis-cli -h 127.0.0.1 -p 6380 cluster addslots $(seq 0 5000)
redis-cli -h 127.0.0.1 -p 6381 cluster addslots $(seq 5001 11000)
redis-cli -h 127.0.0.1 -p 6382 cluster addslots $(seq 11001 16383)

redis-cli -h 127.0.0.1 -p 6383 cluster replicate $(redis-cli -h 127.0.0.1 -p 6383 cluster nodes |grep '0-5000' |awk '{print $1}')
redis-cli -h 127.0.0.1 -p 6384 cluster replicate $(redis-cli -h 127.0.0.1 -p 6384 cluster nodes |grep '5001-11000' |awk '{print $1}')
redis-cli -h 127.0.0.1 -p 6385 cluster replicate $(redis-cli -h 127.0.0.1 -p 6385 cluster nodes |grep '11001-16383' |awk '{print $1}')

# the config file cannot be omitted, so I just made a simple config with "bind 6666" in it
corvus -b 6666 -c default -t 4 -n 127.0.0.1:6380,127.0.0.1:6381 a.conf
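For completeness, an equivalent setup can also be expressed entirely in the config file, using the same option names the issue reporter used at the top of this thread (bind, node, thread); this is a sketch, not a verified corvus config:

bind 6666
node 127.0.0.1:6380,127.0.0.1:6381
thread 4

With all options placed in a.conf, the command line above would then reduce to just corvus a.conf.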

@jasonjoo2010
Contributor

> Maybe it's because of Redis pipelining? […] So could we change the code to remove the wait on the command queue and reply to the client immediately? Or how can we contribute?
> @jasonjoo2010

And one more thing: as you mentioned, in my opinion pipelining is not really a suitable scenario for corvus, just like publish/subscribe. We implemented a branch supporting pub/sub, but now we use it on the client side directly (a cluster-aware Jedis client, for example). corvus is useful for legacy projects, but if your client can speak redis cluster directly, just connect directly.

For HA purposes we have an automatic publishing agent that manages the backends in lvs/dpvs [1], so the smart clients in the applications always stay synchronized with the cluster.

If you are in any kind of cloud environment you can write a similar agent that operates the load balancer through the cloud API to keep the VIP updated.

[1] http://github.com/iqiyi/dpvs

@savagecm

> with 6 nodes which the original author mentioned and only ONE corvus pro

I mean that when you use redis-benchmark it sends messages very fast, which triggers the redis pipelining behavior: the benchmark sends several RESP messages at once, and Corvus waits until all the responses arrive.

@jasonjoo2010
Contributor

> I mean that when you use redis-benchmark it sends messages very fast, which triggers the redis pipelining behavior: the benchmark sends several RESP messages at once, and Corvus waits until all the responses arrive.

Nope:

 -P <numreq>        Pipeline <numreq> requests. Default 1 (no pipeline).

By default redis-benchmark does not enable pipelining.
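To deliberately exercise pipelining through the proxy you have to opt in with -P; for example (a sketch, with an arbitrary pipeline depth of 16 and the corvus port used in the tests above):

redis-benchmark -h 127.0.0.1 -p 6666 -c 20 -n 1000000 -t get,set -P 16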

@jasonjoo2010
Contributor

And what I do agree with you on is that it's better to connect to the redis cluster directly when we want to use this feature.

@savagecm

> Nope:
>
>  -P <numreq>        Pipeline <numreq> requests. Default 1 (no pipeline).
>
> By default redis-benchmark does not enable pipelining.

I just want to mention that the code I pointed to will hold several requests and so increases latency. Corvus gets better performance once that logic is changed.

I am asking the maintainers whether they can change it, or whether I can contribute the change. You can make the same change yourself to test whether you get better performance.

@jasonjoo2010
Contributor

> I just want to mention that the code I pointed to will hold several requests and so increases latency. Corvus gets better performance once that logic is changed.
>
> I am asking the maintainers whether they can change it, or whether I can contribute the change. You can make the same change yourself to test whether you get better performance.

Yeah, you're right here. It would be great if you could shoot a patch for it.

@savagecm

> Yeah, you're right here. It would be great if you could shoot a patch for it.

Our change may introduce bugs, so maybe we should ask whether the maintainers could make the fix.

Or maybe they have their reasons for doing it this way...

Btw, who is the active maintainer of this repo now?

@jasonjoo2010
Contributor

> Our change may introduce bugs, so maybe we should ask whether the maintainers could make the fix.
>
> Or maybe they have their reasons for doing it this way...
>
> Btw, who is the active maintainer of this repo now?

As far as I know they have moved to a semi-client-side layer, something like a sidecar in container environments, which is closer to connecting to the cluster directly. We can cc @tevino

@tevino
Contributor

tevino commented May 27, 2020

Yes, it's here, take a look.
