
The slave sync thread model is not reasonable #2637

Closed
cheniujh opened this issue May 7, 2024 · 1 comment · Fixed by #2638
Labels
☢️ Bug Something isn't working

Comments

@cheniujh
Collaborator

cheniujh commented May 7, 2024

Is this a regression?

No

Description

Currently, the thread model a Pika slave uses to consume the binlog works like this:
1. Pika reads sync-thread-num from the conf file and spawns sync-thread-num * 2 worker threads. The first half of these workers are chosen to apply the binlog; the second half are used to apply the writes to the DB.
2. To guarantee consumption order, each DB's binlog is always handled by the same worker. The worker is selected by hashing db_name to get a worker index, so one fixed worker is picked from the first half of the worker vector.
3. After a worker finishes applying the binlog, it hashes the key to get an index, picks a worker from the second half of the worker array, and submits an asynchronous WriteDB task to it.
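
A minimal C++ sketch of this selection scheme (illustrative only; the names here are hypothetical and not taken from Pika's code):

```cpp
#include <functional>
#include <string>
#include <vector>

struct Worker { /* owns a task queue and a worker thread */ };

size_t sync_thread_num = 6;                        // read from the conf file
std::vector<Worker> workers(2 * sync_thread_num);  // step 1: 2x workers

// Step 2: each DB is pinned to one fixed binlog worker in the first half.
Worker& BinlogWorkerFor(const std::string& db_name) {
  size_t idx = std::hash<std::string>{}(db_name) % sync_thread_num;
  return workers[idx];  // always the same worker for a given DB
}

// Step 3: the async WriteDB task goes to the second half, keyed by the row key.
Worker& WriteDBWorkerFor(const std::string& key) {
  size_t idx = std::hash<std::string>{}(key) % sync_thread_num;
  return workers[sync_thread_num + idx];
}
```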

The problems are:
Regarding point 1: users are not told that Pika internally multiplies sync-thread-num by 2, which is arguably inappropriate, and it also means they cannot precisely control the actual thread count. For example, to keep the WriteDB side from having too few threads, the default value of this config item is 6, so Pika ends up with 12 workers in total: the first 6 write the binlog and the last 6 write the DB. With a single DB, 5 of the first 6 workers are idle and will never be used.
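
To make the arithmetic concrete, here is an illustrative pika.conf excerpt (the comments describe the behavior reported above):

```
# pika.conf (illustrative excerpt)
sync-thread-num : 6
# Pika actually spawns 6 * 2 = 12 workers:
#   workers[0..5]   write the binlog (one fixed worker per DB)
#   workers[6..11]  run the async WriteDB tasks
# With a single DB, 5 of workers[0..5] are never used.
```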
Regarding point 2: hashing db_name to get the index is prone to skew. In an actual test with 8 DBs and sync-thread-num set to 8, the hash mapping came out as:
DB 1, 4, and 7 all bind to worker 3;
DB 3 and 6 bind to worker 0;
DB 0 binds to worker 2;
DB 2 binds to worker 7;
DB 5 binds to worker 6.
(Here "bind" means the DB schedules its WriteBinlog tasks only to that worker thread.)
Workers 1, 4, and 5 are completely idle, yet with 8 DBs and 8 workers dedicated to writing the binlog, each DB could perfectly well have its own worker.
This skew not only wastes resources; more importantly, if some DB is temporarily blocked by a WriteStall, the blockage can be amplified, because the shared worker also blocks the WriteBinlog tasks of the other DBs.
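
A quick way to check such a mapping yourself; note that the exact result depends on the standard library's std::hash implementation, and the db0..db7 naming is an assumption:

```cpp
#include <functional>
#include <iostream>
#include <string>

int main() {
  const size_t kSyncThreadNum = 8;  // sync-thread-num = 8, as in the test above
  for (int i = 0; i < 8; ++i) {
    // Hash each DB name onto the binlog workers, as described in step 2.
    std::string db_name = "db" + std::to_string(i);
    size_t worker = std::hash<std::string>{}(db_name) % kSyncThreadNum;
    std::cout << db_name << " -> worker " << worker << '\n';
  }
  return 0;
}
```

Any two names that collide modulo 8 land on the same worker, which is exactly the amplification risk described above.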

Please provide a link to a minimal reproduction of the bug

No response

Screenshots or videos

No response

Please provide the version you discovered this bug in (check about page for version information)

No response

Anything else?

No response

@cheniujh cheniujh added the ☢️ Bug Something isn't working label May 7, 2024
@cheniujh
Collaborator Author

cheniujh commented May 7, 2024

Description:

Currently, Pika's thread model for consuming the binlog on slave nodes is as follows:

  1. Pika reads sync-thread-num from the configuration file and spawns sync-thread-num * 2 worker threads. The first half of these workers are selected to apply the binlog, while the second half are used to apply the writes to the database (DB).
  2. To ensure the order of consumption, each DB's binlog is processed by the same worker. The worker selection strategy hashes the db_name to obtain a worker index, and a fixed worker is chosen from the first half of the worker vector.
  3. After a worker completes applying the binlog, it hashes the key to obtain an index and selects a worker from the second half of the worker array to submit an asynchronous WriteDB task.

The issues are:
Regarding point 1, users are not informed that Pika internally multiplies sync-thread-num by two, which is arguably inappropriate and also prevents them from precisely controlling the actual number of threads. For instance, to ensure that the number of WriteDB threads is not too small, the default value of this configuration item is 6, resulting in a total of 12 workers inside Pika: the first six write the binlog, and the last six write the DB. In a single-DB scenario, five of the first six workers remain idle and are never used.

Regarding point 2, using the db_name hash to obtain an index can lead to an imbalance. For example, in a test with 8 DBs and sync-thread-num set to 8, the hash mapping results in:

  • DB 1, 4, and 7 all binding to worker 3;
  • DB 3 and 6 binding to worker 0;
  • DB 0 binding to worker 2;
  • DB 2 binding to worker 7;
  • DB 5 binding to worker 6.

In this setup, workers 1, 4, and 5 are completely idle. Although there are 8 DBs and an equal number of workers dedicated to writing the binlog, each DB could simply have its own worker. This imbalance not only wastes resources but also risks amplifying a WriteStall: a stall in one DB blocks the binlog-writing tasks of every other DB sharing the same worker.
