Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use-after-free bug when updates on SSL server handshaker #36693

Open
gengliqi opened this issue May 22, 2024 · 4 comments
Open

Use-after-free bug when updates on SSL server handshaker #36693

gengliqi opened this issue May 22, 2024 · 4 comments

Comments

@gengliqi
Copy link

gengliqi commented May 22, 2024

What version of gRPC and what language are you using?

v1.44.0 c++
The related code is the same as the master branch.

What operating system (Linux, Windows,...) and version?

Doesn't matter

What runtime / compiler are you using (e.g. python version or version of gcc)

Doesn't matter

What did you do?

TiFlash use gRPC grpc_ssl_server_credentials_create_options_using_config_fetcher(pingcap/tiflash#6346) to support reloading SSL online.
However, we found TiFlash crashed sometimes in gRPC core.(pingcap/tiflash#8535 (comment))

After some investigation, we figured out the root cause is that server_handshaker_factory may be used after free.

void add_handshakers(const grpc_core::ChannelArgs& args,
grpc_pollset_set* /*interested_parties*/,
grpc_core::HandshakeManager* handshake_mgr) override {
// Instantiate TSI handshaker.
try_fetch_ssl_server_credentials();
tsi_handshaker* tsi_hs = nullptr;
tsi_result result = tsi_ssl_server_handshaker_factory_create_handshaker(
server_handshaker_factory_, /*network_bio_buf_size=*/0,
/*ssl_bio_buf_size=*/0, &tsi_hs);

Imagine a case:
Thread 1: call try_fetch_ssl_server_credentials but nothing has changed.
Thread 1: call tsi_ssl_server_handshaker_factory_create_handshaker with the server_handshaker_factory_.
Thread 2: call try_fetch_ssl_server_credentials where server_handshaker_factory_ has been updated and the original server_handshaker_factory_ has been destroyed due to zero ref count.
impl->outgoing_bytes_buffer =
static_cast<unsigned char*>(gpr_zalloc(impl->outgoing_bytes_buffer_size));
impl->base.vtable = &handshaker_vtable;
impl->factory_ref = tsi_ssl_handshaker_factory_ref(factory);
*handshaker = &impl->base;
return TSI_OK;

Thread 1: see code above. Line 1936 shows that the newly-created tsi_ssl_handshaker uses the destroyed server_handshaker_factory_, leading to a crash eventually.

This issue is basically the same as #22489.
Unfortunately, #22647 did not fix this bug thoroughly.
#22531 can fix this bug but it's closed.

What did you expect to see?

No crash.

What did you see instead?

Crashed.
Stack can be seen in pingcap/tiflash#8535 (comment).

Anything else we should know about your project / environment?

@gengliqi gengliqi changed the title Use-after-free bug when simultaneous updates on SSL server handshaker Use-after-free bug when updates on SSL server handshaker May 22, 2024
@yashykt
Copy link
Member

yashykt commented May 28, 2024

Hi @gengliqi, v1.44.x is an old branch. Can you please check if the issue exists with 1.63/1.64?

@yashykt yashykt assigned erm-g and unassigned yashykt May 28, 2024
@gengliqi
Copy link
Author

Hi @gengliqi, v1.44.x is an old branch. Can you please check if the issue exists with 1.63/1.64?

It does exist in 1.63/1.64. The code I referenced before comes from the master branch.

@erm-g
Copy link
Contributor

erm-g commented May 29, 2024

Hi @gengliqi, thanks for reporting the problem. Could provide a test to reproduce the crash?

@erm-g erm-g removed the untriaged label May 29, 2024
@gengliqi
Copy link
Author

gengliqi commented May 29, 2024

We didn't write a small unit test to verify this issue. We modified the code of TiFlash and gRPC core to make the crash occur very quickly in the following way:

  1. update SSL server handshaker very frequently
  2. add a random sleep before impl->factory_ref = tsi_ssl_handshaker_factory_ref(factory);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants