-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reset sx_fbtbc to 0 once we are sure it's haproxy v2 header #322
base: next
Are you sure you want to change the base?
Conversation
Reporter with the same issue here: It's pretty easy to reproduce:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to understand the flow path of the failure better.
Also, it seems like it might be better to reset sx_fbtbc down below in the switch after the call to handle_haproxy_header().
I'm also curious down there is we should do a go to again for local command also? Not quite sure how that's supposed to be handled...
I did try to reset it after the ::recv returned EAGAIN, right before https://github.com/nfs-ganesha/ntirpc/blob/next/src/svc_vc.c#L993, but there is a race condition: Before the sx_fbtbc can be reset in this thread, a new thread which is trigged by the epoll event on the same socket fd, trying to recv the body message, will call svc_vc_recv with the same xprt and crash at the same place. |
When receiving a haproxy v2 message, after processing the first fragment which only contain the proxy header, the ::recv might return EGAIN if the next fragment is not ready to read yet, then the function svc_vc_recv would return, but left xd->sx_fbtbx == 0x0d0a0d0a (PP2_SIG_UINT32), which was set when parsing the haproxy v2 header. When the next fragment is ready, the epoll event for the same xprt will be trigged, then a non-zero xd->sx_fbtbc would deceit sv_vc_recv into derefrencing a NULL uv pointer. Stack trace: Thread 76 "ganesha.nfsd" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fc9937ee640 (LWP 2931)] 0x00007fc9f3e30fa6 in svc_vc_recv (xprt=0x7fc9dc00e3b0) at /git/nfs-ganesha/src/libntirpc/src/svc_vc.c:1070 1070 flags = uv->u.uio_flags; (gdb) bt (gdb) p uv $1 = (struct xdr_ioq_uv *) 0x0 (gdb) p xd->sx_fbtbc $2 = 218762506
7cb5246
to
17e95c3
Compare
@ffilz I revised the patch by resetting the sx_fbtbc after the call to handle_haproxy_header, I can't reproduce the crash with it, works perfect! please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad this works. That looks great and easily understood (and the code is tight enough here to not need a comment explaining why sx_fbtbc is being reset).
When receiving a haproxy v2 message, after processing the first fragment which only contain the proxy header, the ::recv might return EGAIN if the next fragment is not ready to read yet, then the function svc_vc_recv would return, but left xd->sx_fbtbx == 0x0d0a0d0a (PP2_SIG_UINT32), which was set when parsing the haproxy v2 header. When the next fragment is ready, the epoll event for the same xprt will be trigged, then a non-zero xd->sx_fbtbc would deceit sv_vc_recv into derefrencing a NULL uv pointer.
Stack trace:
Thread 76 "ganesha.nfsd" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fc9937ee640 (LWP 2931)]
0x00007fc9f3e30fa6 in svc_vc_recv (xprt=0x7fc9dc00e3b0) at /git/nfs-ganesha/src/libntirpc/src/svc_vc.c:1070 1070 flags = uv->u.uio_flags;
(gdb) bt
#0 0x00007fc9f3e30fa6 in svc_vc_recv (xprt=0x7fc9dc00e3b0) at /git/nfs-ganesha/src/libntirpc/src/svc_vc.c:1070
#1 0x00007fc9f3e2baab in svc_rqst_xprt_task_recv (wpe=0x7fc9dc00e680) at /git/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1210
#2 0x00007fc9f3e2c740 in svc_rqst_epoll_loop (wpe=0x3003cb8) at /git/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1585
#3 0x00007fc9f3e39be8 in work_pool_thread (arg=0x7fc9e007e030) at /git/nfs-ganesha/src/libntirpc/src/work_pool.c:187
#4 0x00007fc9f3aa0c52 in start_thread () from /lib64/libc.so.6
#5 0x00007fc9f3b24f74 in clone () from /lib64/libc.so.6
(gdb) p uv
$1 = (struct xdr_ioq_uv *) 0x0
(gdb) p xd->sx_fbtbc
$2 = 218762506