Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] writing hangs after receiving EAGAIN #1114

Open
jiankyu opened this issue Jan 27, 2023 · 1 comment
Open

[Bug] writing hangs after receiving EAGAIN #1114

jiankyu opened this issue Jan 27, 2023 · 1 comment

Comments

@jiankyu
Copy link
Contributor

jiankyu commented Jan 27, 2023

We run an application that uses pistache to provide an API server. One of the API endpoint returns relatively large amount of data (about 160KB), we observed "incomplete" responses that make the clients stuck.

When we use the curl command to inspect the response data, the symptom is, the response header is complete (with 200 response code, content length, and the "Connection: Close" indication, however, the actual data is less than specified in the "Content-Length" field. As a result, curl waits for the remaining data and doesn't finish, until we manually press "Ctrl+C" to stop it.

> Host: 10.42.15.15
> User-Agent: curl/7.68.0
> Accept: */*
>
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [217 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [217 bytes data]
* old SSL session ID is stale, removing
{ [5 bytes data]
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Connection: Close
< Content-Length: 166272
<
{ [16322 bytes data]
 59  162k   59 98242    0     0   8042      0  0:00:20  0:00:12  0:00:08     0^C

(The ^C at the end is the result as we pressed Ctrl+C)

Initially we thought it might be due to networking issue, since this program runs on Kubernetes and we have ipvs and complicated iptables rules on the host. But later all the suspicions are all ruled out. During the diagnosis we found the observation itself also makes impacts to the program behavior, for example, when we use strace to attach to the process, the issue has less chance to appear. But we do find the pattern and think this should be a bug in pistache.

Several things are happening when we hit this issue, they are always the case:

  1. We use tcpdump to capture the packets on the server interface. We always see "TCP Window Full" diagnostic information. (Packet No. 1553), later all data are acknowledged by the client (Packet No.1563), but server doesn't send more data until we hit ^C, which triggers the client to close the socket (Packet No.2187).

image

  1. When the client is waiting on more data, before we hit the ^C to terminate the connection, we use netstat to inspect both client and server side connection information. From both side, regarding that connection, the tcp send queue and receive queue are all empty.

Those evidences make us believe it is either the application itself or the kernel that doesn't behave correctly, so we focus more on strace. Below is the comparison of a successful response and a stuck one. First column in each line is the line number, second column is the pid of that thread, we have multiple pistache threads in the application.

When the issue doesn't occur, all data are successfully written. (The socket fd is 200). Then in L24821 it calls epool_ctl to just wait EPOLLIN events. The client, after it gets all the data, closes the socket, which makes the epoll_wait in L24822 return and we proceed to the socket closure on L24826.

24807 10352 18:41:28.831194 epoll_wait(68, [{EPOLLOUT, {u32=200, u64=200}}], 1024, -1) = 1
24808 10352 18:41:28.831236 epoll_ctl(68, EPOLL_CTL_MOD, 200, {EPOLLIN|EPOLLET, {u32=200, u64=200}}) = 0
24809 10352 18:41:28.831297 write(200, "\27\3\3@\21\\\352\321\344\37\24\264\262\2703\272d\277&a\371j(\201 \367\242\376.fs=<\301\201-\340I\240\274\360B\246\301j$\307\316G\17\332\335\340\321\335\333j\220\273S\24|\370\270hL\3552O2\321\3711\3229\31      3\341\35D\245\32{\\K!\264i\357\31\325\226jr\370L?gKU\362\177\341\265\\!#\37b\346\v\6\5Nh\"\372\236\345\177J:`%\2475\177\251\340"..., 16406) = 16406
24810 10352 18:41:28.831363 write(200, "\27\3\3@\21\27\322\2563\343*\340]`O\251\177\240G8\273\246\355t!\214\357\222\"Y\3144\257\357\346\346@\233f\321l\223\265\207\265\252\\,b}8\340\26Gk\361d\3478FV\17\331\263\235\223h47q*\311:]\326\17\37y\272\3      21kerOEQ\2534\372\272!w\270%=\374\310\350\32\221N\0\0274\5\303V\226\2402\325\236c\310\210\256/\331\3\364\361\374dj\320-\31\213"..., 16406) = 16406
24811 10352 18:41:28.831422 write(200, "\27\3\3@\21J\267O+\214\357\304\273k\314\363B\270\210,Y\305\220^s\370.esjJ\271\346\221|)\"\360\333\2\371\25\371\231>g\26*\247\367\352'C\317k\351\273\311\7mU2\373\f@P\3578\244\251\24\351\26\325BXdm\25\330\2      01\356\n\332\250H\304\363?\264\225\360\215\220\373qT\1\23%M)\343\324\323\352\5\337\266V\243o\224yr\324\246Y\266\0k\345\177\6\33\232\24\202"..., 16406) = 16406
24812 10352 18:41:28.831476 write(200, "\27\3\3@\21k\346\3\333\177\360<P,\225\243]\325[@9\376\232\342\244\1\254Tf\236Ji\212xrk*\3X\321\317\4\325\337\"g\372\272\16\177\347\\\310\274\200\4R\311\20\f\21\354QdU\203H\342\"\271\233\n\214t'z$\255\343\      370\32\3349\23\333`\276\21\272\10\25\217q\224@6\34\315\230X2D\377\5\265F!_\353\27\354bb\356\223L\25\303By\366\313\255\362\266\37 u"..., 16406) = 16406
24813 10352 18:41:28.831527 write(200, "\27\3\3@\21\274\377\372\236\23\213\0\271M\34\307\250\222U\246\361u\25:\230\177\3708\306\262a\31\245\205\16\26\237\275\313u\272@$\306\350\374=X\364\214;c{}\366\177\241yf\2545\237\6\314\335\211\312\336X\356      D\250l\307\20\257\216\241\261T\272\26~\365\270\270\24\271\243\361\230m\315\204Bl<\351Gj\227\313\371\216\346\3360\342\30}\3C\212\303P.\244T\324\340<^[\212VW\236\240"..., 16406) = 16406
24814 10352 18:41:28.831578 write(200, "\27\3\3@\21Z\r:\36\341D\3220(\20\30\f(\\\271\244\233\351\353\5KWN\374\310\3624,\210N]\350\37u\335\363\332\303\356\354\241\375\356\0013\30\367\242\271\v\300\333w\304\350l\350_\332\341\235\17\3678]5\203\204      \234\361\373[\27\37\242d\30\276\223\370F\202\373dE0\376\325\312\337\314\213\371wkJ\271\374\343\336\322V\244\354\361\246\377\266/4\211\273\25\276^\331\210\2660\33\370\0\312"..., 16406) = 16406
24815 10352 18:41:28.831626 write(200, "\27\3\3@\21\210\243c\274\313V\22\3169\375\250\334\240\246M\344\254\36\246\300h!\360\361L\357\334\377_&\362\16\272\6\272[w\262O\253\346\274\232\334\37J\0&dMGQ\217\214\341n\354\222\211\207/\24\346\243R \"p\      "q\262=\324BA\376\212u\36\225B\v\376\227\27\204\325\276!8$\367\326\202\322)\362\22Z\237-^\256I\3n\217\373~\277(\332\366\305u{~\233*\305s$\366"..., 16406) = 16406
24816 10352 18:41:28.831661 write(200, "\27\3\3@\21\224\221\t\7\f,\360\221\346\312\347\302\5\201 PT^\374Hi\335\227\16M\264X\261R\332\354g\234\225\342\315\276\267\30\376bo\272V\1U%Vj\211\200\233\315\242\352\\\10\236\227 \243-\"\266\275\343Z\24\3      30?h\5\240V\4\4\314wu%\205\262\341\355\224\240\346\223]\262e\345\201D\211G\243\23@\347m\332\220E\275&t\332\255\33%\203@\313Y0#L\371L\r\243\347"..., 16406) = 16406
24817 10352 18:41:28.831695 write(200, "\27\3\3@\21\t\256u\223\360\315o\217`\377\376\277b_5D<\244\356 \350w\272\233\312\261GP\364\332\312Q\304\\:\362^\312DT\254rC\t\263!\334\276a\v\7\0\266[\250L\307N\267\303\276\241\245$\243\332Y\26/o$\233|\177      \206\16G\\\372Hw\223\227i]\20)F`q5\314\323/\302%;\322\341{\225v\v\266\226Ht\26r\346\264N\376S\321\310\4\31\247G\320x\364"..., 16406) = 11412
24818 10352 18:41:28.831719 write(200, "\266\22\227Q\211M\10(t\272\261\217\230\323\364\236\326>Ef\\xj\300\23\363\254\234\236/\177\356o\204_}YY\20\5\374j\3}\232\314W'\236\306Z}\233\246L\363m\21\303\261\27gI\303\210\361\270\326]\371\v\206\3\321\2      75\35&?\\\35\10\314r@\202\336\252m\320v\242\1\214\262\21776\215vL\323i\244\223k\223A\35\356D\31\212\252\214\277\265\264\212\315\336\216m\317\335\360\271\266\7"..., 4994) = 4994
24819 10352 18:41:28.831756 write(200, "\27\3\3@\218\305\211C\37tY(\354\340\343M\232j\352\16\262~\5\276E\371\272\377h\35\277\345@\367(Bm\346\344\263Q\357\234n:\n\333\";\271\212\346\203*\374\353\317\3519\323L\fR\377z\310P\204\372g\2514=y\372\f\2      63Jg\337\257\335\30D2\36\3526b\333\260]\17\237\247\332c1\22\342G\345W8\20\313-\23\350\203\22\236\327\20\22\206\233\354\236\361\214\340&Z\275\266\16"..., 16406) = 16406
24820 10352 18:41:28.831791 write(200, "\27\3\3\t\317\277x\352\230\3\355;\354\nz\353w\222\276\372\225\f\6m\243_7#\204\312\20\7T\v\273\223\241\0\201(\212eE\233R~\356\272\331\236\215\204~\2136\3569$\232\"\344\2024@O\322\327\357^ew\227*\215\10\202      ?\203\266\233Y\31\272\356\246\303\353/\230\322[l\4]\333g\325\350M\34M\3143\330x\\\0j\\\371\257#a#_\224^\0\257w}\4\314\223\243\t\250A"..., 2516) = 2516
24821 10352 18:41:28.831823 epoll_ctl(68, EPOLL_CTL_MOD, 200, {EPOLLIN|EPOLLET, {u32=200, u64=200}}) = 0
24822 10352 18:41:28.831864 epoll_wait(68, [{EPOLLIN, {u32=200, u64=200}}], 1024, -1) = 1
24823 10352 18:41:28.832242 read(200, "\27\3\3\0\23", 5) = 5
24824 10352 18:41:28.832267 read(200, "~\3X\36\375:\253\200\236\366\233\231k,\305l\36`\f", 19) = 19
24825 10352 18:41:28.832323 epoll_ctl(68, EPOLL_CTL_DEL, 200, 0x7efc08feea24) = 0
24826 10352 18:41:28.832345 close(200)        = 0
24827 10352 18:41:28.832375 epoll_wait(68,  <unfinished ...>

Below is the related strace log when the issue happens. The socket fd is still 200. The second to last write (L27420), the write intention is 16406, but only 11612 bytes are written, then the next write got an EAGAIN (L27421), this is because the tcp send buffer is full, the kernel needs some ack packets from the client to release the buffer.
As a response, in L27422 the epoll_ctl is called to subscribe to both EPOLLIN and EPOLLOUT event. If the tcp send buffer has available space, we expect a EPOLLOUT event to resume the writing.
The problem is in L27423, another epoll_ctl is immediately called to only subscribe EPOLLIN event. When the send buffer is available again, the kernel won't send any event since EPOLLOUT is unsubscribed.

27412 10352 18:41:30.414047 epoll_wait(68, [{EPOLLOUT, {u32=200, u64=200}}], 1024, -1) = 1
27413 10352 18:41:30.414072 epoll_ctl(68, EPOLL_CTL_MOD, 200, {EPOLLIN|EPOLLET, {u32=200, u64=200}}) = 0
27414 10352 18:41:30.414117 write(200, "\27\3\3@\21\f{\204J\377^x6C\n\21\346\t\364a\335\v\275\220\330\305\23\345\202\2260&\364qfo}\n\261c\36V]\253\237\305\265\1\264Ry\267@\373z\2\356\357h\236\344\303q\221\36\304\332\2240\345^\326\323\345\373\31      2m\274K\371\224\31\260m`\352I\212\226^\253\335nz\272pD/\256\340Hd\224\n\367\226\21\266\251M\260\222\345d\363\242\310\310\240\264+\303\247~MZ\210\245"..., 16406) = 16406
27415 10352 18:41:30.414165 write(200, "\27\3\3@\21\305\371x6\361\243#\3\257Y\216\4E\214\vU\226\376\212\302\376\233?\214\320\306\235\342\272`\36Z7\364\3342\252\10xK)\2338^\334}AA\321\372\26b\26\203\324\205\356\255\270\352\34\235{B\255o\33\271\2      71\215\211\210\244\312\36)m\363\205\1\350~\361\f\327\225G\232\327Y\362\33KWIG\0001\361\247\267\25\0043\23I\210\276\210\234\262\344\210\247\n\311+P\264+Nd\372"..., 16406) = 16406
27416 10352 18:41:30.414199 write(200, "\27\3\3@\21\370\371h\00665\357\207\201\222y@\25\327L9\364\4\347\17\5a\233(\316V\376\20\317@6M\36\0051V[\245\333\302\21\343\244\211{\366-\210C\236\360o\301m\177\353\266\325<g|\330\35\340\366\30\366\375\351      :F\225v\203\f\312=n\30/\303\245b4i\35b\210'\244\360\221\6tZ\375!\372\365\277\304\272\332\267\374[\22\211\240\254+\365\273\202a\345Y\177(\16\267\24\370"..., 16406) = 16406
27417 10352 18:41:30.414235 write(200, "\27\3\3@\21n4ld\241\264\n\2078\10\322\311\17T\357T\26^Y\317\304\265\30\225\3032\210$\332\243y5\256k\34\233@z\267P\214\342\247\256Mxq)uP:.\364\203\v\17+4\303\fr\255\303~\17\200.h\3\36\2735Vb\20\270eir\207c      X\241\31;\2540\r\322pW\nda\203/\2461s//G\256\"\255\30x\217\255>\27i\273\315\230\262\237x\225m\326{\276"..., 16406) = 16406
27418 10352 18:41:30.414268 write(200, "\27\3\3@\21wF\310&Bd\231\255\353P\354\241+\321\21iXqf\344\341\316k\207v\0\347\3539\354\264c\270\323\260\325\246\255\2543\370Y:5m\320\377\232\n\36\373\326h\232\377jV\273a\316?\357\321\375\362\300\373\227xN      \276~W\177\354\240\21\260\257\n\340\2602\236!\211%{@(\f6\244l\233\364X\277\373\323v\203F\252\247\334\255\252B\33\360\370&\200jI\206\350\t\235\227`\217"..., 16406) = 16406
27419 10352 18:41:30.414309 write(200, "\27\3\3@\21\356\254\6\221S<\205(\271\332\334\1\200\22et\262p\325\246r\221\207\331O\274?js\30M\3727D\233\234:\313\310,\250~\t7\31\226\35\257}\305wI>A6\270\303F\27@\363P\36?^.\327z\372\212I\245Q.\24@\213k4\      33\317<b\3714\234\226\334\336\3\317\302-\246c\255\341\352\354\216\372\5<1\221\342\5GO\217\37\250\264\245\205u\306Z\21v\256\254\237"..., 16406) = 16406
27420 10352 18:41:30.414351 write(200, "\27\3\3@\21\3665\234\357\304X\303\23s\223\331\256lE\302\v\1\215\210\344\356\n\235_\316\35\351jBd\341\321\353$\267u\305\242]\"\250o9\222\34t\237\234\330\215\234\321\35\247\317\10\303\231\23L\277R\214^5/\37      2c\360\266e\3627\232;\255\347\241\215\306\224\250\326k\270|e\230<\361\353nX\212.\36\3503\33\324\7N\360i;\211\320\27\235\236UW57\20\204\7\244(\3305\2108"..., 16406) = 11612
27421 10352 18:41:30.414375 write(200, "\33\313\226\27\235Y\17\317k\215sdI\0\350\350]\260\25\221\270\221\243s\336(^\243\306\360\227\312@mx\1a\342\303%@?\201]\3147\263~1P\266\365Hq>\276\t\2139 \235\177H\234\302\371\20\232\37\225I\260r\332\31Y\21      5\22\202\307j\242\260\241\357\277\256$e\350mU\27\277\327W+\210\337)\261>\373\23\207p\r#yx\360a\342b\334\rK\69\25\207N\362\211\207g\275N"..., 4794) = -1 EAGAIN (Resource temporarily unavailable)
27422 10352 18:41:30.414421 epoll_ctl(68, EPOLL_CTL_MOD, 200, {EPOLLIN|EPOLLOUT|EPOLLET, {u32=200, u64=200}}) = 0
27423 10352 18:41:30.414459 epoll_ctl(68, EPOLL_CTL_MOD, 200, {EPOLLIN|EPOLLET, {u32=200, u64=200}}) = 0
27424 10352 18:41:30.414496 epoll_wait(68,  <unfinished ...>
27425 10348 18:41:30.432663 <... epoll_wait resumed>[{EPOLLIN|EPOLLOUT, {u32=3024135585, u64=139619526026657}}], 100, -1) = 1
27426 10348 18:41:30.432692 futex(0x7efd76eb1b40, FUTEX_WAKE_PRIVATE, 1) = 1

The next event on this socket is the close event, triggered by the ^C from client side, several seconds later.

50361 10352 18:41:42.733970 <... epoll_wait resumed>[{EPOLLIN, {u32=200, u64=200}}], 1024, -1) = 1
50362 10352 18:41:42.734041 epoll_ctl(68, EPOLL_CTL_DEL, 200, 0x7efc08feea24) = 0
50363 10352 18:41:42.734066 close(200)        = 0
50364 10352 18:41:42.734098 epoll_wait(68,  <unfinished ...>

The line 27423 seems the root cause of the issue. That call should not be there.

@kiplingw
Copy link
Member

Thanks for reporting @jiankyu. I'm probably not much help with this issue, but I can't help but wonder if it might be related to another issue I recently discovered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants