Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress tcp-stream garbage referring to a closed fd. #362

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

5ch-cl-thread-882
Copy link

On errornous situation, make-tcp-socket function seems to make a temporal tcp-stream object referring a closed FD.
I saw some strange crashes probably because of it.

This PR tries to release the objects correctly.


Consider make-tcp-socket failed by 'Connection Refused'.
In this situation, %socket-connect raises an error. If I selected to abort, stack unwinds, and the FD will be closed by the unwind-protected form. But the socket object made by this function still exists in memory and continues to refer the closed FD. After that, I sometimes see CCL randomly crashes at processing I/Os.

Here is a case going to crash I found on my machine (MacOS High Sierra 10.13.6).

$ ./dx86cl64
Clozure Common Lisp Version 1.12 (v1.12-39-g6c1a9458) DarwinX8664

For more information about CCL, please see http://ccl.clozure.com.

CCL is free software.  It is distributed under the terms of the Apache
Licence, Version 2.0.
? (machine-version)
"MacBookAir6,2"
? (ql:quickload "usocket")
To load "usocket":
  Load 1 ASDF system:
    usocket
; Loading "usocket"

("usocket")
? (usocket:socket-connect "ja.wikipedia.org" 65500)	;; Fails. Go to abort.
> Error: Error #<USOCKET:CONNECTION-REFUSED-ERROR #x302000AC95BD>
> While executing: USOCKET::RAISE-ERROR-FROM-ID, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
1 > :pop

? (usocket:socket-connect "nicovideo.jp" 443)
#<USOCKET:STREAM-USOCKET #x302000AC30AD>
? (ql:quickload "hunchentoot")	;; It doesn't seem to matter which library you use.
To load "hunchentoot":
  Load 1 ASDF system:
    hunchentoot
Killed: 9

MacOS's crash report indicated close caused this crash:

...(omitted)...

Exception Type:        EXC_GUARD
Exception Codes:       0x4000000100000007, 0x2c548ffaf07016e3
Exception Subtype:     GUARD_TYPE_FD, id=0x2c548ffaf07016e3, fd=7, flavor=0x00000001 (CLOSE)

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x00007fff7fd3d4fa close + 10
1   dx86cl64                      	0x000000000001b223 SPffcall + 99 (x86-spentry64.s:4282)

Thread 1:
0   libsystem_kernel.dylib        	0x00007fff7fd3320a mach_msg_trap + 10
1   libsystem_kernel.dylib        	0x00007fff7fd32724 mach_msg + 60
2   libsystem_kernel.dylib        	0x00007fff7fd32f64 mach_msg_server + 369
3   dx86cl64                      	0x000000000002d031 exception_handler_proc + 49 (x86-exceptions.c:3552)
4   libsystem_pthread.dylib       	0x00007fff7ff04661 _pthread_body + 340
5   libsystem_pthread.dylib       	0x00007fff7ff0450d _pthread_start + 377
6   libsystem_pthread.dylib       	0x00007fff7ff03bf9 thread_start + 13

Thread 2:
0   libsystem_kernel.dylib        	0x00007fff7fd3e142 read + 10
1   dx86cl64                      	0x000000000001b223 SPffcall + 99 (x86-spentry64.s:4282)

...(omitted)...

So I tried to take a trace around ccl::fd-close. I got below:

0> Calling (CCL::FD-CLOSE 7)
 (647B48) : 0 (PRINT-CALL-HISTORY :CONTEXT NIL :PROCESS NIL :ORIGIN NIL :DETAILED-P NIL :COUNT 1152921504606846975 :START-FRAME-NUMBER 0 :STREAM #<SYNONYM-STREAM to *TERMINAL-IO* #x30200047792D> :PRINT-LEVEL 2 :PRINT-LENGTH 5 :PRINT-STRING-LENGTH :DEFAULT :SHOW-INTERNAL-FRAMES NIL :FORMAT :TRADITIONAL) 957
 (647CE0) : 1 (FUNCALL #'#<(CCL::TRACED CCL::FD-CLOSE)> 7) 1317
 (647D60) : 2 (%%IOBLOCK-CLOSE #S(CCL::IOBLOCK :STREAM #<BASIC-TCP-STREAM :CLOSED #x302000A0A9CD> :UNTYI-CHAR NIL ...)) 253
 (647D88) : 3 (%%BEFORE-AND-AFTER-COMBINED-METHOD-DCODE (NIL #<CCL::STANDARD-KERNEL-METHOD CLOSE :AFTER #> . 823242)) 1261
 (647DF8) : 4 (%%STANDARD-COMBINED-METHOD-DCODE ((#<#>) (#<#>) #<CCL::STANDARD-KERNEL-METHOD CLOSE #>) 823242) 237
 (647E70) : 5 (%%CHECK-KEYWORDS #(1 #(:ABORT) #<Combined-Method CLOSE #x302000475ACF>) 823255) 261
 (647ED8) : 6 (DRAIN-TERMINATION-QUEUE) 493
 (647F48) : 7 (FUNCALL #'#<(:INTERNAL CCL::ADD-GC-HOOK)>) 85
 (647F68) : 8 (HOUSEKEEPING) 37
 (647F78) : 9 (HOUSEKEEPING-LOOP) 501
 (647FB8) : 10 (Killed: 9

I think, when the GC collects the temporal socket object, GC tries to close the already closed FD, and strangely go to crash...?

I found resembled codes in make-udp-socket and make-stream-file-socket, so I applied same fixes to them.


This problem is repored at a thread in 5ch.net (Japanese BBS) at first.
I saw the discussions in this thread and made this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant